# Reddit Scraper - $0.75/1k (`santamaria-automations/reddit-scraper`) Actor

Scrape Reddit posts and comments from any subreddit, search query, or user profile. Returns title, author, full post text, external link URL, comment bodies, subreddit, and timestamps. No login required. Pay-per-result: only $0.75 per 1,000 items.

- **URL**: https://apify.com/santamaria-automations/reddit-scraper.md
- **Developed by:** [Ale](https://apify.com/santamaria-automations) (community)
- **Categories:** Social media, News, Lead generation
- **Stats:** 18 total users, 11 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $0.75 / 1,000 items

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Scraper

Scrape [Reddit](https://www.reddit.com/) posts and comments from any subreddit. Extract titles, authors, post text, links, and comment text at scale. No API key or login required.

### What It Does

Fetches posts from one or more subreddits using Reddit's public Atom feeds (`/.rss`). Optionally fetches comments for each post. Posts and comments are returned as separate items in the dataset.

### Use with AI Agents (MCP)

Connect this Actor to any MCP-compatible AI client, such as Claude Desktop, Claude.ai, Cursor, VS Code, LangChain, LlamaIndex, or custom agents.

**Apify MCP server URL:**

````
https://mcp.apify.com?tools=santamaria-automations/reddit-scraper
````

**Example prompt once connected:**

> "Use `reddit-scraper` to get the top 50 posts from r/MachineLearning this week with comments. Return results as a table showing title, score, and comment count."

### Features

- **Multi-subreddit** — scrape multiple subreddits in a single run
- **Comments included** — fetch comments for each post (flat list, see notes below)
- **Search** — query Reddit-wide, sort by relevance / top / new / hot
- **User profiles** — pull a user's posts and comments
- **Sorting options** — hot, new, top, rising
- **Deduplication** — the same post is never returned twice
- **Pagination** — uses Reddit's `after` cursor to collect more than 100 posts
- **Anti-bot resilient** — TLS fingerprinted sessions, rotating proxy IPs
- **Rate-limit aware** — paces requests well under Reddit's public limits
- **No credentials needed** — uses Reddit's public Atom feeds
- **Pay-per-result** — only pay for items you receive
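
If you combine items from several runs, duplicates across runs are still possible, so a client-side pass may help. The following is a minimal sketch, not the Actor's internal implementation; it assumes each item carries the `id` and `type` fields described under Data Extracted, and `dedupe_items` is a hypothetical helper name.

```python
from typing import Iterable


def dedupe_items(items: Iterable[dict]) -> list[dict]:
    """Keep the first occurrence of each (type, id) pair, preserving order."""
    seen = set()
    unique = []
    for item in items:
        # Posts and comments are keyed separately in case their ids overlap.
        key = (item.get("type"), item.get("id"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(item)
    return unique
```

Keying on both `type` and `id` is a cautious choice for this sketch; keying on `id` alone may be enough if post and comment ids never collide.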

### Data Extracted

#### Posts (type = "post")

| Field | Example |
|-------|---------|
| `id` | `"abc123"` |
| `type` | `"post"` |
| `subreddit` | `"programming"` |
| `title` | `"Show HN: I built a Go-based Reddit scraper"` |
| `author` | `"john_doe"` |
| `text` | `"Full text of a self post..."` |
| `url` | `"https://github.com/example/repo"` |
| `score` | `0` (see notes) |
| `num_comments` | `null` (see notes) |
| `is_stickied` | `false` |
| `created_utc` | `"2026-04-25T10:00:00Z"` |
| `reddit_url` | `"https://www.reddit.com/r/programming/comments/..."` |
| `scraped_at` | `"2026-04-25T10:30:00Z"` |

#### Comments (type = "comment")

| Field | Example |
|-------|---------|
| `id` | `"xyz789"` |
| `type` | `"comment"` |
| `subreddit` | `"programming"` |
| `author` | `"helpful_user"` |
| `text` | `"Great project! Have you considered..."` |
| `score` | `0` (see notes) |
| `parent_id` | `null` (see notes) |
| `post_id` | `"abc123"` |
| `post_title` | `"Show HN: I built a Go-based Reddit scraper"` |
| `is_stickied` | `false` |
| `created_utc` | `"2026-04-25T11:15:00Z"` |
| `reddit_url` | `"https://www.reddit.com/r/programming/comments/.../xyz789/"` |
| `scraped_at` | `"2026-04-25T11:30:00Z"` |

**Notes on `score`, `num_comments`, and `parent_id`:** Reddit's public Atom feeds do not expose vote totals, comment counts, or comment parent-IDs. These fields are returned as `0` / `null`. For everything else (title, author, body text, URLs, timestamps), the data is identical to what you'd see on the site.
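
Because `score` arrives as `0` and `num_comments` / `parent_id` as `null`, downstream code can mistake the placeholders for real values (a post with zero votes, for instance). One way to guard against that is to map them to `None` on load. This is a minimal sketch assuming the field names in the tables above; `normalize_item` is a hypothetical helper, not part of the Actor or the Apify client.

```python
# Fields the Atom feed cannot populate, per the notes above.
PLACEHOLDER_FIELDS = ("score", "num_comments", "parent_id")


def normalize_item(item: dict) -> dict:
    """Return a copy of item with unavailable fields set to None."""
    cleaned = dict(item)  # shallow copy so the original item is untouched
    for field in PLACEHOLDER_FIELDS:
        if field in cleaned:
            cleaned[field] = None
    return cleaned
```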

### Pricing

Pay-per-result pricing. You only pay for items you receive.

| Event | Price | Description |
|-------|-------|-------------|
| Actor start | $0.005 | One-time container startup fee |
| Item scraped | $0.75 / 1,000 | Each post or comment returned |

**Examples:**
- 100 posts (no comments) = **$0.08** total ($0.005 + $0.075)
- 100 posts + 500 comments = **$0.455** total ($0.005 + $0.45)
- 1,000 posts + 5,000 comments = **$4.505** total ($0.005 + $4.50)
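
The arithmetic behind these examples can be sketched as a small function. The two prices come from the table above; `estimate_cost` is a hypothetical helper name, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.005")       # one-time fee per run
PRICE_PER_ITEM = Decimal("0.75") / 1000  # $0.75 per 1,000 posts or comments


def estimate_cost(posts: int, comments: int = 0) -> Decimal:
    """Estimated USD cost of one run that returns the given number of items."""
    return ACTOR_START_FEE + PRICE_PER_ITEM * (posts + comments)
```

For instance, `estimate_cost(100, 500)` compares equal to `Decimal("0.455")`, matching the second example.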

6x cheaper than competing Reddit scrapers ($5/1k+). No monthly fees. No minimum spend.

### Input

| Field | Type | Description | Default |
|-------|------|-------------|---------|
| `subreddits` | string[] | Subreddit names to scrape (no `r/` prefix) | `["programming"]` |
| `searchQuery` | string | Search Reddit for matching posts (overrides subreddits) | — |
| `usernames` | string[] | Scrape all posts/comments from these users (no `u/` prefix) | — |
| `sort` | string | `hot`, `new`, `top`, `rising` (or `relevance` for search) | `hot` |
| `includeComments` | boolean | Fetch comments for each post | `false` |
| `commentDepth` | integer | Nesting depth: 1=top-level, 2=+replies, up to 5 | `3` |
| `maxCommentsPerPost` | integer | Max comments per post. `0` = unlimited. | `100` |
| `maxResults` | integer | Max posts to return (across all subreddits). `0` = unlimited. | `100` |
| `proxyConfiguration` | object | Apify proxy settings | Auto |
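
When building the input programmatically, it can help to validate values before starting a paid run. The sketch below checks the numeric bounds given in the OpenAPI specification later in this document and the `sort` values from the table above; `build_input` is a hypothetical helper (Python 3.9+), and how the Actor itself reacts to out-of-range values is not documented here.

```python
ALLOWED_SORTS = {"hot", "new", "top", "rising", "relevance"}


def build_input(subreddits=None, search_query=None, usernames=None, *,
                sort="hot", include_comments=False, comment_depth=3,
                max_comments_per_post=100, max_results=100) -> dict:
    """Assemble an Actor input dict, rejecting values outside the documented bounds."""
    if not (subreddits or search_query or usernames):
        raise ValueError("provide subreddits, search_query, or usernames")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"unknown sort: {sort!r}")
    if sort == "relevance" and not search_query:
        raise ValueError("'relevance' is only documented for search")
    if not 1 <= comment_depth <= 5:
        raise ValueError("comment_depth must be between 1 and 5")
    if not 0 <= max_comments_per_post <= 5000:
        raise ValueError("max_comments_per_post must be between 0 and 5000")
    if not 0 <= max_results <= 10000:
        raise ValueError("max_results must be between 0 and 10000")

    run_input = {
        "sort": sort,
        "includeComments": include_comments,
        "commentDepth": comment_depth,
        "maxCommentsPerPost": max_comments_per_post,
        "maxResults": max_results,
    }
    # The Actor expects bare names, so strip any r/ or u/ prefix.
    if subreddits:
        run_input["subreddits"] = [name.removeprefix("r/") for name in subreddits]
    if search_query:
        run_input["searchQuery"] = search_query
    if usernames:
        run_input["usernames"] = [name.removeprefix("u/") for name in usernames]
    return run_input
```

The returned dict can be passed as the run input to the Apify client or piped to the CLI as JSON.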

### Usage Examples

#### Scrape hot posts from multiple subreddits

```json
{
  "subreddits": ["programming", "python", "golang"],
  "sort": "hot",
  "maxResults": 200
}
````

#### Get top posts with full comment threads

```json
{
  "subreddits": ["MachineLearning"],
  "sort": "top",
  "includeComments": true,
  "commentDepth": 3,
  "maxCommentsPerPost": 50,
  "maxResults": 100
}
```

#### Scrape new posts without comments

```json
{
  "subreddits": ["startups", "SaaS"],
  "sort": "new",
  "maxResults": 500
}
```

#### Search Reddit

```json
{
  "searchQuery": "artificial intelligence startup",
  "sort": "top",
  "maxResults": 50
}
```

#### Scrape a user's activity (experimental)

```json
{
  "usernames": ["AutoModerator"],
  "maxResults": 50
}
```

Note: Reddit applies stricter rate limiting on user profile pages. Some users may return fewer results.

#### Deep comment mining from a single subreddit

```json
{
  "subreddits": ["AskReddit"],
  "sort": "hot",
  "includeComments": true,
  "commentDepth": 5,
  "maxCommentsPerPost": 200,
  "maxResults": 20
}
```

### Output

Results are exported to the default dataset. Posts and comments are interleaved — each post is followed by its comments (if `includeComments` is enabled). Use the `type` field to filter posts vs comments.

Export to JSON, CSV, Excel, or connect via the Apify API.
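
Since posts and comments share one dataset, a common first step is to split them by `type` and attach comments to their post through `post_id`. This is a minimal sketch assuming items shaped like the tables under Data Extracted; `group_comments_by_post` is a hypothetical helper.

```python
from collections import defaultdict


def group_comments_by_post(items):
    """Return (posts, comments_by_post_id) from a mixed list of dataset items."""
    posts = [item for item in items if item.get("type") == "post"]
    comments_by_post = defaultdict(list)
    for item in items:
        if item.get("type") == "comment":
            comments_by_post[item["post_id"]].append(item)
    # Convert to a plain dict so missing post ids raise KeyError instead of
    # silently creating empty lists.
    return posts, dict(comments_by_post)
```

With the Python client, `items` could be `list(client.dataset(run["defaultDatasetId"]).iterate_items())`, as in the Python example in the API section.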

### FAQ

**Do I need a Reddit account or API key?**
No. This scraper uses Reddit's public Atom feeds, which are accessible without authentication.

**What is the rate limit?**
The scraper paces requests at ~1 every 1.5 seconds. With proxy rotation enabled (default), you can run multiple Actor runs in parallel without hitting limits.

**Can I scrape private subreddits?**
No. Only public subreddits and posts visible without logging in are accessible.

**How are comments structured?**
Each comment is a separate output item with `type: "comment"`. The `post_id` and `post_title` fields reference the original post. Comments come back as a flat list — the parent-comment relationship is not exposed by the Atom feed.

**Why no `score` or `num_comments`?**
Reddit removed access to its `/.json` endpoints in June 2026. The Atom (`/.rss`) feed is the only public surface that still works without a registered API app, and it doesn't expose vote counts. If you need scores, use Reddit's official OAuth API.

**Why do I need a proxy?**
A proxy isn't required, but Reddit will rate-limit a single IP after a few quick requests. Apify's auto/datacenter proxy is sufficient — no residential proxy needed.
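
As a rough planning aid for the pacing described under "What is the rate limit?", a lower bound on run time is the number of requests times 1.5 seconds. The sketch below assumes one request per page of up to 100 posts and one extra request per post when comments are enabled. Those request counts are assumptions for illustration, not documented behavior, and `estimate_min_seconds` is a hypothetical helper.

```python
import math

SECONDS_PER_REQUEST = 1.5  # pacing quoted in the FAQ
POSTS_PER_PAGE = 100       # assumption: one listing request yields up to 100 posts


def estimate_min_seconds(posts: int, include_comments: bool = False) -> float:
    """Lower-bound run time in seconds under the assumptions stated above."""
    listing_requests = math.ceil(posts / POSTS_PER_PAGE)
    # Assumption: comments for each post need one additional request.
    comment_requests = posts if include_comments else 0
    return (listing_requests + comment_requests) * SECONDS_PER_REQUEST
```

Container startup, retries, and proxy latency are not modeled, so actual runs will likely take longer.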

# Actor input Schema

## `subreddits` (type: `array`):

Subreddit names to scrape (no r/ prefix). Example: programming, python

## `searchQuery` (type: `string`):

Search Reddit for posts matching this query. Overrides subreddits.

## `usernames` (type: `array`):

Reddit usernames to scrape all posts and comments from (no u/ prefix).

## `sort` (type: `string`):

How to sort results: hot, new, top, rising (or relevance for search).

## `includeComments` (type: `boolean`):

Fetch comments for each post. Each comment is a separate output item.

## `commentDepth` (type: `integer`):

Nesting depth: 1 = top-level, 2 = + replies, up to 5.

## `maxCommentsPerPost` (type: `integer`):

Maximum comments per post. 0 = no limit.

## `maxResults` (type: `integer`):

Maximum items to return. 0 = no limit.

## `proxyConfiguration` (type: `object`):

Apify proxy settings. Datacenter proxies work fine for Reddit.

## Actor input object example

```json
{
  "subreddits": [
    "programming"
  ],
  "sort": "hot",
  "includeComments": false,
  "commentDepth": 3,
  "maxCommentsPerPost": 100,
  "maxResults": 100,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `defaultDataset` (type: `string`):

Dataset containing scraped Reddit posts and comments

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "subreddits": [
        "programming"
    ],
    "maxResults": 100
};

// Run the Actor and wait for it to finish
const run = await client.actor("santamaria-automations/reddit-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "subreddits": ["programming"],
    "maxResults": 100,
}

# Run the Actor and wait for it to finish
run = client.actor("santamaria-automations/reddit-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "subreddits": [
    "programming"
  ],
  "maxResults": 100
}' |
apify call santamaria-automations/reddit-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=santamaria-automations/reddit-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Scraper - $0.75/1k",
        "description": "Scrape Reddit posts and comments from any subreddit, search query, or user profile. Returns title, author, full post text, external link URL, comment bodies, subreddit, and timestamps. No login required. Pay-per-result: only $0.75 per 1,000 items.",
        "version": "2.3",
        "x-build-id": "X44pxqpLCS6EpBr4A"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/santamaria-automations~reddit-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-santamaria-automations-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/santamaria-automations~reddit-scraper/runs": {
            "post": {
                "operationId": "runs-sync-santamaria-automations-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/santamaria-automations~reddit-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-santamaria-automations-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "subreddits": {
                        "title": "Subreddits",
                        "type": "array",
                        "description": "Subreddit names to scrape (no r/ prefix). Example: programming, python",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search Reddit for posts matching this query. Overrides subreddits."
                    },
                    "usernames": {
                        "title": "Usernames",
                        "type": "array",
                        "description": "Reddit usernames to scrape all posts and comments from (no u/ prefix).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sort": {
                        "title": "Sort Order",
                        "enum": [
                            "hot",
                            "new",
                            "top",
                            "rising"
                        ],
                        "type": "string",
                        "description": "How to sort results: hot, new, top, rising (or relevance for search).",
                        "default": "hot"
                    },
                    "includeComments": {
                        "title": "Include Comments",
                        "type": "boolean",
                        "description": "Fetch comments for each post. Each comment is a separate output item.",
                        "default": false
                    },
                    "commentDepth": {
                        "title": "Comment Depth",
                        "minimum": 1,
                        "maximum": 5,
                        "type": "integer",
                        "description": "Nesting depth: 1 = top-level, 2 = + replies, up to 5.",
                        "default": 3
                    },
                    "maxCommentsPerPost": {
                        "title": "Max Comments Per Post",
                        "minimum": 0,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum comments per post. 0 = no limit.",
                        "default": 100
                    },
                    "maxResults": {
                        "title": "Maximum Results",
                        "minimum": 0,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum items to return. 0 = no limit.",
                        "default": 100
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Apify proxy settings. Datacenter proxies work fine for Reddit.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
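
For projects without an official client, the `run-sync-get-dataset-items` path in the specification above can be called with any HTTP library. The sketch below only constructs the request with Python's standard library so the pieces are visible; `build_run_request` is a hypothetical helper, and actually sending the request (for example with `urllib.request.urlopen`) requires a valid token and network access.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/santamaria-automations~reddit-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The spec declares the token as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Because this endpoint waits for the run to finish, long scrapes may be better served by the `/runs` path, which returns run information immediately.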
