# Reddit Scraper (`solidcode/reddit-scraper`) Actor

\[💰 $1.0 / 1K] Extract posts, comments, users, and subreddits from Reddit. Provide subreddit names, search queries, or paste Reddit URLs (post / subreddit / user / search) — mix and match. Returns one row per record with a recordType discriminator.

- **URL**: https://apify.com/solidcode/reddit-scraper.md
- **Developed by:** [SolidCode](https://apify.com/solidcode) (community)
- **Categories:** Social media, Developer tools, Other
- **Stats:** 227 total users, 112 monthly users, 100.0% runs succeeded, 4 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Scraper

Extract posts, comments, users, and entire subreddits from Reddit in one run. Mix any combination of subreddit names, search keywords, and direct Reddit URLs — every row is tagged with a `recordType` so you can filter posts, comments, users, and communities in a single dataset.

### Why This Scraper?

- **Four record types in one run** — posts, comments, users, and subreddits, each with rich metadata
- **Mix and match sources** — feed it subreddit names, search keywords, and full Reddit URLs (post links, user profiles, search URLs) in any combination
- **Full history, beyond the first 1,000** — scrape any named community by `New` and page thousands of posts deep, well past Reddit's standard ~1,000-per-feed limit, in unbroken newest-to-oldest order. The same depth applies to any user's posts and comments
- **Deep comment trees** — walk every reply branch, with knobs for maximum comments per post and nesting depth
- **Date cutoffs** — scrape only posts or comments after a given date (great for daily refreshes)
- **NSFW toggle** — keep or filter over-18 content
- **Sensible defaults** — works out of the box with zero configuration, but every knob is exposed
- **Pay only for results** — no compute charges, no proxy bills, no surprises
- **Reliable JSON output** — every row uses stable, camelCase field names you can pipe straight into a database

### Use Cases

**Market Research & Competitive Intelligence**
- Track conversation volume around a brand, product, or competitor across communities
- Monitor sentiment in niche subreddits before launching a product
- Discover trending topics and rising subreddits by category

**Lead Generation & Audience Building**
- Find users asking buying-intent questions in your industry
- Build prospect lists from active contributors in target communities
- Identify niche influencers ranked by karma and engagement

**Brand Monitoring & PR**
- Catch mentions of your brand or product in real time using keyword searches
- Pull entire comment threads from posts that mention you
- Track sentiment and response rates over weeks or months

**Academic Research & Data Science**
- Build labeled datasets for natural-language processing or sentiment models
- Study community dynamics, moderation, and user behavior at scale
- Capture longitudinal snapshots of subreddits with date cutoffs

**Content & SEO**
- Discover the most-asked questions in any niche
- Mine high-engagement post titles for content ideas
- Track which external links perform best in target communities

### Getting Started

#### Scrape posts from a subreddit

The simplest possible run — pull the latest 100 posts from `r/programming`:

```json
{
    "subreddits": ["programming"],
    "sort": "new",
    "maxItems": 100
}
````

#### Pull a community's full history

Point at a specific subreddit name (not the `popular`/`all` aggregators), sort by `New`, and set a high `maxItems` to page far past Reddit's standard ~1,000-per-feed limit — newest to oldest, with no duplicates:

```json
{
    "subreddits": ["worldnews"],
    "sort": "new",
    "maxItems": 10000,
    "skipComments": true
}
```

#### Search across Reddit

Find the top posts of the past month matching a keyword:

```json
{
    "searches": ["best espresso machine"],
    "searchPosts": true,
    "sort": "top",
    "time": "month",
    "maxItems": 200
}
```

#### Get every comment on a specific post

Paste any Reddit post URL, and the Actor pulls the post and walks its comment tree, up to the `maxComments` and `maxCommentDepth` limits:

```json
{
    "startUrls": [
        "https://www.reddit.com/r/AskReddit/comments/1abc234/whats_your_favourite_book/"
    ],
    "skipComments": false,
    "maxComments": 500,
    "maxCommentDepth": 10
}
```

#### Scrape a user's posts and comments

Pull a user's profile, every post they've submitted, and every comment they've made:

```json
{
    "startUrls": ["https://www.reddit.com/user/spez/"],
    "sort": "new",
    "maxItems": 1000
}
```

#### Daily refresh of a subreddit

Combine a subreddit source with a date cutoff to grab only what's new since yesterday:

```json
{
    "subreddits": ["wallstreetbets"],
    "sort": "new",
    "postDateLimit": "2025-04-24",
    "skipComments": true,
    "maxItems": 0
}
```
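
For a scheduled job, the cutoff date can be computed instead of hard-coded. The following is a minimal sketch, assuming a hypothetical helper named `build_daily_refresh_input` (it is not part of the Actor or the Apify client) that produces the same input shape as the JSON above:

```python
from datetime import date, timedelta


def build_daily_refresh_input(subreddit: str, today: date) -> dict:
    """Build an Actor input that keeps only posts from yesterday onward.

    Hypothetical helper for illustration; the field names come from the
    Input Reference in this README.
    """
    yesterday = today - timedelta(days=1)
    return {
        "subreddits": [subreddit],
        "sort": "new",  # with postDateLimit, paging stops at the cutoff
        "postDateLimit": yesterday.isoformat(),  # calendar date, YYYY-MM-DD
        "skipComments": True,
        "maxItems": 0,  # 0 = unlimited; the date cutoff bounds the run
    }


run_input = build_daily_refresh_input("wallstreetbets", date(2025, 4, 25))
print(run_input["postDateLimit"])  # 2025-04-24
```

Passing `today` explicitly keeps the helper deterministic; a scheduler would typically call it with `date.today()`.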

### Input Reference

#### Sources

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `subreddits` | string\[] | `["popular"]` | Subreddit names to scrape (the `r/` prefix is optional). Each subreddit is fetched independently. |
| `searches` | string\[] | `[]` | Keywords to search across Reddit. Each keyword runs as its own query. |
| `startUrls` | URL\[] | `[]` | Reddit URLs to scrape directly — accepts subreddit, post, user-profile, and search URLs. Mix any types. |

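Because `subreddits` expects names and `startUrls` expects links, mixed user input may need sorting before a run. A minimal sketch of such a pre-processing step follows; `route_sources` is a hypothetical local helper, and the rule "anything containing `reddit.com/` or a URL scheme is a link" is an assumption of this example, not documented Actor behavior:

```python
def route_sources(values: list[str]) -> dict:
    """Split raw entries into `subreddits` and `startUrls`.

    Hypothetical pre-processing step: anything that looks like a link goes
    to `startUrls`; everything else is treated as a subreddit name, with an
    optional leading "r/" removed.
    """
    subreddits: list[str] = []
    start_urls: list[str] = []
    for raw in values:
        value = raw.strip()
        if not value:
            continue  # ignore empty lines
        lowered = value.lower()
        if lowered.startswith(("http://", "https://")) or "reddit.com/" in lowered:
            start_urls.append(value)
        else:
            name = value[2:] if lowered.startswith("r/") else value
            subreddits.append(name)
    return {"subreddits": subreddits, "startUrls": start_urls}


print(route_sources(["r/programming", "worldnews", "https://www.reddit.com/user/spez/"]))
```
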
#### Search Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `searchCommunityName` | string | | Restrict every search keyword to a single subreddit (e.g. `programming`). Leave empty to search all of Reddit. |
| `searchPosts` | boolean | `true` | Include matching posts in keyword search results. |
| `searchComments` | boolean | `false` | Include matching comments in keyword search results (see Tips — Reddit's comment-search index is limited). |
| `searchCommunities` | boolean | `false` | Include matching subreddits in keyword search results. |
| `searchUsers` | boolean | `false` | Include matching user profiles in keyword search results. |

#### Sort & Filter

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `sort` | string | `"new"` | Result ordering: `New`, `Hot`, `Top`, `Rising`, `Controversial`, `Most comments`, or `Relevance` (search only). |
| `time` | string | `"all"` | Date window for `Top` sort and keyword searches: `All time`, `Past hour`, `Past 24 hours`, `Past week`, `Past month`, `Past year`. |
| `includeNSFW` | boolean | `false` | When off (the default), posts and subreddits flagged as over-18 are filtered out. Turn on to keep adult-tagged content. |
| `postDateLimit` | string | | Earliest post date. Accepts a calendar date (`2025-04-01`), an ISO timestamp (`2025-04-01T12:00:00Z`), or a relative value (`7d`, `2 weeks`, `48 hours`, `1 month`). Only posts on or after this point are kept; when `sort=new`, pagination stops as soon as older posts are reached. |
| `commentDateLimit` | string | | Earliest comment date. Accepts a calendar date, an ISO timestamp, or a relative value (`7d`, `2 weeks`, `48 hours`). Older comments are dropped. |

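To preview which cutoff a `postDateLimit` or `commentDateLimit` value implies, the three accepted formats can be resolved locally. The Actor's own parser is not shown in this README, so the sketch below is only one plausible reading: `resolve_date_limit` is a hypothetical helper, dates without a time zone are assumed to be UTC, and a month is assumed to be 30 days.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit lengths in hours; "month" = 30 days is a guess, not documented.
_UNIT_HOURS = {"h": 1, "hour": 1, "d": 24, "day": 24, "w": 168, "week": 168, "month": 720}
# Matches values such as "7d", "2 weeks", "48 hours", "1 month".
_RELATIVE = re.compile(r"^(\d+)\s*([a-z]+?)s?$")


def resolve_date_limit(value: str, now: datetime) -> datetime:
    """Turn a date-limit string into a timezone-aware cutoff.

    `now` must be timezone-aware. Raises ValueError for unrecognized input.
    """
    text = value.strip()
    match = _RELATIVE.match(text.lower())
    if match and match.group(2) in _UNIT_HOURS:
        hours = int(match.group(1)) * _UNIT_HOURS[match.group(2)]
        return now - timedelta(hours=hours)
    # Calendar dates and ISO timestamps; a trailing "Z" is rewritten because
    # datetime.fromisoformat only accepts it from Python 3.11 onward.
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: UTC
    return parsed


now = datetime(2025, 4, 25, 9, 0, tzinfo=timezone.utc)
for sample in ("7d", "2 weeks", "48 hours", "2025-04-01", "2025-04-01T12:00:00Z"):
    print(sample, "->", resolve_date_limit(sample, now).isoformat())
```
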
#### What to Extract

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `skipComments` | boolean | `true` | When scraping a post (via URL or search), skip its comment tree and keep only the post metadata. On by default to keep runs fast — turn off to also fetch each post's comment thread (this multiplies request count and runtime). |
| `skipUserPosts` | boolean | `false` | When scraping a user profile, skip their submitted posts. |
| `skipUserComments` | boolean | `false` | When scraping a user profile, skip their comments. |
| `skipCommunityInfo` | boolean | `false` | When scraping a subreddit, omit the metadata row (member count, description) and emit only the posts. |

#### Limits

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `maxItems` | integer | `100` | Cap on total rows in the output dataset across every source. Use `0` for **truly unlimited** — keeps paging until each source's full history is exhausted (expect large datasets and longer runs). A named subreddit or user sorted by `New` returns full history well beyond 1,000 — to reach high targets, name specific communities and sort by `New` (the `popular`/`all` aggregators and keyword searches stay capped at ~1,000). |
| `maxComments` | integer | `100` | Maximum comments to fetch from each post. Use `0` to fetch the entire comment tree (hard upper bound 1,000). |
| `maxCommentDepth` | integer | `10` | Maximum nesting depth when walking a comment tree (`0` = top-level only, hard upper bound 20). |

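The hard upper bounds in the table can be checked before a run is started. This is a minimal sketch with a hypothetical `check_limits` helper; the two caps come from the table above, while rejecting negative values is an assumption of this example:

```python
MAX_COMMENTS_CAP = 1000     # hard upper bound stated for maxComments
MAX_COMMENT_DEPTH_CAP = 20  # hard upper bound stated for maxCommentDepth


def check_limits(run_input: dict) -> list[str]:
    """Return a list of problems found in the limit fields (empty if none)."""
    problems = []
    bounds = {
        "maxItems": None,  # no documented cap; 0 means unlimited
        "maxComments": MAX_COMMENTS_CAP,
        "maxCommentDepth": MAX_COMMENT_DEPTH_CAP,
    }
    for field, cap in bounds.items():
        if field not in run_input:
            continue  # omitted fields fall back to the documented defaults
        value = run_input[field]
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{field} must be an integer")
        elif value < 0:
            problems.append(f"{field} must not be negative")  # assumption
        elif cap is not None and value > cap:
            problems.append(f"{field} exceeds the hard upper bound of {cap}")
    return problems


print(check_limits({"maxItems": 0, "maxComments": 5000, "maxCommentDepth": 10}))
```
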
### Output

Every record carries a `recordType` field — `post`, `comment`, `user`, or `subreddit` — so you can filter the dataset down to just what you need.

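Once the dataset items are downloaded, the `recordType` discriminator makes it straightforward to separate the four kinds of rows. A minimal sketch, assuming the items are already loaded as a list of dictionaries (`split_by_record_type` is a hypothetical helper name):

```python
from collections import defaultdict


def split_by_record_type(items: list[dict]) -> dict[str, list[dict]]:
    """Group dataset rows by their `recordType` value."""
    groups = defaultdict(list)
    for item in items:
        # Rows without the field are kept under "unknown" rather than dropped.
        groups[item.get("recordType", "unknown")].append(item)
    return dict(groups)


rows = [
    {"recordType": "post", "id": "1abc234"},
    {"recordType": "comment", "id": "k9j2x4"},
    {"recordType": "post", "id": "1abc235"},
]
groups = split_by_record_type(rows)
print({kind: len(found) for kind, found in groups.items()})  # {'post': 2, 'comment': 1}
```
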
#### Post

```json
{
    "recordType": "post",
    "id": "1abc234",
    "fullId": "t3_1abc234",
    "url": "https://www.reddit.com/r/programming/comments/1abc234/rust_2025_release/",
    "createdAt": "2025-04-20T14:32:11Z",
    "scrapedAt": "2025-04-25T09:01:42Z",
    "sourceQuery": "r/programming",
    "title": "Rust 2025 release notes",
    "text": "The Rust team just announced...",
    "subreddit": "programming",
    "author": "rustacean99",
    "score": 4321,
    "upvoteRatio": 0.97,
    "numComments": 412,
    "permalink": "/r/programming/comments/1abc234/rust_2025_release/",
    "linkUrl": "https://blog.rust-lang.org/2025/04/20/Rust-2025.html",
    "domain": "blog.rust-lang.org",
    "flair": "News",
    "isNsfw": false,
    "isSpoiler": false,
    "isStickied": false,
    "isLocked": false,
    "isVideo": false,
    "thumbnail": "https://b.thumbs.redditmedia.com/abc.jpg"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `recordType` | string | Always `"post"` |
| `id` | string | Reddit short ID |
| `fullId` | string | Reddit fullname (`t3_<id>`) |
| `url` | string | Public Reddit URL |
| `createdAt` | string | ISO-8601 timestamp |
| `scrapedAt` | string | ISO-8601 timestamp of extraction |
| `sourceQuery` | string | Which input source produced this row |
| `title` | string | Post title |
| `text` | string | Self-text body (when present) |
| `subreddit` | string | Subreddit name |
| `author` | string | Username of the poster |
| `score` | number | Net upvotes |
| `upvoteRatio` | number | Upvote ratio (0.0–1.0) |
| `numComments` | number | Total comment count |
| `permalink` | string | Path on reddit.com |
| `linkUrl` | string | Outbound link for link posts |
| `domain` | string | Hostname of the linked URL |
| `flair` | string | Post flair text |
| `isNsfw` | boolean | Marked over-18 |
| `isSpoiler` | boolean | Marked as spoiler |
| `isStickied` | boolean | Pinned to subreddit |
| `isLocked` | boolean | Comments locked |
| `isVideo` | boolean | Native Reddit video |
| `media` | object | Media payload (video, gallery, or preview) |
| `thumbnail` | string | Thumbnail image URL |

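The `domain` and `score` fields are enough to see which external sites perform best in a community. The sketch below is one possible aggregation, using a hypothetical `score_by_domain` helper; summing `score` per domain is a choice made for this example, not a metric the Actor reports:

```python
from collections import Counter


def score_by_domain(items: list[dict]) -> list[tuple[str, int]]:
    """Sum post scores per linked domain, highest total first."""
    totals = Counter()
    for item in items:
        if item.get("recordType") == "post" and item.get("domain"):
            totals[item["domain"]] += item.get("score", 0)
    return totals.most_common()


rows = [
    {"recordType": "post", "domain": "blog.rust-lang.org", "score": 4321},
    {"recordType": "post", "domain": "github.com", "score": 150},
    {"recordType": "post", "domain": "github.com", "score": 90},
    {"recordType": "comment", "score": 187},  # ignored: not a post
]
print(score_by_domain(rows))  # [('blog.rust-lang.org', 4321), ('github.com', 240)]
```
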
#### Comment

```json
{
    "recordType": "comment",
    "id": "k9j2x4",
    "fullId": "t1_k9j2x4",
    "url": "https://www.reddit.com/r/programming/comments/1abc234/_/k9j2x4/",
    "createdAt": "2025-04-20T15:11:08Z",
    "scrapedAt": "2025-04-25T09:01:42Z",
    "sourceQuery": "r/programming",
    "body": "This is huge for the embedded space.",
    "author": "embeddev",
    "score": 187,
    "subreddit": "programming",
    "postId": "t3_1abc234",
    "postTitle": "Rust 2025 release notes",
    "parentId": "t3_1abc234",
    "depth": 0,
    "isStickied": false,
    "isSubmitter": false,
    "flair": "Senior Engineer"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `recordType` | string | Always `"comment"` |
| `id` | string | Reddit short ID |
| `fullId` | string | Reddit fullname (`t1_<id>`) |
| `url` | string | Direct link to the comment |
| `createdAt` | string | ISO-8601 timestamp |
| `scrapedAt` | string | ISO-8601 timestamp of extraction |
| `sourceQuery` | string | Which input source produced this row |
| `body` | string | Comment text |
| `author` | string | Comment author username |
| `score` | number | Net upvotes |
| `subreddit` | string | Subreddit name |
| `postId` | string | Parent post fullname |
| `postTitle` | string | Parent post title |
| `parentId` | string | Parent comment or post fullname |
| `depth` | number | Nesting depth (0 = top-level) |
| `permalink` | string | Path on reddit.com |
| `isStickied` | boolean | Pinned to top of thread |
| `isSubmitter` | boolean | Author is the original poster |
| `flair` | string | Author flair text |

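Comment rows arrive flat, but `fullId` and `parentId` carry enough information to rebuild the thread. A minimal sketch, assuming all comments belong to the same post; `build_comment_tree` and the added `replies` key are hypothetical names introduced for this example:

```python
def build_comment_tree(comments: list[dict]) -> list[dict]:
    """Nest flat comment rows under their parents using fullId/parentId.

    Returns the top-level comments; each node gains a `replies` list.
    """
    nodes = {c["fullId"]: {**c, "replies": []} for c in comments}
    roots = []
    for node in nodes.values():
        parent = nodes.get(node.get("parentId"))
        if parent is not None:
            parent["replies"].append(node)
        else:
            # The parent is the post itself (t3_...) or a comment that was
            # not fetched, for example because of maxComments.
            roots.append(node)
    return roots


comments = [
    {"fullId": "t1_a", "parentId": "t3_1abc234", "body": "top-level"},
    {"fullId": "t1_b", "parentId": "t1_a", "body": "reply"},
    {"fullId": "t1_c", "parentId": "t1_b", "body": "nested reply"},
    {"fullId": "t1_d", "parentId": "t3_1abc234", "body": "second top-level"},
]
tree = build_comment_tree(comments)
print(len(tree), tree[0]["replies"][0]["replies"][0]["body"])  # 2 nested reply
```

Comments whose parent is missing from the dataset are promoted to the top level so that no row is lost.
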
#### User

```json
{
    "recordType": "user",
    "id": "abc123",
    "fullId": "t2_abc123",
    "url": "https://www.reddit.com/user/rustacean99/",
    "createdAt": "2014-08-12T03:22:55Z",
    "scrapedAt": "2025-04-25T09:01:42Z",
    "sourceQuery": "u/rustacean99",
    "username": "rustacean99",
    "displayName": "Rust Enthusiast",
    "linkKarma": 28412,
    "commentKarma": 91204,
    "totalKarma": 119616,
    "isGold": true,
    "isMod": false,
    "verified": true,
    "description": "Open-source contributor. Rust, embedded, systems.",
    "iconUrl": "https://styles.redditmedia.com/abc.png",
    "bannerUrl": "https://styles.redditmedia.com/banner.jpg"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `recordType` | string | Always `"user"` |
| `id` | string | Reddit short ID |
| `fullId` | string | Reddit fullname (`t2_<id>`) |
| `url` | string | Public profile URL |
| `createdAt` | string | Account creation timestamp |
| `scrapedAt` | string | ISO-8601 timestamp of extraction |
| `sourceQuery` | string | Which input source produced this row |
| `username` | string | Reddit username |
| `displayName` | string | Profile display name |
| `linkKarma` | number | Post karma |
| `commentKarma` | number | Comment karma |
| `totalKarma` | number | Combined karma |
| `isGold` | boolean | Premium subscriber |
| `isMod` | boolean | Moderates one or more subreddits |
| `verified` | boolean | Email-verified account |
| `description` | string | Profile bio |
| `iconUrl` | string | Avatar image URL |
| `bannerUrl` | string | Profile banner URL |

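For use cases such as ranking contributors by karma, user rows can be sorted on `totalKarma`. A minimal sketch with a hypothetical `rank_users_by_karma` helper; the `min_total` threshold is an illustrative parameter, not an Actor input:

```python
def rank_users_by_karma(items: list[dict], min_total: int = 0) -> list[dict]:
    """Return user rows with at least `min_total` karma, highest first."""
    users = [
        item for item in items
        if item.get("recordType") == "user"
        and item.get("totalKarma", 0) >= min_total
    ]
    return sorted(users, key=lambda user: user.get("totalKarma", 0), reverse=True)


rows = [
    {"recordType": "user", "username": "embeddev", "totalKarma": 5400},
    {"recordType": "user", "username": "rustacean99", "totalKarma": 119616},
    {"recordType": "post", "title": "Rust 2025 release notes"},
]
print([u["username"] for u in rank_users_by_karma(rows, min_total=1000)])
# ['rustacean99', 'embeddev']
```
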
#### Subreddit

```json
{
    "recordType": "subreddit",
    "id": "2qh16",
    "fullId": "t5_2qh16",
    "url": "https://www.reddit.com/r/programming/",
    "createdAt": "2008-01-25T03:00:24Z",
    "scrapedAt": "2025-04-25T09:01:42Z",
    "sourceQuery": "r/programming",
    "name": "programming",
    "displayName": "r/programming",
    "title": "Computer Programming",
    "publicDescription": "Computer Programming",
    "subscribers": 6342118,
    "activeUsers": 2841,
    "isNsfw": false,
    "lang": "en",
    "iconUrl": "https://styles.redditmedia.com/community.png",
    "bannerUrl": "https://styles.redditmedia.com/banner.jpg"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `recordType` | string | Always `"subreddit"` |
| `id` | string | Reddit short ID |
| `fullId` | string | Reddit fullname (`t5_<id>`) |
| `url` | string | Public subreddit URL |
| `createdAt` | string | Subreddit creation timestamp |
| `scrapedAt` | string | ISO-8601 timestamp of extraction |
| `sourceQuery` | string | Which input source produced this row |
| `name` | string | Subreddit name without prefix |
| `displayName` | string | Display name with `r/` prefix |
| `title` | string | Full title |
| `description` | string | Long-form description (markdown) |
| `publicDescription` | string | Short tagline |
| `subscribers` | number | Member count |
| `activeUsers` | number | Currently online members |
| `isNsfw` | boolean | Over-18 community |
| `lang` | string | Primary language code |
| `iconUrl` | string | Community icon URL |
| `bannerUrl` | string | Community banner URL |

### Tips for Best Results

- **Comment search returns recent posts, not standalone comments.** Reddit's public comment-search index has been limited since 2021, so the `searchComments` option (the `Search returns comments` toggle) may return zero rows for many queries. To find comments containing a keyword reliably, search for posts with `skipComments` set to `false` and let the Actor walk each post's comment tree.
- **Named communities go far past 1,000 — aggregators don't.** A specific subreddit (e.g. `worldnews`) sorted by `New` returns its full history, tens of thousands of posts deep in exact newest-to-oldest order, and the same goes for any user's posts and comments. The `popular` and `all` feeds are aggregators that genuinely cap at ~1,000 — name the communities you care about to go deeper. (Keyword searches are also capped at ~1,000.)
- **Very old posts show historical vote counts.** Scores and comment counts on deep-history posts reflect a snapshot from close to when they were first published, so they can read lower than today's live totals. The most recent results always carry live numbers.
- **Use `sort: "new"` with `postDateLimit` for clean refreshes.** With this combination the Actor stops paginating as soon as it hits the cutoff date, so no requests are wasted.
- **Set `maxComments: 0` only when you really need every reply.** Popular AskReddit threads can have 50,000+ comments; capping at a few hundred is usually plenty.
- **Mix sources in one run.** Combine `subreddits`, `searches`, and `startUrls` in a single input to consolidate data into one dataset and pay one bill.
- **NSFW is filtered out by default.** Over-18 posts and subreddits are dropped unless you turn `includeNSFW` on.

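The first tip above implies a client-side step: after searching for posts and fetching their comment trees, the comment rows still have to be filtered for the keyword. A minimal sketch of that step; `comments_mentioning` is a hypothetical helper, and the case-insensitive substring match is a choice made for this example:

```python
def comments_mentioning(items: list[dict], keyword: str) -> list[dict]:
    """Keep only comment rows whose `body` contains the keyword."""
    needle = keyword.casefold()
    return [
        item for item in items
        if item.get("recordType") == "comment"
        and needle in (item.get("body") or "").casefold()
    ]


rows = [
    {"recordType": "post", "title": "Best espresso machine under $500?"},
    {"recordType": "comment", "body": "My espresso setup cost far less."},
    {"recordType": "comment", "body": "Try a moka pot instead."},
]
print(len(comments_mentioning(rows, "Espresso")))  # 1
```
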
### Pricing

**$1.00 per 1,000 results** — pay only for the rows you receive.

| Results | Cost |
|---------|------|
| 100 | $0.10 |
| 1,000 | $1.00 |
| 10,000 | $10.00 |

**No compute charges — you only pay per result returned.** Storage, proxies, and platform fees are included.

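The table follows directly from the per-result price, so a run can be budgeted up front. The sketch below assumes the flat $1.00 per 1,000 results shown here (the Store listing says "from $1.00", so the actual rate may differ) and rounds to whole cents, which is an assumption about billing. Both function names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("1.00")  # USD, as listed in this README


def estimate_cost(result_count: int) -> Decimal:
    """Estimated cost in USD for a given number of result rows."""
    return (PRICE_PER_1000 * result_count / 1000).quantize(Decimal("0.01"))


def max_items_for_budget(budget: Decimal) -> int:
    """Largest `maxItems` value that stays within the budget."""
    items = int(budget / PRICE_PER_1000 * 1000)
    if items <= 0:
        # maxItems = 0 means unlimited, so it must never be used as a cap.
        raise ValueError("budget is too small to pay for a single result")
    return items


print(estimate_cost(100), estimate_cost(10_000))  # 0.10 10.00
print(max_items_for_budget(Decimal("2.50")))      # 2500
```

The explicit error matters because `maxItems: 0` is the unlimited setting; a budget that rounds down to zero rows should stop the run rather than remove the cap.
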
### Integrations

Export results in JSON, CSV, Excel, XML, or RSS. Connect to 1,500+ apps:

- **Apify API** — Full programmatic access to runs and datasets
- **Webhooks** — Get notified the moment a run completes
- **Google Sheets** — Direct spreadsheet export
- **Zapier** / **Make** / **n8n** — Workflow automation across thousands of apps
- **Slack** / **Email** — Notifications on new results

### Legal & Ethical Use

This Actor extracts publicly available information from Reddit for legitimate research, monitoring, and analytics purposes. You are responsible for complying with Reddit's Terms of Service, the Reddit User Agreement, and all applicable laws, including the GDPR, CCPA, and other privacy regulations.

Do not use the extracted data for spam, harassment, doxxing, training models that violate Reddit's policies, or any unlawful purpose. Avoid collecting personal data on private individuals, and respect any user who has requested removal of their content. When in doubt, treat the data the same way Reddit's own product would.

# Actor input Schema

## `subreddits` (type: `array`):

Subreddit NAMES only — e.g. 'askspain' or 'r/askspain', one per line (the 'r/' prefix is optional). Do NOT paste full links here; put any reddit.com/... URL in the 'Reddit URLs' field below instead. Each subreddit is fetched independently using the sort and time settings below. Tip: a specific subreddit name sorted by New returns full history (far past 1,000 posts), while the r/popular and r/all aggregator feeds cap at about 1,000 — name the communities you care about to collect more. Leave empty if you only want search keywords or URLs. If every source is empty, the run defaults to r/popular.

## `searches` (type: `array`):

Search TERMS only — e.g. 'climate change' or 'best espresso machine', one per line. Do NOT paste a search URL here; put any reddit.com/search?q=... or reddit.com/r/.../search?q=... link in the 'Reddit URLs' field below instead. Each keyword runs independently. Use the 'Search returns' toggles below to choose posts, comments, communities, users, or any mix.

## `startUrls` (type: `array`):

Paste Reddit links here (this is the right field for any URL). Accepts subreddit URLs (reddit.com/r/...), post URLs (reddit.com/r/.../comments/...), user profiles (reddit.com/user/...), and search URLs (reddit.com/search?q=... or reddit.com/r/<sub>/search?q=...). Mix any types in one list. Tip: a search URL already carries its own keyword, sort, and time window — those override the global Sort/Time settings for that link.

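To check in advance which kind of record a link is likely to produce, the four URL shapes listed above can be told apart by their path. This is a minimal local sketch; `classify_reddit_url` is a hypothetical helper and does not reflect how the Actor itself parses URLs:

```python
from urllib.parse import urlparse


def classify_reddit_url(url: str) -> str:
    """Return 'post', 'subreddit', 'user', 'search', or 'unknown'."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host != "reddit.com" and not host.endswith(".reddit.com"):
        return "unknown"
    parts = [part for part in parsed.path.split("/") if part]
    if parts[:1] == ["search"]:
        return "search"  # reddit.com/search?q=...
    if parts[:1] == ["user"] and len(parts) >= 2:
        return "user"  # reddit.com/user/<name>
    if parts[:1] == ["r"] and len(parts) >= 2:
        if len(parts) >= 4 and parts[2] == "comments":
            return "post"  # reddit.com/r/<sub>/comments/<id>/...
        if len(parts) >= 3 and parts[2] == "search":
            return "search"  # reddit.com/r/<sub>/search?q=...
        return "subreddit"  # reddit.com/r/<sub>
    return "unknown"


for link in (
    "https://www.reddit.com/r/AskReddit/comments/1abc234/whats_your_favourite_book/",
    "https://www.reddit.com/user/spez/",
    "https://www.reddit.com/r/programming/search?q=rust",
):
    print(classify_reddit_url(link))
```
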
## `searchCommunityName` (type: `string`):

Optional. If set, every search keyword above will be restricted to this single subreddit (e.g., 'programming'). Leave empty to search all of Reddit. The 'r/' prefix is optional.

## `searchPosts` (type: `boolean`):

Include matching posts in keyword search results.

## `searchComments` (type: `boolean`):

Include matching comments in keyword search results. Note: Reddit's public comment-search index has been limited since 2021, so this option may return zero rows for many queries. To extract comments containing a keyword reliably, search for posts and let the Actor walk each post's comment tree (see Maximum comments per post).

## `searchCommunities` (type: `boolean`):

Include matching subreddits in keyword search results.

## `searchUsers` (type: `boolean`):

Include matching user profiles in keyword search results.

## `sort` (type: `string`):

Result ordering. 'Hot' surfaces trending posts, 'New' shows the most recent, 'Top' shows the highest-scoring within the time window, 'Rising' is gaining momentum, 'Controversial' surfaces polarizing posts, 'Comments' is most-commented, 'Relevance' applies only to keyword searches. Pick 'New' to collect the most data — sorting a named subreddit (or user) by New pages through its full history, far past the ~1,000 limit that the other sorts hit.

## `time` (type: `string`):

Date window for 'Top' sort and for keyword searches. Other sorts ignore this. 'All time' returns the unfiltered ranking.

## `includeNSFW` (type: `boolean`):

When off (the default), posts and subreddits flagged as over-18 are filtered out of the results — safer for general use. Turn on if you specifically want adult-tagged content.

## `postDateLimit` (type: `string`):

Only include posts created on or after this date. Accepts a calendar date (2025-04-01), an ISO timestamp (2025-04-01T12:00:00Z), or a relative value such as '7d', '2 weeks', '48 hours', '1 month'. When sort=new, pagination stops as soon as older posts are reached. Leave empty for no limit.

## `commentDateLimit` (type: `string`):

Only include comments created on or after this date. Accepts a calendar date (2025-04-01), an ISO timestamp (2025-04-01T12:00:00Z), or a relative value such as '7d', '2 weeks', '48 hours'. Leave empty for no limit.

## `skipComments` (type: `boolean`):

When scraping a post (via URL or search), do NOT also fetch its comment tree. On by default to keep runs fast — turn off if you also want each post's comment thread (this multiplies request count and runtime).

## `skipUserPosts` (type: `boolean`):

When scraping a user profile, skip their submitted posts.

## `skipUserComments` (type: `boolean`):

When scraping a user profile, skip their comments.

## `skipCommunityInfo` (type: `boolean`):

When scraping a subreddit, omit the metadata row (member count, description, etc.) and only emit the posts.

## `maxItems` (type: `integer`):

Cap on total rows in the output dataset across every source. Use 0 for truly unlimited — the run keeps paging until each source's full history is exhausted (a named subreddit or user sorted by New can run very deep, so expect large datasets and longer runs). To reach high targets, name a specific subreddit (or user) and sort by New — that returns full history, well beyond 1,000. The r/popular and r/all aggregator feeds and keyword searches stay capped at about 1,000, so prefer specific community names when you want depth.

## `maxComments` (type: `integer`):

How many comments to fetch from each post. Use 0 to fetch the entire comment tree (can balloon for popular threads). Hard upper bound is 1,000.

## `maxCommentDepth` (type: `integer`):

Maximum nesting depth when walking a comment tree (0 = only top-level comments, 10 = ten levels of replies). Caps how deep the Actor follows nested reply chains. Hard upper bound is 20.

## Actor input object example

```json
{
  "subreddits": [
    "popular"
  ],
  "searches": [],
  "startUrls": [],
  "searchPosts": true,
  "searchComments": false,
  "searchCommunities": false,
  "searchUsers": false,
  "sort": "new",
  "time": "all",
  "includeNSFW": false,
  "skipComments": true,
  "skipUserPosts": false,
  "skipUserComments": false,
  "skipCommunityInfo": false,
  "maxItems": 100,
  "maxComments": 100,
  "maxCommentDepth": 10
}
```

# Actor output Schema

## `overview` (type: `string`):

Every scraped Reddit row in one table — filterable by recordType.

## `posts` (type: `string`):

Just the post rows, ranked by score and comment count.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "subreddits": [
        "popular"
    ],
    "searches": [],
    "startUrls": [],
    "searchPosts": true,
    "searchComments": false,
    "searchCommunities": false,
    "searchUsers": false,
    "sort": "new",
    "time": "all",
    "includeNSFW": false,
    "skipComments": true,
    "skipUserPosts": false,
    "skipUserComments": false,
    "skipCommunityInfo": false,
    "maxItems": 100,
    "maxComments": 100,
    "maxCommentDepth": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("solidcode/reddit-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "subreddits": ["popular"],
    "searches": [],
    "startUrls": [],
    "searchPosts": True,
    "searchComments": False,
    "searchCommunities": False,
    "searchUsers": False,
    "sort": "new",
    "time": "all",
    "includeNSFW": False,
    "skipComments": True,
    "skipUserPosts": False,
    "skipUserComments": False,
    "skipCommunityInfo": False,
    "maxItems": 100,
    "maxComments": 100,
    "maxCommentDepth": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("solidcode/reddit-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
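
Where installing `apify-client` is not an option, the synchronous endpoint listed in the OpenAPI specification (`/acts/solidcode~reddit-scraper/run-sync-get-dataset-items`) can be called with the standard library alone. The following is a minimal sketch: the helper names and the 300-second timeout are assumptions of this example, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/solidcode~reddit-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the parsed response body.

    Not called in this example because it needs a real token and network
    access. The timeout value is an assumption; long runs may need more.
    """
    request = build_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_sync_request("<YOUR_API_TOKEN>", {"subreddits": ["programming"], "maxItems": 10})
print(request.get_method(), request.full_url)
```

Separating `build_sync_request` from `run_sync` lets the request be inspected or logged before anything is sent.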

## CLI example

```bash
echo '{
  "subreddits": [
    "popular"
  ],
  "searches": [],
  "startUrls": [],
  "searchPosts": true,
  "searchComments": false,
  "searchCommunities": false,
  "searchUsers": false,
  "sort": "new",
  "time": "all",
  "includeNSFW": false,
  "skipComments": true,
  "skipUserPosts": false,
  "skipUserComments": false,
  "skipCommunityInfo": false,
  "maxItems": 100,
  "maxComments": 100,
  "maxCommentDepth": 10
}' |
apify call solidcode/reddit-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=solidcode/reddit-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Scraper",
        "description": "[💰 $1.0 / 1K] Extract posts, comments, users, and subreddits from Reddit. Provide subreddit names, search queries, or paste Reddit URLs (post / subreddit / user / search) — mix and match. Returns one row per record with a recordType discriminator.",
        "version": "1.1",
        "x-build-id": "dbB7qOOhD50wdvGgT"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/solidcode~reddit-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-solidcode-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/solidcode~reddit-scraper/runs": {
            "post": {
                "operationId": "runs-sync-solidcode-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/solidcode~reddit-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-solidcode-reddit-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "subreddits": {
                        "title": "Subreddits",
                        "type": "array",
                        "description": "Subreddit NAMES only — e.g. 'askspain' or 'r/askspain', one per line (the 'r/' prefix is optional). Do NOT paste full links here; put any reddit.com/... URL in the 'Reddit URLs' field below instead. Each subreddit is fetched independently using the sort and time settings below. Tip: a specific subreddit name sorted by New returns full history (far past 1,000 posts), while the r/popular and r/all aggregator feeds cap at about 1,000 — name the communities you care about to collect more. Leave empty if you only want search keywords or URLs. If every source is empty, the run defaults to r/popular.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searches": {
                        "title": "Search Keywords",
                        "type": "array",
                        "description": "Search TERMS only — e.g. 'climate change' or 'best espresso machine', one per line. Do NOT paste a search URL here; put any reddit.com/search?q=... or reddit.com/r/.../search?q=... link in the 'Reddit URLs' field below instead. Each keyword runs independently. Use the 'Search returns' toggles below to choose posts, comments, communities, users, or any mix.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "startUrls": {
                        "title": "Reddit URLs",
                        "type": "array",
                        "description": "Paste Reddit links here (this is the right field for any URL). Accepts subreddit URLs (reddit.com/r/...), post URLs (reddit.com/r/.../comments/...), user profiles (reddit.com/user/...), and search URLs (reddit.com/search?q=... or reddit.com/r/<sub>/search?q=...). Mix any types in one list. Tip: a search URL already carries its own keyword, sort, and time window — those override the global Sort/Time settings for that link.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchCommunityName": {
                        "title": "Restrict Search to a Community",
                        "type": "string",
                        "description": "Optional. If set, every search keyword above will be restricted to this single subreddit (e.g., 'programming'). Leave empty to search all of Reddit. The 'r/' prefix is optional."
                    },
                    "searchPosts": {
                        "title": "Search returns posts",
                        "type": "boolean",
                        "description": "Include matching posts in keyword search results.",
                        "default": true
                    },
                    "searchComments": {
                        "title": "Search returns comments",
                        "type": "boolean",
                        "description": "Include matching comments in keyword search results. Note: Reddit's public comment-search index has been limited since 2021, so this option may return zero rows for many queries. To extract comments containing a keyword reliably, search for posts and let the Actor walk each post's comment tree (see 'Maximum comments per post'; this also requires turning off 'Skip comments on posts').",
                        "default": false
                    },
                    "searchCommunities": {
                        "title": "Search returns communities",
                        "type": "boolean",
                        "description": "Include matching subreddits in keyword search results.",
                        "default": false
                    },
                    "searchUsers": {
                        "title": "Search returns users",
                        "type": "boolean",
                        "description": "Include matching user profiles in keyword search results.",
                        "default": false
                    },
                    "sort": {
                        "title": "Sort",
                        "enum": [
                            "new",
                            "hot",
                            "top",
                            "rising",
                            "controversial",
                            "relevance",
                            "comments"
                        ],
                        "type": "string",
                        "description": "Result ordering. 'Hot' surfaces trending posts, 'New' shows the most recent, 'Top' shows the highest-scoring within the time window, 'Rising' shows posts gaining momentum, 'Controversial' surfaces polarizing posts, 'Comments' shows the most-commented, and 'Relevance' applies only to keyword searches. Pick 'New' to collect the most data: sorting a named subreddit (or user) by New pages through its full history, far past the ~1,000 limit that the other sorts hit.",
                        "default": "new"
                    },
                    "time": {
                        "title": "Time window",
                        "enum": [
                            "all",
                            "hour",
                            "day",
                            "week",
                            "month",
                            "year"
                        ],
                        "type": "string",
                        "description": "Date window for 'Top' sort and for keyword searches. Other sorts ignore this. 'All time' returns the unfiltered ranking.",
                        "default": "all"
                    },
                    "includeNSFW": {
                        "title": "Include NSFW content",
                        "type": "boolean",
                        "description": "When off (the default), posts and subreddits flagged as over-18 are filtered out of the results — safer for general use. Turn on if you specifically want adult-tagged content.",
                        "default": false
                    },
                    "postDateLimit": {
                        "title": "Earliest post date",
                        "type": "string",
                        "description": "Only include posts created on or after this date. Accepts a calendar date (2025-04-01), an ISO timestamp (2025-04-01T12:00:00Z), or a relative value such as '7d', '2 weeks', '48 hours', '1 month'. When sort=new, pagination stops as soon as older posts are reached. Leave empty for no limit."
                    },
                    "commentDateLimit": {
                        "title": "Earliest comment date",
                        "type": "string",
                        "description": "Only include comments created on or after this date. Accepts a calendar date (2025-04-01), an ISO timestamp (2025-04-01T12:00:00Z), or a relative value such as '7d', '2 weeks', '48 hours'. Leave empty for no limit."
                    },
                    "skipComments": {
                        "title": "Skip comments on posts",
                        "type": "boolean",
                        "description": "When scraping a post (via URL or search), do NOT also fetch its comment tree. On by default to keep runs fast — turn off if you also want each post's comment thread (this multiplies request count and runtime).",
                        "default": true
                    },
                    "skipUserPosts": {
                        "title": "Skip user's posts",
                        "type": "boolean",
                        "description": "When scraping a user profile, skip their submitted posts.",
                        "default": false
                    },
                    "skipUserComments": {
                        "title": "Skip user's comments",
                        "type": "boolean",
                        "description": "When scraping a user profile, skip their comments.",
                        "default": false
                    },
                    "skipCommunityInfo": {
                        "title": "Skip subreddit info row",
                        "type": "boolean",
                        "description": "When scraping a subreddit, omit the metadata row (member count, description, etc.) and only emit the posts.",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Maximum results (total)",
                        "minimum": 0,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Cap on total rows in the output dataset across every source. Use 0 for truly unlimited — the run keeps paging until each source's full history is exhausted (a named subreddit or user sorted by New can run very deep, so expect large datasets and longer runs). To reach high targets, name a specific subreddit (or user) and sort by New — that returns full history, well beyond 1,000. The r/popular and r/all aggregator feeds and keyword searches stay capped at about 1,000, so prefer specific community names when you want depth.",
                        "default": 100
                    },
                    "maxComments": {
                        "title": "Maximum comments per post",
                        "minimum": 0,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "How many comments to fetch from each post. Use 0 to fetch the entire comment tree (can balloon for popular threads). Hard upper bound is 1,000.",
                        "default": 100
                    },
                    "maxCommentDepth": {
                        "title": "Maximum comment depth",
                        "minimum": 0,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Maximum nesting depth when walking a comment tree (0 = only top-level comments, 10 = ten levels of replies). Caps how deep the Actor follows nested reply chains. Hard upper bound is 20.",
                        "default": 10
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
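
The specification above defines POST endpoints that take the Apify token as a `token` query parameter and the Actor input as a JSON body. The following is a minimal sketch, using only the Python standard library, of how a caller might check an input against the enums and integer bounds listed in `inputSchema` and then call `run-sync-get-dataset-items`. The helper names `build_input`, `build_url`, and `run_sync_get_items` are hypothetical and not part of any Apify library, and the sketch assumes the endpoint responds with JSON.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # from the "servers" entry in the spec
ACTOR_PATH = "/acts/solidcode~reddit-scraper/run-sync-get-dataset-items"

# Enum values and integer bounds copied from inputSchema above.
SORT_VALUES = {"new", "hot", "top", "rising", "controversial", "relevance", "comments"}
TIME_VALUES = {"all", "hour", "day", "week", "month", "year"}
INT_BOUNDS = {
    "maxItems": (0, 1_000_000),
    "maxComments": (0, 1000),
    "maxCommentDepth": (0, 20),
}


def build_input(**fields):
    """Return a JSON-serializable input dict, rejecting out-of-range values."""
    if "sort" in fields and fields["sort"] not in SORT_VALUES:
        raise ValueError("unsupported sort: %r" % (fields["sort"],))
    if "time" in fields and fields["time"] not in TIME_VALUES:
        raise ValueError("unsupported time window: %r" % (fields["time"],))
    for name, (low, high) in INT_BOUNDS.items():
        if name in fields and not low <= fields[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return dict(fields)


def build_url(token):
    """Build the endpoint URL with the token passed as a query parameter."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}{ACTOR_PATH}?{query}"


def run_sync_get_items(token, actor_input, timeout=300):
    """POST the input and return the parsed response.

    Calling this starts a real run, which is billed per result.
    Assumption: the response body is JSON.
    """
    request = urllib.request.Request(
        build_url(token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call might look like `run_sync_get_items("<YOUR_API_TOKEN>", build_input(subreddits=["askspain"], sort="new", maxItems=50))`. Validating locally is optional, but it can surface an out-of-range `maxCommentDepth` or a misspelled `sort` value before a run is started.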

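The `/runs` endpoint returns run metadata shaped like `runsResponseSchema` rather than the results themselves. The sketch below shows one way to pull out the fields a caller would typically need to follow up on a run. The helper names are hypothetical, the sample IDs are made up, and the `/datasets/{id}/items` path is an assumption taken from the general Apify API rather than from the spec above, so it is worth verifying before relying on it.

```python
import json

API_BASE = "https://api.apify.com/v2"  # from the "servers" entry in the spec


def summarize_run(body):
    """Extract follow-up fields from a dict shaped like runsResponseSchema."""
    data = body.get("data", {})
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        "dataset_id": data.get("defaultDatasetId"),
    }


def dataset_items_url(dataset_id):
    # Assumption: dataset items are served under this path; it is not
    # defined in the OpenAPI document above.
    return f"{API_BASE}/datasets/{dataset_id}/items"


# Hand-written sample shaped like runsResponseSchema; the IDs are invented.
sample = json.loads(
    '{"data": {"id": "run123", "status": "READY", "defaultDatasetId": "ds456"}}'
)
summary = summarize_run(sample)
print(summary)
print(dataset_items_url(summary["dataset_id"]))
```

Using `.get()` throughout means a response with missing fields yields `None` values instead of raising, which keeps polling code simple at the cost of requiring an explicit check before the dataset URL is built.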