# Reddit API Scraper (`simpleapi/reddit-api-scraper`) Actor

Reddit API Scraper collects data from Reddit posts, comments, and subreddits using API-based extraction. Gather post titles, text, usernames, scores, timestamps, and engagement metrics to analyze trends, monitor discussions, or build datasets for research, marketing, and insights. 📊💬

- **URL**: https://apify.com/simpleapi/reddit-api-scraper.md
- **Developed by:** [SimpleAPI](https://apify.com/simpleapi) (community)
- **Categories:** Social media, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$19.99/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan is.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit API Scraper

**Reddit API Scraper** is an Apify Actor that extracts data from Reddit by keyword search. It uses Reddit’s public search API and returns posts in a structured format. No login is required. You can use it as a **Reddit scraper** or as an **alternative to the Reddit API** for keyword-based search.

---

### Why Choose Us?

- **No proxy by default** – Sends requests directly to Reddit; uses proxy only when blocked.
- **Automatic proxy fallback** – If Reddit blocks the request, the Actor falls back to datacenter proxy, then to residential proxy (with retries), and sticks with residential for the rest of the run.
- **Bulk keywords** – Search multiple keywords in one run.
- **Same output shape** – Output is a single JSON object: keys = keywords, values = arrays of posts (same structure as the reference `output.json`).

---

### Key Features

| Feature | Description |
|--------|-------------|
| **Search by keyword** | One or more search terms (bulk input). |
| **Multiple strategies** | Uses several sort strategies (new, relevance, hot, top, etc.) to maximize results. |
| **Rate limiting** | Delays and semaphores to reduce blocking. |
| **Retries** | Up to 3 retries with exponential backoff; special handling for 403. |
| **Proxy fallback** | No proxy → datacenter → residential, with clear logging. |
| **Structured output** | Each post includes `metaData.keyword`, `id`, `subreddit`, `title`, `author`, `permalink`, `url`, `selftext`, and other Reddit fields. |

---

### Input

Configure the Actor with these inputs (Form or JSON in Apify Console).

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| **Search keywords** | array (stringList) | Yes | Keywords to search on Reddit (e.g. `webscraping`, `python`). Supports bulk edit. |
| **Subreddit names** | array (stringList) | No | Optional subreddits to limit search. |
| **Results limit per keyword** | integer | No | Max posts per keyword (1–1000; default: 10). |
| **Sorting** | string | No | Sort order: `new`, `hot`, `top`, `relevance` (default: `new`). |
| **Proxy Configuration** | object (proxy) | No | By default no proxy. Enable Apify Proxy if you want to force proxy from the start. Fallback (datacenter → residential) runs when Reddit blocks. |

#### Example input (JSON)

```json
{
  "searchKeywords": ["webscraping", "python"],
  "subredditNames": [],
  "resultsLimitPerKeyword": 5,
  "sorting": "new",
  "proxyConfiguration": { "useApifyProxy": false }
}
````

***

### Output

The dataset contains **one item**: a JSON object where each key is a **keyword** and each value is an **array of post objects**. Same structure as the reference `output.json`.

#### Example output structure

```json
{
  "webscraping": [
    {
      "metaData": { "keyword": "webscraping" },
      "id": "abc123",
      "subreddit": "Python",
      "selftext": "...",
      "author_fullname": "t2_xxx",
      "title": "Post title",
      "subreddit_name_prefixed": "r/Python",
      "name": "t3_abc123",
      "link_flair_text_color": "dark",
      "subreddit_type": "public",
      "thumbnail": "self",
      "link_flair_type": "text",
      "author_flair_type": "text",
      "domain": "self.Python",
      "selftext_html": "...",
      "subreddit_id": "t5_xxx",
      "author": "username",
      "permalink": "/r/Python/comments/...",
      "url": "https://www.reddit.com/..."
    }
  ],
  "python": [ ... ]
}
```

| Field | Description |
|-------|-------------|
| `metaData.keyword` | Search keyword for this post. |
| `id` | Reddit post ID. |
| `subreddit` | Subreddit name. |
| `title` | Post title. |
| `author` | Author username. |
| `permalink` | Relative link to the post. |
| `url` | Full URL. |
| `selftext` | Post body text. |
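
Because the single dataset item nests posts under their keyword, downstream code usually needs to flatten it first. The following is a minimal sketch of that step, assuming the item has the shape shown in the example above. The `flatten_posts` helper and the `post_link` column are hypothetical names introduced here, and prefixing `https://www.reddit.com` to the relative `permalink` is an assumption rather than something the Actor does for you.

```python
from typing import Any, Dict, Iterator, List


def flatten_posts(item: Dict[str, List[Dict[str, Any]]]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per post from a {keyword: [post, ...]} dataset item."""
    for keyword, posts in item.items():
        for post in posts:
            yield {
                "keyword": keyword,
                "id": post.get("id"),
                "subreddit": post.get("subreddit"),
                "title": post.get("title"),
                "author": post.get("author"),
                # `permalink` is relative, so prefix the Reddit host (assumption).
                "post_link": "https://www.reddit.com" + post.get("permalink", ""),
            }


# Trimmed sample shaped like the example output; all values are placeholders.
sample_item = {
    "webscraping": [
        {
            "id": "abc123",
            "subreddit": "Python",
            "title": "Post title",
            "author": "username",
            "permalink": "/r/Python/comments/abc123/post_title/",
        },
    ],
    "python": [],
}

rows = list(flatten_posts(sample_item))
print(rows[0]["post_link"])
```

Rows in this form can be written straight to CSV or loaded into a data frame; keywords with no results simply contribute no rows.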

***

### How to Use the Actor (via Apify Console)

1. Log in at <https://console.apify.com> and go to **Actors**.
2. Find **Reddit API Scraper** (or `reddit-api-scraper`) and open it.
3. Open the **Input** tab (Form or JSON).
4. Enter **Search keywords** (e.g. `webscraping`; add more with **+ Add** or **Bulk edit**).
5. Optionally set **Results limit per keyword**, **Sorting**, and **Proxy Configuration**.
6. Click **Start**.
7. Watch **Log** for progress and proxy fallback messages.
8. Open the **Output** tab to see the dataset (one item = object of keywords → posts).
9. Export to JSON or use via API.

***

### Best Use Cases

- Monitoring Reddit for keywords (brand, product, topic).
- Research or sentiment analysis on public discussions.
- Building datasets of Reddit posts by topic.
- Alternative to the Reddit API for simple search-based scraping.

***

### Frequently Asked Questions

**Do I need a Reddit API key?**\
No. The Actor uses Reddit’s public search endpoint; no authentication is required.

**Why did it switch to proxy?**\
If you see “Falling back to datacenter/residential proxy” in the log, Reddit returned a 403 (blocked) response. The Actor then uses Apify proxies and continues; once it switches to residential, it stays on residential for the rest of the run.

**Can I scrape private subreddits?**\
No. Only publicly available content is accessible.

***

### Support and Feedback

Use the Actor’s **Issues** or **Reviews** on Apify to report bugs and request features.

***

### Cautions

- Data is collected only from **publicly available** Reddit content.
- No private accounts or password-protected content are accessed.
- You are responsible for compliance with applicable laws (e.g. privacy, data protection, spam).

### What are other Reddit scraping tools?

If you want to scrape specific Reddit data, you can use any of the dedicated scrapers below for faster and more targeted results.

| Scraper Name | Scraper Name |
|---|---|
| [Reddit Comments Scraper](https://apify.com/simpleapi/reddit-comments-scraper) | [Reddit Scraper](https://apify.com/simpleapi/reddit-scraper) |
| [Reddit Email Scraper](https://apify.com/simpleapi/reddit-email-scraper) | [Reddit Subreddit Members Scraper](https://apify.com/simpleapi/reddit-subreddit-members-scraper) |
| [Reddit Lead Scraper](https://apify.com/simpleapi/reddit-lead-scraper) | [Reddit Trends Scraper](https://apify.com/simpleapi/reddit-trends-scraper) |
| [Reddit Phone Number Scraper](https://apify.com/simpleapi/reddit-phone-number-scraper) | [Reddit User Profile Posts And Comments Scraper](https://apify.com/simpleapi/reddit-user-profile-posts-and-comments-scraper) |
| [Reddit Posts Scraper](https://apify.com/simpleapi/reddit-posts-scraper) |  |

# Actor input Schema

## `searchKeywords` (type: `array`):

Enter the words or phrases you want to search on Reddit (e.g. webscraping, python, ChatGPT, AI). You can add multiple keywords — use + Add or Bulk edit. Each keyword is searched across Reddit and results are grouped by keyword in the output. One keyword per line in bulk mode.

## `subredditNames` (type: `array`):

Optional. Restrict search to specific subreddits only (e.g. python, programming, learnprogramming). Leave empty to search all of Reddit. Add multiple with + Add or Bulk edit. Improves relevance when you know which communities to target.

## `resultsLimitPerKeyword` (type: `integer`):

Maximum number of posts to fetch per keyword (1–1000). Higher values may take longer and can trigger rate limits; the Actor uses delays and retries to stay safe. Start with 10–50 for testing. Default: 10.

## `sorting` (type: `string`):

How Reddit search results are ordered: New (latest first), Hot (trending now), Top (most upvoted), or Relevance (best match to your keyword). Default: New.

## `proxyConfiguration` (type: `object`):

By default, no proxy is used (direct requests to Reddit). If Reddit blocks or rate-limits the request, the Actor automatically falls back to datacenter proxy, then to residential proxy (with up to 3 retries). Once it switches to residential, it stays on residential for all remaining requests. Turn on Apify Proxy here if you want to use a proxy from the very first request.
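
The fallback order described above can be pictured as a small state machine. This is an illustrative sketch only, not the Actor's source code: `FallbackFetcher`, `Blocked`, and `fake_fetch` are hypothetical names, the retry count is read as applying to the residential tier (an assumption), and the delays between attempts are left out.

```python
from typing import Callable

# Tiers in the order the description gives: direct, then datacenter, then residential.
TIERS = ["direct", "datacenter", "residential"]


class Blocked(Exception):
    """Raised by the (hypothetical) fetch function when a request is blocked."""


class FallbackFetcher:
    def __init__(self, fetch: Callable[[str, str], str], residential_retries: int = 3):
        self._fetch = fetch          # fetch(url, tier) -> body; raises Blocked on a block
        self._tier_index = 0         # index into TIERS; only ever moves forward
        self._residential_retries = residential_retries

    @property
    def tier(self) -> str:
        return TIERS[self._tier_index]

    def get(self, url: str) -> str:
        while True:
            # The last tier gets extra attempts; earlier tiers get a single try.
            is_last = self._tier_index == len(TIERS) - 1
            attempts = 1 + self._residential_retries if is_last else 1
            for _ in range(attempts):
                try:
                    return self._fetch(url, self.tier)
                except Blocked:
                    continue
            if is_last:
                raise Blocked(f"still blocked on {self.tier} after {attempts} attempts")
            self._tier_index += 1    # escalate and stay escalated for later requests


def fake_fetch(url: str, tier: str) -> str:
    # Stand-in for a real HTTP call: pretend only residential gets through.
    if tier != "residential":
        raise Blocked(tier)
    return f"ok via {tier}"


fetcher = FallbackFetcher(fake_fetch)
first = fetcher.get("https://example.com/search?q=webscraping")
second = fetcher.get("https://example.com/search?q=python")
print(first, "|", fetcher.tier)
```

Keeping the tier index on the fetcher rather than per request is what makes the switch "sticky": the second call starts on residential without retrying the cheaper tiers.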

## Actor input object example

```json
{
  "searchKeywords": [
    "webscraping"
  ],
  "resultsLimitPerKeyword": 10,
  "sorting": "new",
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```
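
When the input is assembled in code, it can help to check it against the constraints listed above before starting a run. The sketch below mirrors those constraints (required keywords, a limit of 1 to 1000, and the four sorting values); `build_input` is a hypothetical helper, not part of any Apify client, and trimming blank keywords is an added convenience rather than documented behavior.

```python
from typing import Any, Dict, List, Optional

ALLOWED_SORTING = {"new", "hot", "top", "relevance"}


def build_input(
    keywords: List[str],
    limit: int = 10,
    sorting: str = "new",
    subreddits: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Return an input object for the Actor, raising ValueError on invalid values."""
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("searchKeywords is required and needs at least one keyword")
    if not 1 <= limit <= 1000:
        raise ValueError("resultsLimitPerKeyword must be between 1 and 1000")
    if sorting not in ALLOWED_SORTING:
        raise ValueError(f"sorting must be one of {sorted(ALLOWED_SORTING)}")

    run_input: Dict[str, Any] = {
        "searchKeywords": cleaned,
        "resultsLimitPerKeyword": limit,
        "sorting": sorting,
        "proxyConfiguration": {"useApifyProxy": False},
    }
    if subreddits:
        # Only include the optional field when subreddits were actually given.
        run_input["subredditNames"] = list(subreddits)
    return run_input


print(build_input(["webscraping", "python"], limit=25, sorting="top"))
```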

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchKeywords": [
        "webscraping"
    ],
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("simpleapi/reddit-api-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchKeywords": ["webscraping"],
    "proxyConfiguration": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("simpleapi/reddit-api-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchKeywords": [
    "webscraping"
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}' |
apify call simpleapi/reddit-api-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=simpleapi/reddit-api-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit API Scraper",
        "description": "Reddit API Scraper collects data from Reddit posts, comments, and subreddits using API-based extraction. Gather post titles, text, usernames, scores, timestamps, and engagement metrics to analyze trends, monitor discussions, or build datasets for research, marketing, and insights. 📊💬",
        "version": "0.1",
        "x-build-id": "vU9FFeJW8avn9xP2l"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/simpleapi~reddit-api-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-simpleapi-reddit-api-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/simpleapi~reddit-api-scraper/runs": {
            "post": {
                "operationId": "runs-sync-simpleapi-reddit-api-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/simpleapi~reddit-api-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-simpleapi-reddit-api-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "searchKeywords"
                ],
                "properties": {
                    "searchKeywords": {
                        "title": "🔑 Search keywords (required)",
                        "type": "array",
                        "description": "Enter the words or phrases you want to search on Reddit (e.g. webscraping, python, ChatGPT, AI). You can add multiple keywords — use + Add or Bulk edit. Each keyword is searched across Reddit and results are grouped by keyword in the output. One keyword per line in bulk mode.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "subredditNames": {
                        "title": "📌 Subreddit names",
                        "type": "array",
                        "description": "Optional. Restrict search to specific subreddits only (e.g. python, programming, learnprogramming). Leave empty to search all of Reddit. Add multiple with + Add or Bulk edit. Improves relevance when you know which communities to target.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "resultsLimitPerKeyword": {
                        "title": "📊 Results limit per keyword",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of posts to fetch per keyword (1–1000). Higher values may take longer and can trigger rate limits; the actor uses delays and retries to stay safe. Start with 10–50 for testing. Default: 10.",
                        "default": 10
                    },
                    "sorting": {
                        "title": "📅 Sorting",
                        "enum": [
                            "new",
                            "hot",
                            "top",
                            "relevance"
                        ],
                        "type": "string",
                        "description": "How Reddit search results are ordered: New (latest first), Hot (trending now), Top (most upvoted), or Relevance (best match to your keyword). Default: New.",
                        "default": "new"
                    },
                    "proxyConfiguration": {
                        "title": "🌐 Proxy configuration",
                        "type": "object",
                        "description": "By default no proxy is used (direct requests to Reddit). If Reddit blocks or rate-limits the request, the actor automatically falls back to datacenter proxy, then to residential proxy (with up to 3 retries). Once it switches to residential, it stays on residential for all remaining requests. Turn on Apify Proxy here if you want to use proxy from the very first request."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
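
For projects that call the REST API directly instead of using a client library, the `run-sync-get-dataset-items` path from the specification above can be turned into a request as sketched here. This uses only the Python standard library; `build_request` is a hypothetical helper, and the actual network call is left commented out so that running the snippet does not start a billable run.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/simpleapi~reddit-api-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request: token in the query string, Actor input as JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", {"searchKeywords": ["webscraping"]})
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())
```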
