# Reddit Posts Search Scraper (`vulnv/reddit-posts-search-scraper`) Actor

Search and scrape Reddit posts by keyword. Extract detailed post data, comments, scores, timestamps, and metadata for research and analysis.

- **URL:** https://apify.com/vulnv/reddit-posts-search-scraper.md
- **Developed by:** [VulnV](https://apify.com/vulnv) (community)
- **Categories:** Lead generation, Social media, News
- **Stats:** 519 total users, 133 monthly users, 100.0% runs succeeded, 12 bookmarks
- **User rating:** 5.00 out of 5 stars

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan gets higher.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Reddit Search Scraper: Extract Reddit Data by Keywords

### Overview
The **Reddit Search Scraper** is a powerful tool designed to extract detailed post data from Reddit using keyword-based searches. Whether you're conducting sentiment analysis, monitoring trends, or gathering insights for research, this scraper enables efficient and accurate data collection across all of Reddit using targeted keywords.

### Features
- **Comprehensive Data Extraction**: Retrieve essential details such as:
  - Post title
  - Author
  - Creation timestamp
  - Number of comments
  - Score (upvotes minus downvotes)
  - Permalink
  - Image and thumbnail URLs (if available)
  - Post body content
  - Comments

- **Flexible Input Parameters**: Customize your scraping with options like:
  - Search keyword or phrase
  - Post limit (1-1000)
  - Sort method (relevance, hot, top, new, comments)
  - Time filter (hour, day, week, month, year, all)

- **Structured Output**: Export the collected data in JSON format for seamless integration with analytical tools.

### Usage
#### Input Configuration
The scraper accepts the following input parameters:

1. **Search Keyword**:
   - Specify the keyword or phrase to search for across Reddit.
   - Example:
     ```json
     {
       "keyword": "artificial intelligence",
       "limit": 25,
       "sort": "relevance",
       "time_filter": "week"
     }
     ```

2. **Limit**:
   - Set the number of posts to scrape (minimum: 1, maximum: 1000, default: 25).

3. **Sort**:
   - Choose from "relevance", "hot", "top", "new", or "comments".

4. **Time Filter**:
   - Specify a time range for filtering posts (hour, day, week, month, year, all).
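
The constraints listed above can be checked on the client before a run is started. The following is a minimal sketch, not part of the Actor itself: the helper name `build_input` is hypothetical, and the defaults for `sort` and `time_filter` are taken from the OpenAPI specification at the end of this document.

```python
ALLOWED_SORTS = {"relevance", "hot", "top", "new", "comments"}
ALLOWED_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def build_input(keyword, limit=25, sort="relevance", time_filter="week"):
    """Return an Actor input dict, raising ValueError for out-of-range values.

    Hypothetical helper: it mirrors the documented limits but is not shipped with the Actor.
    """
    if not isinstance(keyword, str) or not keyword.strip():
        raise ValueError("keyword must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 1000:
        raise ValueError("limit must be an integer between 1 and 1000")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    if time_filter not in ALLOWED_TIME_FILTERS:
        raise ValueError(f"time_filter must be one of {sorted(ALLOWED_TIME_FILTERS)}")
    return {"keyword": keyword, "limit": limit, "sort": sort, "time_filter": time_filter}


run_input = build_input("artificial intelligence", limit=50, sort="top")
```

Validating locally only catches mistakes earlier; the platform still validates the input against the Actor's own schema.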


#### Output Data
The scraper collects the following information for each post:
- Title
- Author
- Created UTC timestamp
- Number of comments
- Score (upvotes minus downvotes)
- Permalink (direct link to the post)
- Subreddit name
- Post URL
- Image URL (if available)
- Thumbnail URL (if available)
- Post body text (selftext)
- Search keyword used
- Detailed comments with nested replies

The output is stored in JSON format, allowing easy data processing.
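
As an illustration of downstream processing, the sketch below converts a post's `created_utc` value to an ISO 8601 string and turns the relative `permalink` into an absolute URL. It assumes that `created_utc` is a Unix timestamp in seconds and that permalinks are relative to `https://www.reddit.com`, as the output example in this document suggests. The function name and the added keys `created_iso` and `permalink_url` are hypothetical and do not appear in the Actor's output.

```python
from datetime import datetime, timezone

# Assumption: permalinks returned by the Actor are relative to this host.
REDDIT_BASE_URL = "https://www.reddit.com"


def normalize_post(post):
    """Return a copy of a post with an ISO 8601 timestamp and an absolute permalink."""
    normalized = dict(post)
    created = post.get("created_utc")
    if created is not None:
        # Interpret the value as seconds since the Unix epoch, in UTC
        normalized["created_iso"] = datetime.fromtimestamp(created, tz=timezone.utc).isoformat()
    permalink = post.get("permalink") or ""
    if permalink.startswith("/"):
        normalized["permalink_url"] = REDDIT_BASE_URL + permalink
    return normalized


example = normalize_post({
    "created_utc": 1704067200,
    "permalink": "/r/MachineLearning/comments/1abc234/the_impact_of_artificial_intelligence_on_modern/",
})
print(example["created_iso"])  # 2024-01-01T00:00:00+00:00
```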

### Quick Start
1. Open the scraper in [Apify Store](https://apify.com).
2. Set the input parameters (`keyword`, `limit`, `sort`, `time_filter`).
3. Run the scraper and download the output in JSON format.

### Example Use Cases
- **Keyword Trend Monitoring**: Track discussions about specific topics, products, or events across all of Reddit.
- **Brand Monitoring**: Monitor mentions of your brand, products, or competitors across Reddit communities.
- **Sentiment Analysis**: Analyze user sentiment about specific topics based on post and comment content.
- **Research & Data Collection**: Gather comprehensive data about specific subjects for academic or market research.
- **Content Discovery**: Find relevant content and discussions related to your interests or business domain.

### Output Storage
The scraper stores results in a structured dataset, allowing easy access to:
- Title
- URL
- Post metadata
- Comments
- Images (if applicable)

### Output Example
```json
{
  "title": "The Impact of Artificial Intelligence on Modern Healthcare",
  "author": "healthtech_researcher",
  "permalink": "/r/MachineLearning/comments/1abc234/the_impact_of_artificial_intelligence_on_modern/",
  "score": 847,
  "num_comments": 156,
  "created_utc": 1704067200,
  "subreddit": "MachineLearning",
  "url": "https://www.reddit.com/r/MachineLearning/comments/1abc234/the_impact_of_artificial_intelligence_on_modern/",
  "selftext": "Recent advances in AI have revolutionized diagnostic imaging, drug discovery, and personalized treatment protocols. This comprehensive analysis explores the current state and future potential of AI applications in healthcare settings.",
  "keyword": "artificial intelligence",
  "image_url": "https://preview.redd.it/ai_healthcare_chart_abc123.png?width=960&crop=smart&auto=webp&s=1234567890abcdef",
  "thumbnail_url": "https://b.thumbs.redditmedia.com/ai_healthcare_thumb_abc123.jpg",
  "comments": [
    {
      "author": "medical_ai_expert",
      "body": "This is an excellent overview. I've been working in medical AI for 5 years and can confirm the transformative impact on radiology workflows. The accuracy improvements in detecting early-stage cancers are remarkable.",
      "score": 234,
      "replies": [
        {
          "author": "curious_student",
          "body": "Could you elaborate on specific accuracy improvements? I'm particularly interested in mammography screening applications.",
          "score": 67,
          "replies": [
            {
              "author": "medical_ai_expert",
              "body": "Certainly! Recent studies show AI-assisted mammography reduces false positives by 40% while maintaining 99.2% sensitivity. The Google DeepMind collaboration with Cancer Research UK demonstrated particularly promising results.",
              "score": 89,
              "replies": []
            }
          ]
        },
        {
          "author": "radiologist_practitioner",
          "body": "As a practicing radiologist, I can attest to these improvements. Our department implemented AI screening tools last year and saw a 25% reduction in reading time while improving diagnostic confidence.",
          "score": 156,
          "replies": []
        }
      ]
    },
    {
      "author": "pharma_researcher",
      "body": "The drug discovery section resonates strongly with my experience. AI has compressed our initial compound screening from 18 months to 6 months. The cost savings alone justify the technology investment.",
      "score": 178,
      "replies": [
        {
          "author": "biotech_startup",
          "body": "Which AI platforms are you using for compound screening? We're evaluating options for our pipeline.",
          "score": 43,
          "replies": [
            {
              "author": "pharma_researcher",
              "body": "We primarily use Atomwise for small molecule discovery and BenevolentAI for target identification. Both have shown excellent ROI in our trials.",
              "score": 52,
              "replies": []
            }
          ]
        }
      ]
    },
    {
      "author": "ethics_healthcare",
      "body": "While the technological advances are impressive, we must carefully consider the ethical implications of AI decision-making in healthcare. Patient consent, algorithmic bias, and the human element in medical care are crucial considerations.",
      "score": 298,
      "replies": [
        {
          "author": "medical_ethicist",
          "body": "Absolutely critical points. The FDA's recent guidance on AI/ML-based medical devices addresses some concerns, but we need more comprehensive frameworks for ethical AI deployment in clinical settings.",
          "score": 124,
          "replies": []
        },
        {
          "author": "patient_advocate",
          "body": "As someone who has benefited from AI-assisted diagnosis, I appreciate both the technology and the emphasis on ethical considerations. Patients should always understand how AI influences their care decisions.",
          "score": 97,
          "replies": []
        }
      ]
    }
  ]
}
````
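
The nested `replies` arrays in the example above can be walked recursively when a flat list of comments is easier to analyze. This is a minimal sketch that assumes each comment has the `author`, `body`, `score`, and `replies` keys shown in the example; the function name `flatten_comments` is hypothetical.

```python
def flatten_comments(comments, depth=0):
    """Yield (depth, author, score, body) for every comment, depth-first."""
    for comment in comments:
        yield depth, comment.get("author"), comment.get("score"), comment.get("body")
        # A missing or empty "replies" value ends the recursion for this branch
        yield from flatten_comments(comment.get("replies") or [], depth + 1)


sample_comments = [
    {
        "author": "medical_ai_expert",
        "body": "This is an excellent overview.",
        "score": 234,
        "replies": [
            {"author": "curious_student", "body": "Could you elaborate?", "score": 67, "replies": []},
        ],
    },
    {"author": "pharma_researcher", "body": "The drug discovery section resonates.", "score": 178, "replies": []},
]

for depth, author, score, body in flatten_comments(sample_comments):
    print("  " * depth + f"{author} ({score}): {body}")
```

The `depth` value keeps the thread structure recoverable after flattening, for example to weight top-level comments differently from replies in a sentiment analysis.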

### Explore More Actors

✨ **Looking for additional solutions?** Check out more Actors on Apify that can help with your web automation and data extraction needs. Discover a wide range of tools tailored for different scenarios at 🌐 [Explore VulnV's Actors on Apify](https://apify.com/vulnv).

📧 For inquiries or support, feel free to reach out to us at apify@vulnv.com.

# Actor input Schema

## `keyword` (type: `string`):

The keyword or phrase to search for in Reddit posts

## `limit` (type: `integer`):

The maximum number of posts to scrape

## `sort` (type: `string`):

How to sort the search results

## `time_filter` (type: `string`):

Filter posts by time period

## Actor input object example

```json
{
  "keyword": "artificial intelligence",
  "limit": 25,
  "sort": "relevance",
  "time_filter": "week"
}
```

# Actor output Schema

## `results` (type: `string`):

Reddit posts matching the search keyword, including post metadata and comments.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

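// Prepare Actor input (`keyword` is required by the input schema)
const input = {
    "keyword": "artificial intelligence",
    "limit": 25,
    "sort": "relevance",
    "time_filter": "week"
};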

// Run the Actor and wait for it to finish
const run = await client.actor("vulnv/reddit-posts-search-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

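# Prepare the Actor input (`keyword` is required by the input schema)
run_input = {
    "keyword": "artificial intelligence",
    "limit": 25,
    "sort": "relevance",
    "time_filter": "week",
}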

# Run the Actor and wait for it to finish
run = client.actor("vulnv/reddit-posts-search-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"keyword": "artificial intelligence", "limit": 25, "sort": "relevance", "time_filter": "week"}' |
apify call vulnv/reddit-posts-search-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=vulnv/reddit-posts-search-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Reddit Posts Search Scraper",
        "description": "Search and scrape Reddit posts by keyword. Extract detailed post data, comments, scores, timestamps, and metadata for research and analysis.",
        "version": "1.0",
        "x-build-id": "YMdo2i0RXMhmrLbkt"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/vulnv~reddit-posts-search-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-vulnv-reddit-posts-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/vulnv~reddit-posts-search-scraper/runs": {
            "post": {
                "operationId": "runs-sync-vulnv-reddit-posts-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/vulnv~reddit-posts-search-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-vulnv-reddit-posts-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "keyword"
                ],
                "properties": {
                    "keyword": {
                        "title": "Search Keyword",
                        "minLength": 1,
                        "type": "string",
                        "description": "The keyword or phrase to search for in Reddit posts",
                        "default": "artificial intelligence"
                    },
                    "limit": {
                        "title": "Post Limit",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "The maximum number of posts to scrape",
                        "default": 25
                    },
                    "sort": {
                        "title": "Sort By",
                        "enum": [
                            "relevance",
                            "hot",
                            "top",
                            "new",
                            "comments"
                        ],
                        "type": "string",
                        "description": "How to sort the search results",
                        "default": "relevance"
                    },
                    "time_filter": {
                        "title": "Time Filter",
                        "enum": [
                            "hour",
                            "day",
                            "week",
                            "month",
                            "year",
                            "all"
                        ],
                        "type": "string",
                        "description": "Filter posts by time period",
                        "default": "week"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
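
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above could be used roughly as follows. This is a sketch using only the Python standard library, under a few assumptions: the function names are hypothetical, the 300-second timeout is an arbitrary choice, and the response body is assumed to be a JSON array of dataset items (the specification above only documents the response as "OK").

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/vulnv~reddit-posts-search-scraper/run-sync-get-dataset-items"


def build_run_request(token, actor_input):
    """Build (but do not send) the POST request for the synchronous run endpoint."""
    # The specification passes the API token as the `token` query parameter
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, actor_input, timeout=300):
    """Send the request and decode the response, assuming a JSON body."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_run_request("<YOUR_API_TOKEN>", {"keyword": "artificial intelligence", "limit": 25})
print(request.get_method(), request.full_url)
```

Calling `run_actor` blocks until the run finishes, so the asynchronous `/runs` endpoint may be a better fit for large `limit` values.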
