Pricing

from $2.99 / 1,000 results

Reddit Comments Search Scraper

Rating

0.0

(0)

Developer

SimpleAPI

Last modified: a day ago

🔍 Reddit Search Scraper

Scrape Reddit search results and subreddit listings at scale. Paste any Reddit URL (search, subreddit, or subreddit search) and the actor pulls clean, structured records from public Reddit data archives, with no Reddit login or API key required, and saves each post to the dataset as soon as it is scraped.

ℹ️ How it works: Reddit shut down unauthenticated access to its public .json endpoints. This actor instead reads from two public Reddit data archives — PullPush (primary, full-text + subreddit search) and Arctic Shift (fallback for subreddit/author queries) — so it keeps working without you registering a Reddit OAuth app.

💡 Built for marketers, researchers, AI/LLM data pipelines, and competitive-intelligence teams who need clean, structured Reddit data without scraping headaches.

✨ Why choose this Actor?

🚀 Fast — pure async HTTP, no headless browser overhead.
🔓 No credentials needed — reads public Reddit archives, so there's no OAuth app, client ID, or rate-limited Reddit key to manage.
🛡️ Smart proxy ladder — starts direct, auto-falls-back to datacenter → residential if an archive rate-limits the request IP, and stays on residential once it kicks in.
🔁 Resilient — per-request retries with jittered backoff, and 3 retries on the residential tier before giving up.
💾 Live saving — every post is pushed to the dataset as it's scraped, so a mid-run crash never loses work.
🧱 Bulk URLs — feed it any number of Reddit URLs in one run.
📊 Pre-built dataset views — Overview, Post, Subreddit, Author, Content, and Full Record tabs in the Apify Console.

🎯 Key features

🌐 Bulk URL input (search URLs, subreddit URLs, subreddit search URLs)
🔎 Optional keyword fallback when no URLs are supplied
📊 Sort by Relevance / Hot / Top / New / Most Comments
🔞 Safe-search toggle
📦 Hard cap on total items via maxItems
🛡️ Default no-proxy, auto-escalating fallback ladder
📝 Detailed real-time logs so you can watch progress live
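The three accepted URL types (search, subreddit, subreddit search) can be told apart from the URL path alone. The following is a minimal sketch of one way to do that, not the actor's actual parser; the function name `classify_reddit_url` and the returned keys are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def classify_reddit_url(url: str) -> dict:
    """Classify a Reddit URL as 'search', 'subreddit', or 'subreddit_search'.

    Hypothetical helper: the return shape is an assumption made for
    illustration, not the actor's internal representation.
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    params = parse_qs(parsed.query)
    query = params.get("q", [None])[0]
    sort = params.get("sort", [None])[0]

    # /r/<name>/ or /r/<name>/search/
    if len(parts) >= 2 and parts[0] == "r":
        is_search = len(parts) >= 3 and parts[2] == "search"
        return {
            "kind": "subreddit_search" if is_search else "subreddit",
            "subreddit": parts[1],
            "query": query if is_search else None,
            "sort": sort,
        }

    # /search/?q=...
    if parts and parts[0] == "search":
        return {"kind": "search", "subreddit": None, "query": query, "sort": sort}

    raise ValueError(f"Unsupported Reddit URL: {url}")
```

For example, `classify_reddit_url("https://www.reddit.com/r/python/")` yields a record with `kind` set to `"subreddit"` and `subreddit` set to `"python"`.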

📥 Input

{
  "urls": [
    { "url": "https://www.reddit.com/search/?q=ai&sort=new" },
    { "url": "https://www.reddit.com/r/python/" }
  ],
  "query": "artificial intelligence",
  "sort": "relevance",
  "safeSearch": "off",
  "maxItems": 300,
  "maxRetries": 3,
  "proxyConfiguration": { "useApifyProxy": false }
}

Field	Type	Description
`urls`	array	Reddit URLs to scrape (search, subreddit, or subreddit search).
`query`	string	Keyword fallback used only when `urls` is empty.
`sort`	enum	`relevance` / `hot` / `top` / `new` / `comments`.
`safeSearch`	enum	`off` (include NSFW) or `on` (hide NSFW).
`maxItems`	integer	Hard cap on total posts across all URLs.
`maxRetries`	integer	Per-request retries before escalating proxy tier.
`proxyConfiguration`	object	Standard Apify proxy input. Defaults to no proxy.
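If you generate the input programmatically, a small builder can catch invalid enum values before a run starts. This is a sketch under stated assumptions: the function name `build_actor_input` and the numeric range checks are illustrative, and the actor's own input schema remains authoritative.

```python
import json

ALLOWED_SORTS = ("relevance", "hot", "top", "new", "comments")
ALLOWED_SAFE_SEARCH = ("off", "on")


def build_actor_input(urls=(), query="", sort="relevance",
                      safe_search="off", max_items=300, max_retries=3):
    """Build an input object shaped like the JSON example above.

    The validation rules here are assumptions for illustration only.
    """
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {ALLOWED_SORTS}")
    if safe_search not in ALLOWED_SAFE_SEARCH:
        raise ValueError(f"safeSearch must be one of {ALLOWED_SAFE_SEARCH}")
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")
    if max_retries < 0:
        raise ValueError("maxRetries must not be negative")
    if not urls and not query:
        # `query` is only a fallback, so one of the two must be present.
        raise ValueError("provide at least one URL or a query")

    return {
        "urls": [{"url": u} for u in urls],
        "query": query,
        "sort": sort,
        "safeSearch": safe_search,
        "maxItems": max_items,
        "maxRetries": max_retries,
        "proxyConfiguration": {"useApifyProxy": False},  # default: no proxy
    }


if __name__ == "__main__":
    example = build_actor_input(urls=["https://www.reddit.com/r/python/"], sort="new")
    print(json.dumps(example, indent=2))
```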

📤 Output

Each dataset record keeps the nested reference shape (`post`, `subreddit`, `author`) and adds top-level mirror fields (`title`, `subreddit_name`, `author_name`, `score`, `comment_count`, `url`) so the table views work without nested-path lookups:

{
  "post": {
    "title": "The more young people use AI, the more they hate it",
    "url": "https://www.reddit.com/r/technology/comments/1szusu6/the_more_young_people_use_ai_the_more_they_hate_it/",
    "score": 22036,
    "comment_count": 1612
  },
  "subreddit": { "name": "technology" },
  "author":    { "name": "spherocytes" },
  "contentText": "",
  "content_type": "link",
  "created_timestamp": "2026-04-30T12:34:21.000000+0000",

  "title": "The more young people use AI, the more they hate it",
  "subreddit_name": "technology",
  "author_name": "spherocytes",
  "score": 22036,
  "comment_count": 1612,
  "url": "https://www.reddit.com/r/technology/comments/1szusu6/the_more_young_people_use_ai_the_more_they_hate_it/"
}
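The mirror fields in the record above are copies of values from the nested objects. A minimal sketch of how they could be derived follows; the helper name `add_mirror_fields` is hypothetical and this is not necessarily how the actor builds its records.

```python
def add_mirror_fields(record: dict) -> dict:
    """Return a copy of `record` with top-level mirrors of nested values.

    Hypothetical helper: field names follow the sample record above.
    """
    post = record.get("post", {})
    flat = dict(record)  # shallow copy; nested objects are kept as-is
    flat.update({
        "title": post.get("title"),
        "subreddit_name": record.get("subreddit", {}).get("name"),
        "author_name": record.get("author", {}).get("name"),
        "score": post.get("score"),
        "comment_count": post.get("comment_count"),
        "url": post.get("url"),
    })
    return flat
```

Going the other way, downstream code can read either `record["post"]["score"]` or `record["score"]`; in the sample record both hold the same value.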

🚀 How to use the Actor (via Apify Console)

🔐 Log in at console.apify.com → Actors.
🔎 Find Reddit Search Scraper and open it.
📝 Paste one or more Reddit URLs (or type a keyword in the query field).
⚙️ Pick a sort (Relevance / Hot / Top / New / Most Comments) and set maxItems.
🛡️ Leave Proxy on default (no proxy). The scraper auto-escalates if an archive rate-limits the request.
▶️ Click Start.
📊 Watch logs in real time; open the Output tab as records stream in.
📁 Export to JSON / CSV / Excel.

🛡️ Proxy strategy

The scraper uses a three-tier ladder (the archives can rate-limit a busy IP):

Tier	When it's used
🌐 Direct	Default — the archives usually serve fine without a proxy.
🏢 Datacenter	Auto-engaged if direct requests get 403 / 429 / rate-limited.
🏠 Residential	Auto-engaged if datacenter requests still fail. Once engaged, the scraper retries on this tier and stays on it for the rest of the run.

You can also start higher up the ladder by selecting a proxy group in the input.
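The ladder described above can be sketched as a small class that retries at the current tier with jittered backoff and only ever moves upward. This is an illustrative sketch, not the actor's code: the `fetch(url, tier)` callable, the class names, and the delay values are all assumptions.

```python
import random
import time

TIERS = ("direct", "datacenter", "residential")
BLOCKED_STATUSES = {403, 429}  # treated as "rate-limited" in this sketch


class AllTiersFailed(RuntimeError):
    """Raised when the residential tier is also exhausted."""


class ProxyLadder:
    def __init__(self, fetch, max_retries=3, base_delay=0.5, sleep=time.sleep):
        # `fetch(url, tier)` is a hypothetical callable returning (status, body).
        self.fetch = fetch
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.sleep = sleep
        self.tier_index = 0  # never decreases, so escalation is sticky

    @property
    def tier(self):
        return TIERS[self.tier_index]

    def get(self, url):
        while True:
            for attempt in range(self.max_retries):
                status, body = self.fetch(url, self.tier)
                if status not in BLOCKED_STATUSES:
                    return body
                # Exponential backoff with random jitter between attempts.
                delay = self.base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
                self.sleep(delay)
            if self.tier_index == len(TIERS) - 1:
                raise AllTiersFailed(f"all proxy tiers failed for {url}")
            self.tier_index += 1  # escalate and retry on the next tier
```

Because `tier_index` is stored on the instance and never reset, every later request in the same run starts at the highest tier reached so far.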

📊 Sort & data-source notes

Source: PullPush handles global keyword search and subreddit/author search; Arctic Shift serves subreddit- and author-scoped queries as a fast fallback. Both are public Reddit archives.
Sort mapping: Reddit's sort intents map onto the archives' sort fields:
- 🎯 Relevance / ⭐ Top / 🔥 Hot → highest score first
- 🆕 New → newest created first
- 💬 Most Comments → highest comment count first
Coverage: archives index publicly posted content; very recent posts (last few minutes) or removed content may not appear. Pagination walks backward in time, so large maxItems runs are ordered newest-to-oldest within each time window.
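The sort mapping and the backward pagination can be illustrated with a short sketch. The archive field names (`score`, `created_utc`, `num_comments`) and the `fetch_page(before)` callable are assumptions chosen for illustration; the archives' real parameter names may differ.

```python
# Reddit sort intent -> archive field to order by (descending).
# Field names are assumed, not confirmed by the archives' documentation.
SORT_FIELD = {
    "relevance": "score",
    "top": "score",
    "hot": "score",
    "new": "created_utc",
    "comments": "num_comments",
}


def sort_records(records, sort):
    """Order records the way the mapping above describes (highest first)."""
    field = SORT_FIELD[sort]
    return sorted(records, key=lambda r: r[field], reverse=True)


def paginate_backward(fetch_page, max_items):
    """Collect up to `max_items` records by walking backward in time.

    `fetch_page(before)` is a hypothetical callable that returns records
    created strictly before the `before` timestamp (or the newest records
    when `before` is None). An empty page ends the walk.
    """
    collected = []
    before = None
    while len(collected) < max_items:
        page = fetch_page(before)
        if not page:
            break
        collected.extend(page)
        # Move the cursor to the oldest record seen so far.
        before = min(r["created_utc"] for r in page)
    return collected[:max_items]
```

Note that Relevance, Top, and Hot all collapse to the same score ordering in this mapping, so they return the same order for the same query.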

💼 Best use cases

🤖 Building AI / LLM training datasets from Reddit discussion
📊 Brand monitoring & sentiment analysis
🧠 Market research and competitive intelligence
📝 Content trend discovery
🔬 Academic research on online communities

❓ Frequently asked questions

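Q: Does it scrape comments? A: No. This actor returns post-level data only (title, score, comment count, body text). For per-post comment threads, use an additional actor or extend this one to fetch <permalink>.json; note that Reddit no longer allows unauthenticated access to its .json endpoints, so that route needs authenticated requests.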

Q: Does it support private subreddits? A: No — only publicly accessible subreddits and search results.

Q: Do I need a Reddit account or API key? A: No. The actor reads public Reddit data archives, so there's nothing to register or authenticate.

Q: What happens if an archive rate-limits me? A: The scraper auto-escalates the proxy tier (direct → datacenter → residential) and retries. If every tier still fails, the run ends with a clear status message.

📨 Support and feedback

For issues, custom features, or feedback: dev.scraperengine@gmail.com

⚠️ Legal & ethical use

Only collect data from publicly accessible Reddit pages.
Respect Reddit's terms of service and applicable privacy laws (GDPR / CCPA).
The end user is responsible for downstream use of the data.
