Stack Overflow Scraper - Questions, Answers & Comments

Pricing

from $0.30 / 1,000 results

Scrape questions, answers, and comments from Stack Overflow and the Stack Exchange network. Filter by tag, search, or user. Returns body, score, votes, accepted-answer flag. Built for AI/LLM training datasets, dev research, and tag-trend monitoring.

Developer

NIJ KANANI

Maintained by Community

🟧 Stack Overflow Scraper

Scrape Stack Overflow and the entire Stack Exchange network — Server Fault, Super User, Math.SE, AskUbuntu, Data Science, etc. Pull questions by tag, search, user, or top-of-period. Optionally fetch full answers and comments.

🎯 The #1 source of high-quality technical Q&A on the internet — perfect for AI/LLM training, sentiment analysis, and dev research.


✨ What you can do

  • 🏷️ Tag-based pulls (python, machine-learning, react-native, etc.)
  • 🔍 Full-text search — Stack Exchange's /search/advanced
  • 👤 User activity — fetch any user's questions
  • 🔥 Top of period — top of week/month/all-time
  • 📅 Date range filter (fromDate / toDate)
  • 🎯 Min score filter — only quality content
  • 💬 Optionally fetch answers + comments for each question
  • 🌐 Works on any Stack Exchange site (site parameter)

🚀 Quick start

{
  "site": "stackoverflow",
  "mode": "tag",
  "tags": ["python", "machine-learning"],
  "sort": "votes",
  "fromDate": "2026-01-01",
  "minScore": 5,
  "includeAnswers": true,
  "includeComments": false,
  "maxItems": 500
}
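
The same input can be submitted from a script instead of the Apify Console. The following is a minimal sketch, assuming the official `apify-client` Python package is installed. `ACTOR_ID` and the `APIFY_TOKEN` environment variable are placeholders that do not come from this page, and `build_run_input` / `run_scraper` are hypothetical helper names.

```python
import os

# Placeholder: copy the real actor ID from the actor's page in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(tags, from_date, min_score=5, max_items=500):
    """Return a run input that mirrors the Quick start JSON."""
    return {
        "site": "stackoverflow",
        "mode": "tag",
        "tags": list(tags),
        "sort": "votes",
        "fromDate": from_date,
        "minScore": min_score,
        "includeAnswers": True,
        "includeComments": False,
        "maxItems": max_items,
    }


def run_scraper(run_input):
    """Start the actor, wait for it to finish, and return the dataset items."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (needs network access and a valid token):
# items = run_scraper(build_run_input(["python", "machine-learning"], "2026-01-01"))
```

Keeping the input builder separate from the network call makes it easy to check the payload before spending any quota.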

📥 Input

| Field | Description |
| --- | --- |
| site | Stack Exchange site (stackoverflow, serverfault, superuser, math, datascience, ...) |
| mode | tag / search / user / top |
| tags | Tag names (mode = tag) |
| searchQueries | Free-text queries (mode = search) |
| userIds | Numeric Stack Exchange user IDs (mode = user) |
| sort | activity / creation / votes / hot / week / month |
| fromDate, toDate | ISO date range filter |
| minScore | Skip questions below this score |
| includeAnswers | Fetch all answers per question |
| includeComments | Fetch comments on questions (and on answers if includeAnswers is set) |
| maxItems | Cap per target |
| apiKey | (optional) Free key from https://stackapps.com; raises the daily quota from 300 to 10,000 requests |
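
Because each mode reads a different target field, a quick client-side check can catch a mismatched input before a run starts. This is only a sketch built from the table above: `check_input` is a hypothetical helper, not part of the actor, and the assumption that `top` mode needs no target list is not confirmed by this page.

```python
# Which target field each mode reads, per the input table.
MODE_REQUIRED_FIELD = {
    "tag": "tags",
    "search": "searchQueries",
    "user": "userIds",
    "top": None,  # assumption: "top" mode needs no target list
}

ALLOWED_SORTS = {"activity", "creation", "votes", "hot", "week", "month"}


def check_input(run_input):
    """Return a list of problems found in a run input (empty if none)."""
    problems = []
    mode = run_input.get("mode")
    if mode not in MODE_REQUIRED_FIELD:
        problems.append(f"unknown mode: {mode!r}")
        return problems

    required = MODE_REQUIRED_FIELD[mode]
    if required and not run_input.get(required):
        problems.append(f"mode {mode!r} needs a non-empty {required!r}")

    sort = run_input.get("sort")
    if sort is not None and sort not in ALLOWED_SORTS:
        problems.append(f"unknown sort: {sort!r}")
    return problems
```

For example, `check_input({"mode": "search"})` reports that `searchQueries` is missing, while the Quick start input passes with no problems.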

📤 Output (per question)

{
  "site": "stackoverflow",
  "type": "question",
  "questionId": 12345678,
  "title": "How to do X in Python?",
  "body": "<p>HTML body</p>",
  "bodyMarkdown": "Markdown body",
  "tags": ["python", "machine-learning"],
  "score": 42,
  "viewCount": 9876,
  "answerCount": 3,
  "isAnswered": true,
  "acceptedAnswerId": 99999,
  "creationDate": "2026-04-15T...",
  "owner": { "userId": 123, "displayName": "user", "reputation": 50000, "profileUrl": "..." },
  "link": "https://stackoverflow.com/questions/12345678/...",
  "answers": [
    {
      "answerId": 99999,
      "body": "<p>Answer HTML</p>",
      "score": 78,
      "isAccepted": true,
      "creationDate": "...",
      "owner": { "userId": 456, "displayName": "expert", "reputation": 100000 }
    }
  ],
  "comments": [
    { "commentId": 555, "body": "Comment text", "score": 5, "creationDate": "...", "owner": {...} }
  ]
}
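
For dataset building, a record shaped like the one above can be reduced to a question/accepted-answer pair. The sketch below assumes the field names shown in the sample output and that the run used `includeAnswers`. `to_qa_pair` is a hypothetical helper, and the answer body is kept as HTML because the sample shows no Markdown field on answers.

```python
def to_qa_pair(question):
    """Return a prompt/response dict for the accepted answer, or None."""
    accepted_id = question.get("acceptedAnswerId")
    for answer in question.get("answers", []):
        is_accepted = answer.get("isAccepted") or (
            accepted_id is not None and answer.get("answerId") == accepted_id
        )
        if is_accepted:
            return {
                "questionId": question["questionId"],
                # Title plus Markdown body forms the prompt text.
                "prompt": f"{question['title']}\n\n{question.get('bodyMarkdown', '')}".strip(),
                # Answer body is HTML in the sample output.
                "response": answer["body"],
                "answerScore": answer.get("score", 0),
            }
    return None  # no accepted answer, or answers were not fetched
```

Questions without an accepted answer return `None`, so a list comprehension with a filter yields only complete pairs.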

🎯 Use cases

| Who | Why |
| --- | --- |
| 🤖 AI / LLM teams | Best-in-class technical Q&A for fine-tuning code/expert models |
| 📊 Dev relations | Track which language/framework tags are heating up |
| 🎓 Researchers | Code-discussion datasets, error-pattern analysis |
| 🛠️ Tool builders | Mine common pain points around your stack |
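
As one illustration of the tag-tracking use case, scraped questions can be bucketed by month and counted per tag. This sketch assumes `creationDate` is an ISO 8601 string starting with `YYYY-MM`, as the sample output suggests. `tag_counts_by_month` is a hypothetical helper name.

```python
from collections import Counter


def tag_counts_by_month(questions):
    """Map "YYYY-MM" to a Counter of tag occurrences in that month."""
    counts = {}
    for question in questions:
        month = question["creationDate"][:7]  # "YYYY-MM" prefix of the ISO timestamp
        counts.setdefault(month, Counter()).update(question.get("tags", []))
    return counts
```

Comparing `counts[month].most_common(10)` across consecutive months gives a rough view of which tags are gaining activity.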

⚙️ Tech notes

  • Uses Stack Exchange API v2.3
  • Without API key: 300 requests/day quota (per IP)
  • With free API key: 10,000 requests/day — strongly recommended for large pulls
  • Automatically honors the backoff field when the API asks the scraper to slow down
  • Includes both the HTML body and the Markdown source (body_markdown in the API, bodyMarkdown in the output)
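
For readers calling the Stack Exchange API directly, the backoff behavior noted above can be sketched as follows. This is not the actor's code. It relies only on the API's documented response wrapper fields (`items`, `has_more`, `backoff`, `quota_remaining`). `fetch_page` is a hypothetical callable that performs one request, for example against `https://api.stackexchange.com/2.3/questions`, and returns the decoded JSON.

```python
import time


def fetch_all_pages(fetch_page, max_pages=10, sleep=time.sleep):
    """Collect items page by page, pausing when the API returns `backoff`.

    `fetch_page(page)` must return the decoded JSON wrapper for that page.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    items = []
    for page in range(1, max_pages + 1):
        wrapper = fetch_page(page)
        items.extend(wrapper.get("items", []))

        # Stop when there is nothing more to fetch or the daily quota is spent.
        if not wrapper.get("has_more") or wrapper.get("quota_remaining", 1) <= 0:
            break

        # `backoff` is the number of seconds to wait before the next request.
        backoff = wrapper.get("backoff")
        if backoff:
            sleep(backoff)
    return items
```

Injecting `fetch_page` and `sleep` keeps the pagination logic independent of any particular HTTP library.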

❓ FAQ

Where do I get an API key? Free at https://stackapps.com/apps/oauth/register — fill in any name, leave most fields blank. Takes 30 seconds.

Are deleted/closed questions included? The API skips deleted; closed questions are returned but flagged in the data.

Can I schedule it? Yes. Daily pulls of new questions on your favorite tags are a perfect Apify Schedule use case.
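
A daily pull needs a date window that moves with the calendar. One way to produce it, as a sketch only, is to compute the input in a small script that starts the run. `daily_input` is a hypothetical helper, the `YYYY-MM-DD` format follows the Quick start example, and whether `toDate` is inclusive is not stated on this page.

```python
from datetime import date, timedelta


def daily_input(tags, today=None):
    """Build a run input covering yesterday through today."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    return {
        "site": "stackoverflow",
        "mode": "tag",
        "tags": list(tags),
        "sort": "creation",
        "fromDate": yesterday.isoformat(),  # e.g. "2026-04-14"
        "toDate": today.isoformat(),
        "includeAnswers": False,
        "maxItems": 500,
    }
```

A schedule with a fixed saved input would instead omit the date fields and rely on `sort` and `maxItems` to pick up the newest questions.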