Stack Overflow Scraper - Questions, Answers & Comments
Pricing
from $0.30 / 1,000 results
Scrape questions, answers, and comments from Stack Overflow and the Stack Exchange network. Filter by tag, search, or user. Returns body, score, votes, accepted-answer flag. Built for AI/LLM training datasets, dev research, and tag-trend monitoring.
Developer
NIJ KANANI
Maintained by Community
Last modified: a month ago
🟧 Stack Overflow Scraper
Scrape Stack Overflow and the rest of the Stack Exchange network, including Server Fault, Super User, Mathematics, Ask Ubuntu, and Data Science. Pull questions by tag, search query, user, or top-of-period ranking. Optionally fetch full answers and comments.
🎯 One of the largest sources of high-quality technical Q&A on the internet, well suited to AI/LLM training, sentiment analysis, and dev research.
✨ What you can do
- 🏷️ Tag-based pulls: `python`, `machine-learning`, `react-native`, etc.
- 🔍 Full-text search via Stack Exchange's `/search/advanced`
- 👤 User activity: fetch any user's questions
- 🔥 Top of period: top of week/month/all-time
- 📅 Date range filter: `fromDate` / `toDate`
- 🎯 Min score filter: only quality content
- 💬 Optionally fetch answers + comments for each question
- 🌐 Works on any Stack Exchange site (`site` parameter)
🚀 Quick start
{
  "site": "stackoverflow",
  "mode": "tag",
  "tags": ["python", "machine-learning"],
  "sort": "votes",
  "fromDate": "2026-01-01",
  "minScore": 5,
  "includeAnswers": true,
  "includeComments": false,
  "maxItems": 500
}
📥 Input
| Field | Description |
|---|---|
| `site` | Stack Exchange site (`stackoverflow`, `serverfault`, `superuser`, `math`, `datascience`, ...) |
| `mode` | `tag` / `search` / `user` / `top` |
| `tags` | Tag names (mode = `tag`) |
| `searchQueries` | Free-text queries (mode = `search`) |
| `userIds` | Numeric Stack Exchange user IDs (mode = `user`) |
| `sort` | `activity` / `creation` / `votes` / `hot` / `week` / `month` |
| `fromDate`, `toDate` | ISO date range filter |
| `minScore` | Skip items below this score |
| `includeAnswers` | Fetch all answers per question |
| `includeComments` | Fetch comments on questions (and on answers if `includeAnswers` is set) |
| `maxItems` | Cap per target |
| `apiKey` | (optional) Free key from https://stackapps.com; raises the daily quota from 300 to 10,000 requests |
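Before starting a large run, it can help to sanity-check the input locally. The sketch below is not part of the actor; it only encodes what the table above says, plus two labeled assumptions: that each mode needs its matching target field to be non-empty, and that `top` mode needs no target list. `check_input` and `REQUIRED_BY_MODE` are hypothetical names.

```python
from datetime import date

# Target field each mode reads, per the input table.
# Assumption: "top" needs no target list.
REQUIRED_BY_MODE = {"tag": "tags", "search": "searchQueries", "user": "userIds", "top": None}
SORTS = {"activity", "creation", "votes", "hot", "week", "month"}


def check_input(cfg: dict) -> list[str]:
    """Return human-readable problems with an input dict; an empty list means none found."""
    problems = []

    mode = cfg.get("mode")
    if mode not in REQUIRED_BY_MODE:
        problems.append(f"unknown mode: {mode!r}")
    else:
        field = REQUIRED_BY_MODE[mode]
        # Assumption: a mode with a target field needs at least one entry in it.
        if field and not cfg.get(field):
            problems.append(f"mode {mode!r} needs a non-empty {field!r}")

    if "sort" in cfg and cfg["sort"] not in SORTS:
        problems.append(f"unknown sort: {cfg['sort']!r}")

    parsed = {}
    for key in ("fromDate", "toDate"):
        if key in cfg:
            try:
                parsed[key] = date.fromisoformat(cfg[key])
            except (TypeError, ValueError):
                problems.append(f"{key} is not an ISO date: {cfg[key]!r}")

    if len(parsed) == 2 and parsed["fromDate"] > parsed["toDate"]:
        problems.append("fromDate is after toDate")

    return problems
```

Calling `check_input` on the Quick start input returns an empty list; an input with `"mode": "search"` and no `searchQueries` returns one problem string.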
📤 Output (per question)
{
  "site": "stackoverflow",
  "type": "question",
  "questionId": 12345678,
  "title": "How to do X in Python?",
  "body": "<p>HTML body</p>",
  "bodyMarkdown": "Markdown body",
  "tags": ["python", "machine-learning"],
  "score": 42,
  "viewCount": 9876,
  "answerCount": 3,
  "isAnswered": true,
  "acceptedAnswerId": 99999,
  "creationDate": "2026-04-15T...",
  "owner": { "userId": 123, "displayName": "user", "reputation": 50000, "profileUrl": "..." },
  "link": "https://stackoverflow.com/questions/12345678/...",
  "answers": [
    {
      "answerId": 99999,
      "body": "<p>Answer HTML</p>",
      "score": 78,
      "isAccepted": true,
      "creationDate": "...",
      "owner": { "userId": 456, "displayName": "expert", "reputation": 100000 }
    }
  ],
  "comments": [
    { "commentId": 555, "body": "Comment text", "score": 5, "creationDate": "...", "owner": {...} }
  ]
}
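For training datasets, a record like the one above usually has to be flattened into question/answer rows. The following is a minimal sketch of one way to do that, assuming the field names shown in the sample output. The helper names `best_answer` and `to_qa_row` are hypothetical, and preferring the accepted answer over the top-scored one is a design choice of this example, not something the actor does for you.

```python
from typing import Optional


def best_answer(record: dict) -> Optional[dict]:
    """Pick the accepted answer if present, otherwise the highest-scored one."""
    answers = record.get("answers") or []
    if not answers:
        return None
    for answer in answers:
        if answer.get("isAccepted"):
            return answer
    return max(answers, key=lambda a: a.get("score", 0))


def to_qa_row(record: dict) -> Optional[dict]:
    """Flatten one scraped question into a question/answer row, or None if it has no answers."""
    answer = best_answer(record)
    if answer is None:
        return None
    return {
        "questionId": record["questionId"],
        "title": record["title"],
        # Prefer the Markdown body; fall back to the HTML body.
        "question": record.get("bodyMarkdown") or record.get("body", ""),
        "answer": answer["body"],
        "answerScore": answer.get("score", 0),
        "accepted": bool(answer.get("isAccepted")),
    }
```

Records scraped without `includeAnswers` have no `answers` list, so `to_qa_row` returns `None` for them and they can be filtered out.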
🎯 Use cases
| Who | Why |
|---|---|
| 🤖 AI / LLM teams | Best-in-class technical Q&A for fine-tuning code/expert models |
| 📊 Dev relations | Track which language/framework tags are heating up |
| 🎓 Researchers | Code-discussion datasets, error-pattern analysis |
| 🛠️ Tool builders | Mine common pain points around your stack |
⚙️ Tech notes
- Uses Stack Exchange API v2.3
- Without API key: 300 requests/day quota (per IP)
- With free API key: 10,000 requests/day — strongly recommended for large pulls
- Auto-honors the `backoff` field if the API asks us to slow down
- Includes both the HTML `body` and the Markdown body (`body_markdown` in the API, `bodyMarkdown` in the output)
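To illustrate what honoring `backoff` means in practice, here is a rough sketch of a paging loop. It is not the actor's code. It relies on the Stack Exchange API response wrapper fields `items`, `has_more`, `quota_remaining`, and `backoff` (seconds to wait before the next request); `paged_items` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever performs the HTTP request for a given page number.

```python
import time
from typing import Callable, Iterator


def paged_items(
    fetch_page: Callable[[int], dict],
    sleep: Callable[[float], None] = time.sleep,
    max_pages: int = 100,
) -> Iterator[dict]:
    """Yield items page by page, pausing whenever the API returns a backoff value."""
    for page in range(1, max_pages + 1):
        wrapper = fetch_page(page)
        yield from wrapper.get("items", [])

        # Stop when there is nothing more to fetch or the daily quota is spent.
        if not wrapper.get("has_more") or wrapper.get("quota_remaining", 1) <= 0:
            break

        # The API asks clients to wait this many seconds before the next call.
        backoff = wrapper.get("backoff", 0)
        if backoff:
            sleep(backoff)
```

Passing `sleep` as a parameter keeps the loop easy to test with canned pages and no real waiting.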
❓ FAQ
Where do I get an API key? Free at https://stackapps.com/apps/oauth/register — fill in any name, leave most fields blank. Takes 30 seconds.
Are deleted/closed questions included? The API skips deleted; closed questions are returned but flagged in the data.
Schedule it? Yes. A daily pull of new questions on your favorite tags is a good fit for an Apify Schedule.
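For a daily pull, the main thing that changes between runs is `fromDate`. A small sketch of building a rolling one-day input follows. It assumes that whatever starts the run can compute the input each time; `daily_input` is a hypothetical helper, and the chosen `sort` and `maxItems` values are only examples.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def daily_input(tags: Iterable[str], today: Optional[date] = None, days_back: int = 1) -> dict:
    """Build an actor input covering questions created since `days_back` days ago."""
    today = today or date.today()
    return {
        "site": "stackoverflow",
        "mode": "tag",
        "tags": list(tags),
        "sort": "creation",  # example choice: newest questions first
        "fromDate": (today - timedelta(days=days_back)).isoformat(),
        "includeAnswers": False,
        "maxItems": 500,  # example cap, as in the Quick start
    }
```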