Github Pull Request Scraper Api

Under maintenance

Pricing: $4.99/month + usage

Extract GitHub pull requests with commits, reviews, comments & merge data. Monitor PR velocity, track code review metrics, analyze team productivity. Export to JSON/CSV for DevOps analytics, CI/CD automation & reporting. No API token needed. Fast Playwright scraper for developers & managers.

Rating: 0.0 (0)

Developer: Brennan Crawford (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 1
  • Monthly active users: 1
  • Last modified: 5 months ago

GitHub Pull Request Scraper & API

Extract pull requests, reviews, comments, commits, and merge status from any GitHub repository. Monitor PR activity, analyze code review metrics, and track contribution statistics with this fast, production-ready Playwright scraper.

🚀 Key Features

  • Comprehensive PR Data: Extract titles, states, authors, timestamps, labels, and more
  • Detailed Statistics: Get commits count, changed files, additions/deletions, and review metrics
  • Flexible Filtering: Filter by PR state (all, open, or closed) and cap the number of results
  • Fast & Reliable: Built with Playwright for stable, production-ready scraping
  • Export Ready: Output to JSON or CSV for analytics, dashboards, and integrations
  • No Authentication Required: Scrape public repositories without GitHub API tokens

📊 Use Cases

  • DevOps Analytics: Track PR velocity, review times, and team productivity
  • Code Review Monitoring: Monitor PR activity and review patterns
  • Open Source Insights: Analyze contribution patterns in OSS projects
  • Team Metrics: Generate reports on code review efficiency
  • Competitive Intelligence: Track development activity in competitor repositories
  • Research & Analysis: Study PR trends and collaboration patterns

🎯 What Gets Scraped

Each pull request includes:

  • Basic Info: Title, number, state (Open/Closed/Merged), URL
  • Author Details: Username and profile URL
  • Timestamps: Created, updated, merged, and closed dates
  • Statistics: Comments count, commits count, changed files
  • Code Changes: Lines added and deleted
  • Metadata: Labels, reviewers, base/head branches
  • Repository: Owner and repo name

📝 Input Configuration

{
  "repositoryUrl": "https://github.com/microsoft/vscode",
  "state": "all",
  "maxPRs": 50,
  "includeDetails": true
}

Input Parameters

  • repositoryUrl (required): Full GitHub repository URL
  • state (optional): Filter PRs by state - all, open, or closed (default: all)
  • maxPRs (optional): Maximum number of PRs to scrape, 1-500 (default: 50)
  • includeDetails (optional): Include detailed stats like commits and file changes (default: true)
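
The parameter rules above can be checked locally before starting a run. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, and the strict `https://github.com/<owner>/<repo>` URL check is an assumption about what counts as a "full GitHub repository URL".

```python
from urllib.parse import urlparse

# States listed in the Input Parameters section.
ALLOWED_STATES = {"all", "open", "closed"}


def build_input(repository_url, state="all", max_prs=50, include_details=True):
    """Return an input dict for the scraper, or raise ValueError.

    Hypothetical helper: the URL check below is an assumption, not
    documented actor behavior.
    """
    parsed = urlparse(repository_url)
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.scheme != "https" or parsed.netloc != "github.com" or len(parts) != 2:
        raise ValueError(f"not a GitHub repository URL: {repository_url!r}")
    if state not in ALLOWED_STATES:
        raise ValueError(f"state must be one of {sorted(ALLOWED_STATES)}")
    if not isinstance(max_prs, int) or not 1 <= max_prs <= 500:
        raise ValueError("maxPRs must be an integer between 1 and 500")
    return {
        "repositoryUrl": repository_url,
        "state": state,
        "maxPRs": max_prs,
        "includeDetails": bool(include_details),
    }
```

Calling `build_input("https://github.com/microsoft/vscode")` returns the same dictionary as the example configuration above, with the documented defaults filled in.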

📤 Output Format

{
  "title": "Add support for TypeScript 5.0",
  "number": 12345,
  "state": "Merged",
  "author": "username",
  "author_url": "https://github.com/username",
  "created_at": "2024-01-15T10:30:00Z",
  "merged_at": "2024-01-20T14:45:00Z",
  "comments_count": 15,
  "commits_count": 8,
  "changed_files": 23,
  "additions": 456,
  "deletions": 123,
  "labels": ["enhancement", "typescript"],
  "reviewers": ["reviewer1", "reviewer2"],
  "url": "https://github.com/microsoft/vscode/pull/12345",
  "repository": "microsoft/vscode",
  "base_branch": "main",
  "head_branch": "feature/ts5-support"
}
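
As an illustration of how a record like this might feed a review metric, the sketch below computes the time from creation to merge. The helper names are hypothetical, and it assumes that unmerged PRs either omit `merged_at` or set it to null, which the sample above does not confirm.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as '2024-01-15T10:30:00Z'."""
    # Swap the trailing 'Z' for an explicit offset so this also works
    # on Python versions older than 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def hours_to_merge(pr):
    """Return hours between creation and merge, or None if not merged."""
    merged_at = pr.get("merged_at")  # assumption: missing/None when unmerged
    if not merged_at:
        return None
    delta = parse_timestamp(merged_at) - parse_timestamp(pr["created_at"])
    return delta.total_seconds() / 3600


# Fields copied from the sample record above.
sample = {
    "number": 12345,
    "state": "Merged",
    "created_at": "2024-01-15T10:30:00Z",
    "merged_at": "2024-01-20T14:45:00Z",
    "additions": 456,
    "deletions": 123,
}

print(hours_to_merge(sample))                     # 124.25
print(sample["additions"] + sample["deletions"])  # 579 lines changed
```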

🔧 How to Use

  1. Create a free Apify account at apify.com
  2. Search for "GitHub Pull Request Scraper" in Apify Store
  3. Configure input: Add repository URL and optional filters
  4. Click Start and wait for results
  5. Export data: Download as JSON, CSV, or integrate via API
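
If you download the results as JSON and want a CSV with a custom layout, a local conversion is straightforward. This is a minimal sketch using only the Python standard library. `records_to_csv` is a hypothetical helper, and joining list fields such as `labels` and `reviewers` with `;` is an assumption that may differ from the platform's own CSV export.

```python
import csv
import io

# Fields that hold lists in the output format.
LIST_FIELDS = ("labels", "reviewers")


def records_to_csv(records):
    """Serialize PR records to CSV text, joining list fields with ';'."""
    if not records:
        return ""
    # Use the first record's keys as the column order.
    fieldnames = list(records[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # skip keys that the first record lacks
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        row = dict(record)
        for field in LIST_FIELDS:
            if isinstance(row.get(field), list):
                row[field] = ";".join(row[field])
        writer.writerow(row)
    return buffer.getvalue()
```

Write the returned string to a file with `open("prs.csv", "w", newline="")` so the line endings stay as produced.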

⚡ Performance

  • Scrapes 50 PRs in ~2-3 minutes
  • Handles repositories with thousands of PRs
  • Optimized for speed with Playwright
  • Automatic retry on network errors

🛠️ Technical Details

  • Runtime: Python 3.11 with Playwright
  • Browser: Chromium (headless)
  • Rate Limiting: Respectful scraping with delays
  • Error Handling: Robust error recovery and logging

💡 Pro Tips

  • Set includeDetails: false for faster scraping if you don't need commit/file stats
  • Use state: "open" to monitor active PRs only
  • Increase maxPRs for comprehensive historical analysis
  • Schedule regular runs to track PR trends over time
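
To illustrate the trend-tracking tip, the sketch below counts merged PRs per ISO week from a list of output records. `merged_per_week` is a hypothetical helper, and it assumes that unmerged PRs have a missing or null `merged_at`.

```python
from collections import Counter
from datetime import datetime


def merged_per_week(records):
    """Count merged PRs per ISO week, keyed like '2024-W03'."""
    counts = Counter()
    for pr in records:
        merged_at = pr.get("merged_at")  # assumption: missing/None when unmerged
        if not merged_at:
            continue
        moment = datetime.fromisoformat(merged_at.replace("Z", "+00:00"))
        year, week, _ = moment.isocalendar()
        counts[f"{year}-W{week:02d}"] += 1
    # Sort by key so weeks appear in chronological order.
    return dict(sorted(counts.items()))
```

Running this on the results of each scheduled run gives a simple week-by-week view of merge throughput.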

📊 Integration Examples

Slack Notifications

Monitor new PRs and send alerts to your team channel
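
One possible shape for such an alert, as a sketch only: compare the latest results with PR numbers seen in earlier runs, then build the JSON body for a Slack incoming webhook, which accepts a `text` field. Both helpers are hypothetical. Storing the seen numbers between runs and posting the body to your webhook URL are left out.

```python
import json


def find_new_prs(records, seen_numbers):
    """Return records whose PR number is not in seen_numbers."""
    return [pr for pr in records if pr["number"] not in seen_numbers]


def slack_payload(new_prs):
    """Build a JSON body for a Slack incoming webhook from new PRs."""
    lines = [
        f"#{pr['number']} {pr['title']} by {pr['author']}: {pr['url']}"
        for pr in new_prs
    ]
    return json.dumps({"text": "New pull requests:\n" + "\n".join(lines)})
```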

Analytics Dashboards

Feed PR data into Tableau, PowerBI, or custom dashboards

CI/CD Pipelines

Trigger workflows based on PR activity

Research Projects

Analyze OSS development patterns and collaboration

🔒 Privacy & Compliance

  • Only scrapes publicly available data
  • No authentication or API tokens required
  • Respects GitHub's robots.txt
  • Compliant with GitHub's Terms of Service for public data

🆘 Support

Need help? Have questions?

📜 License

This actor is available under the Apache 2.0 license.


Built with ❤️ using Apify and Playwright

Keywords: GitHub scraper, pull request scraper, PR analytics, code review metrics, GitHub API alternative, DevOps tools, repository analytics, open source insights, contribution tracking, GitHub data extraction