# Developer Tools Scraper (`datapilot/developer-tools-scraper`) Actor

Package & Developer Ecosystem Scraper collects package, extension, and repository data from PyPI, npm, VS Code Marketplace, and GitHub. It extracts names, versions, descriptions, authors, licenses, downloads, ratings, keywords, and URLs. Ideal for developer research, trend analysis, and lead generation.

- **URL**: https://apify.com/datapilot/developer-tools-scraper.md
- **Developed by:** [Data Pilot](https://apify.com/datapilot) (community)
- **Categories:** Developer tools, AI
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 scraped results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
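
As a rough illustration of the per-result part of the price, the sketch below multiplies a result count by the listed starting rate. The function name and the default rate are assumptions for illustration only: the price is listed as "from $3.00", and platform usage is billed on top, so the figure is best read as a lower bound.

```python
def estimate_result_cost(num_results: int, usd_per_1000: float = 3.00) -> float:
    """Estimate the per-result charge only; platform usage is billed separately."""
    if num_results < 0:
        raise ValueError("num_results must be non-negative")
    return round(num_results * usd_per_1000 / 1000, 2)


if __name__ == "__main__":
    # Example: a run that produced 392 results at the listed starting rate.
    print(estimate_result_cost(392))
```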

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Developer Tools Scraper

🛠️ **Developer Tools Scraper** is an Apify Actor that discovers and aggregates developer tool data from PyPI, npm, VS Code Marketplace, and GitHub. For each package, extension, or repository it returns descriptions, authors, licenses, and usage metrics, which makes it useful for researching tools, comparing packages, or building developer intelligence.

The Actor queries the selected sources concurrently, deduplicates the results, and pushes each record to the Apify Dataset. Records focus on downloads, stars, ratings, and metadata, so they can feed tool research and technology stack evaluation.

---

### 📋 Table of Contents

- [Features](#-features)
- [Data Sources](#-data-sources)
- [How It Works](#-how-it-works)
- [Input](#-input)
- [Output](#-output)
- [Technical Stack](#-technical-stack)
- [Data Fields](#-data-fields)
- [Source Comparison](#-source-comparison)
- [Use Cases](#-use-cases)
- [Quick Start](#-quick-start)
- [Configuration](#-configuration)
- [Performance](#-performance)
- [Changelog](#-changelog)
- [Support](#-support)

---

### 🔥 Features

- **Multi-Source Aggregation** – Search **Developer Tools** across PyPI, npm, VS Code, and GitHub simultaneously.
- **PyPI Integration** – Discover Python packages and libraries from the official Python Package Index.
- **npm Integration** – Search JavaScript/Node.js packages from the npm registry.
- **VS Code Marketplace** – Find VS Code extensions and developer tools.
- **GitHub Discovery** – Search open-source repositories on GitHub.
- **Concurrent Fetching** – Multi-threaded concurrent requests to all sources.
- **Detail Enrichment** – Fetch comprehensive metadata for each **Developer Tools** item.
- **Author Information** – Extract author/publisher information across sources.
- **License Extraction** – Capture license information for compliance.
- **Download Metrics** – Includes download counts, stars, ratings where available.
- **Rating Aggregation** – Captures ratings and review counts from VS Code.
- **Keyword Matching** – Extracts keywords and tags for categorization.
- **Homepage URLs** – Captures project homepages and repositories.
- **Version Tracking** – Records current version information.
- **Creation Dates** – Includes package/project creation and update dates.
- **Deduplication** – Removes duplicates across sources.
- **Proxy Support** – Apify residential proxy support for reliable access.
- **GitHub Token Support** – Optional GitHub API token for higher rate limits.
- **Real-Time Dataset Push** – Pushes results to Apify Dataset with metadata.
- **Timestamp Recording** – Records scrape timestamp for audit trails.
- **Error Handling** – Graceful error handling with detailed logging.
- **Asyncio-Friendly** – Non-blocking async/await architecture.

---

### 🌍 Data Sources

#### **1. PyPI (Python Package Index)**

- **Coverage**: 500,000+ Python packages
- **Search**: Text-based package search
- **Metrics**: Download count, version info
- **Data**: Author, license, requirements, keywords
- **URL Format**: https://pypi.org/project/{name}/

#### **2. npm Registry**

- **Coverage**: 2,000,000+ JavaScript packages
- **Search**: Full-text search with pagination
- **Metrics**: Monthly downloads, version info
- **Data**: Author, maintainers, keywords, license
- **URL Format**: https://www.npmjs.com/package/{name}

#### **3. VS Code Marketplace**

- **Coverage**: 50,000+ extensions
- **Search**: Extension search via official API
- **Metrics**: Install count, rating (0-5), rating count
- **Data**: Publisher, categories, keywords, updated date
- **URL Format**: https://marketplace.visualstudio.com/items?itemName={publisher}.{name}

#### **4. GitHub**

- **Coverage**: 200,000,000+ repositories
- **Search**: Repository search via GitHub API
- **Metrics**: Stars, forks, open issues
- **Data**: Owner, license, topics, language, created date
- **URL Format**: https://github.com/{owner}/{repo}
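
The URL formats listed above can be filled in programmatically. The following is a minimal sketch, not the Actor's own code: the `URL_TEMPLATES` mapping and the `build_url` helper are hypothetical names, and the templates are copied from the "URL Format" lines of each source.

```python
from urllib.parse import quote

# URL templates copied from the "URL Format" lines of each source.
URL_TEMPLATES = {
    "pypi": "https://pypi.org/project/{name}/",
    "npm": "https://www.npmjs.com/package/{name}",
    "vscode": "https://marketplace.visualstudio.com/items?itemName={publisher}.{name}",
    "github": "https://github.com/{owner}/{repo}",
}


def build_url(source: str, **parts: str) -> str:
    """Fill the template for `source`; raises KeyError on an unknown source or missing part."""
    template = URL_TEMPLATES[source]
    # Percent-encode each part but keep "/" and "@" so scoped npm names stay readable.
    safe_parts = {key: quote(value, safe="/@") for key, value in parts.items()}
    return template.format(**safe_parts)


if __name__ == "__main__":
    print(build_url("pypi", name="pytest"))
    print(build_url("vscode", publisher="ms-python", name="python"))
```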

---

### ⚙️ How It Works

The Developer Tools Scraper accepts a keyword and searches across multiple **Developer Tools** sources simultaneously. It uses concurrent fetching with ThreadPoolExecutor to query PyPI, npm, VS Code Marketplace, and GitHub in parallel. Each source returns **Developer Tools** items which are then enriched with additional metadata through follow-up API calls. Results are deduplicated and pushed to the Apify Dataset.

**Key Processing Steps:**

1. **Input Parsing** – Accept keyword and source selection
2. **Proxy Setup** – Configure Apify residential proxy if available
3. **Session Creation** – Create HTTP session with headers
4. **Concurrent Source Queries** – Launch 4 concurrent fetch tasks
5. **PyPI Search** – Search and scrape PyPI packages
6. **npm Search** – Query npm registry API
7. **VS Code Search** – Search VS Code Marketplace API
8. **GitHub Search** – Query GitHub repositories API
9. **Metadata Enrichment** – Fetch additional details for items
10. **Data Aggregation** – Combine results from all sources
11. **Deduplication** – Remove duplicate items by source+name
12. **Result Formatting** – Format as structured dataset records
13. **Dataset Push** – Push individual records to Apify Dataset
14. **Completion** – Log summary statistics
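
The core of these steps (parallel source queries, aggregation, and deduplication by source and name) can be sketched as follows. This is an illustrative outline under stated assumptions, not the Actor's source code: `fetch_pypi`, `fetch_npm`, `FETCHERS`, and `collect` are hypothetical names, and the fetchers are stubs that return fixed data instead of calling the real APIs.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


def fetch_pypi(keyword):  # hypothetical stub standing in for a real PyPI query
    return [{"source": "PyPI", "name": "pytest"}, {"source": "PyPI", "name": "pytest"}]


def fetch_npm(keyword):  # hypothetical stub standing in for a real npm query
    return [{"source": "npm", "name": "jest"}]


FETCHERS = {"pypi": fetch_pypi, "npm": fetch_npm}


def collect(keyword, sources):
    """Query the selected sources in parallel, then deduplicate by (source, name)."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(FETCHERS[s], keyword) for s in sources]
        batches = [f.result() for f in futures]

    seen, unique = set(), []
    scraped_at = datetime.now(timezone.utc).isoformat()
    for item in (i for batch in batches for i in batch):
        key = (item["source"], item["name"])
        if key in seen:
            continue  # same source and name already recorded
        seen.add(key)
        unique.append({**item, "keyword": keyword, "scraped_at": scraped_at})
    return unique


if __name__ == "__main__":
    for record in collect("testing framework", ["pypi", "npm"]):
        print(record)
```

Because the deduplication key includes the source, a package published under the same name on two platforms is kept as two records.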

**Key Benefits:**

- Discover **Developer Tools** across multiple platforms
- Compare packages across ecosystems
- Find tools for specific use cases
- Evaluate tool popularity and maturity
- Track developer tool trends
- Research alternatives and competitors

---

### 📥 Input

The Actor accepts the following input parameters:

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `keyword` | string | required | Search keyword for **Developer Tools** discovery |
| `sources` | array | `["pypi","npm","vscode","github"]` | Sources to search: "pypi", "npm", "vscode", "github" |
| `maxPages` | integer | `3` | Maximum pages per source (for paginated sources) |
| `useApifyProxy` | boolean | `true` | Enable Apify residential proxies |
| `apifyProxyGroups` | array | `["RESIDENTIAL"]` | Proxy group configuration |

**Example Input:**

```json
{
  "keyword": "testing framework",
  "sources": ["pypi", "npm", "vscode", "github"],
  "maxPages": 3,
  "useApifyProxy": true
}
````

**Python-Focused Example (PyPI and GitHub):**

```json
{
  "keyword": "async",
  "sources": ["pypi", "github"],
  "maxPages": 2
}
```

**JavaScript-Focused Example (npm and VS Code):**

```json
{
  "keyword": "react",
  "sources": ["npm", "vscode"],
  "maxPages": 3
}
```
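
When inputs are generated by a script, it can help to check them before starting a run. The sketch below is one possible approach: `build_input` and `ALLOWED_SOURCES` are hypothetical names, and the limits (required `keyword`, four allowed sources, `maxPages` between 1 and 10) are taken from the Actor input schema.

```python
ALLOWED_SOURCES = ("pypi", "npm", "vscode", "github")


def build_input(keyword, sources=ALLOWED_SOURCES, max_pages=3, use_proxy=True):
    """Return an input dict for the Actor, rejecting values the schema would not accept."""
    if not keyword or not keyword.strip():
        raise ValueError("keyword is required")
    unknown = [s for s in sources if s not in ALLOWED_SOURCES]
    if unknown:
        raise ValueError(f"unknown sources: {unknown}")
    if not 1 <= max_pages <= 10:
        raise ValueError("maxPages must be between 1 and 10")
    return {
        "keyword": keyword.strip(),
        # dict.fromkeys drops repeated sources while keeping their order
        "sources": list(dict.fromkeys(sources)),
        "maxPages": max_pages,
        "useApifyProxy": use_proxy,
    }


if __name__ == "__main__":
    print(build_input("async", ["pypi", "github"], max_pages=2))
```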

***

### 📤 Output

The Actor pushes **Developer Tools** records with the following structure:

| Field | Type | Description |
|-------|------|-------------|
| `source` | string | Source platform (PyPI, npm, VS Code, GitHub) |
| `keyword` | string | Search keyword used |
| `name` | string | **Developer Tools** package/extension/repo name |
| `version` | string | Current version or latest release |
| `description` | string | Tool description or summary |
| `author` | string | Author, publisher, or repository owner |
| `author_email` | string | Author email if available |
| `contributors` | string | Additional contributors |
| `license` | string | License type (MIT, Apache 2.0, etc.) |
| `homepage` | string | Project homepage or repository URL |
| `downloads` | string | Download/installation/star count |
| `created` | string | Creation or initial release date |
| `requires_python` | string | Python version requirement (PyPI) |
| `keywords` | string | Keywords or tags |
| `url` | string | Direct link to **Developer Tools** page |
| `scraped_at` | string | ISO 8601 scrape timestamp |
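| `rating` | string | Rating 0-5 (VS Code only) |
| `rating_count` | string | Number of ratings (VS Code only) |
| `categories` | string | Categories (VS Code only) |
| `language` | string | Programming language (GitHub only) |
| `forks` | string | Fork count (GitHub only) |
| `open_issues` | string | Open issue count (GitHub only) |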

**Example Output Record (PyPI):**

```json
{
  "source": "PyPI",
  "keyword": "testing framework",
  "name": "pytest",
  "version": "7.4.2",
  "description": "pytest: simple powerful testing with Python",
  "author": "Holger Krekel",
  "author_email": "holger@pytest.org",
  "license": "MIT",
  "homepage": "https://docs.pytest.org/en/stable/",
  "downloads": "",
  "created": "2023-09-20",
  "requires_python": ">=3.7",
  "keywords": "test, unittest, pytest",
  "url": "https://pypi.org/project/pytest/",
  "scraped_at": "2025-02-14T12:00:00"
}
```

**Example Output Record (npm):**

```json
{
  "source": "npm",
  "keyword": "react",
  "name": "react",
  "version": "18.2.0",
  "description": "React is a JavaScript library for building user interfaces.",
  "author": "Facebook",
  "author_email": "opensource@fb.com",
  "contributors": "Dan Abramov, Jordan Walke, Sophie Alpert",
  "license": "MIT",
  "homepage": "https://react.dev/",
  "downloads": "20485392",
  "created": "2023-08-15",
  "keywords": "react, javascript, frontend, ui",
  "url": "https://www.npmjs.com/package/react",
  "scraped_at": "2025-02-14T12:00:00"
}
```

**Example Output Record (VS Code):**

```json
{
  "source": "VS Code Marketplace",
  "keyword": "python",
  "name": "ms-python.python",
  "version": "2024.2.1",
  "description": "IntelliSense (Pylance), linting, debugging, testing, formatting, refactoring, variable explorer, test explorer, code navigation, and more.",
  "author": "Microsoft",
  "homepage": "https://marketplace.visualstudio.com/items?itemName=ms-python.python",
  "downloads": "45000000",
  "created": "2023-02-14",
  "keywords": "python, linting, debugging, testing",
  "rating": "4.8",
  "rating_count": "8932",
  "categories": "Programming Languages, Linters, Debuggers",
  "url": "https://marketplace.visualstudio.com/items?itemName=ms-python.python",
  "scraped_at": "2025-02-14T12:00:00"
}
```

**Example Output Record (GitHub):**

```json
{
  "source": "GitHub",
  "keyword": "cli",
  "name": "cli/cli",
  "description": "GitHub's official command line tool",
  "author": "cli",
  "license": "MIT",
  "homepage": "https://github.com/cli/cli",
  "downloads": "42000",
  "created": "2020-02-07",
  "keywords": "cli, github, command-line",
  "language": "Go",
  "forks": "3200",
  "open_issues": "87",
  "url": "https://github.com/cli/cli",
  "scraped_at": "2025-02-14T12:00:00"
}
```
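
Every field in these records is a string, including counts and ratings, and some values are empty (for example `downloads` in the PyPI record). A small post-processing step can make the records easier to analyze. The helpers below are a sketch with hypothetical names (`to_number`, `normalize`, `NUMERIC_FIELDS`), assuming the string formats shown in the examples above.

```python
def to_number(value):
    """Convert a numeric string to int or float; return None for empty or non-numeric text."""
    text = "" if value is None else str(value).strip().replace(",", "")
    if not text:
        return None
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return None


NUMERIC_FIELDS = ("downloads", "rating", "rating_count", "forks", "open_issues")


def normalize(record):
    """Return a copy of `record` with numeric fields parsed and keywords split into a list."""
    out = dict(record)
    for field in NUMERIC_FIELDS:
        if field in out:
            out[field] = to_number(out[field])
    if "keywords" in out:
        out["keywords"] = [k.strip() for k in (out["keywords"] or "").split(",") if k.strip()]
    return out


if __name__ == "__main__":
    print(normalize({"name": "cli/cli", "downloads": "42000", "forks": "3200",
                     "keywords": "cli, github, command-line"}))
```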

***

### 🧰 Technical Stack

- **HTTP Requests:** requests library with session management
- **APIs:** PyPI JSON, npm Registry, VS Code Marketplace, GitHub REST
- **HTML Parsing:** BeautifulSoup4 for PyPI scraping
- **Concurrent Execution:** ThreadPoolExecutor for parallel fetching
- **Async:** asyncio for Actor integration
- **Proxy:** Apify Proxy with RESIDENTIAL configuration
- **Logging:** Apify Actor logging system
- **Platform:** Apify Actor serverless environment
- **Timeouts:** 8-15 seconds per API request

***

### 📊 Data Fields Explained

#### **Tool Identification**

- **source**: Which platform (PyPI, npm, VS Code, GitHub)
- **name**: Official package/extension/repo name
- **keyword**: Search keyword used

#### **Metadata**

- **version**: Current or latest version
- **description**: Tool description/summary
- **homepage**: Official website or repo

#### **Author Information**

- **author**: Creator/publisher/maintainer name
- **author\_email**: Author contact email
- **contributors**: Additional team members

#### **Licensing & Legal**

- **license**: License type for usage

#### **Engagement Metrics**

- **downloads**: Downloads/installs/stars count
- **rating**: Quality rating (VS Code)
- **rating\_count**: Number of ratings

#### **Technical Details**

- **requires\_python**: Python version needs
- **language**: Programming language (GitHub)
- **keywords**: Tags and categories

#### **Temporal**

- **created**: Creation date
- **scraped\_at**: When data was collected

***

### 🔄 Source Comparison

| Aspect | PyPI | npm | VS Code | GitHub |
|--------|------|-----|---------|--------|
| **Type** | Python Packages | JS Packages | Extensions | Repositories |
| **Coverage** | 500K+ | 2M+ | 50K+ | 200M+ |
| **Metrics** | Downloads | Monthly DL | Installs/Rating | Stars/Forks |
| **Language Focus** | Python | JavaScript | All | All |
| **Auth Info** | Author/Email | Publisher | Publisher | Owner |
| **License** | Included | Included | Limited | Included |
| **Search** | Web Scrape | JSON API | POST API | REST API |

***

### 🎯 Use Cases

- **Technology Stack Research** – Research **Developer Tools** for tech stack decisions
- **Package Comparison** – Compare packages across ecosystems
- **Dependency Analysis** – Analyze **Developer Tools** for projects
- **Tool Discovery** – Find new **Developer Tools** for specific needs
- **Trend Analysis** – Track **Developer Tools** popularity trends
- **Market Research** – Analyze developer tool ecosystem
- **Competitive Analysis** – Monitor competing **Developer Tools**
- **Alternative Finding** – Find alternatives to existing tools
- **Quality Assessment** – Evaluate **Developer Tools** maturity
- **Integration Planning** – Plan tool integrations for projects
- **Library Selection** – Choose libraries for development
- **Extension Curation** – Find VS Code extensions
- **Open Source Discovery** – Discover open source **Developer Tools**
- **Skill Development** – Find tools for learning
- **Vendor Evaluation** – Evaluate tool vendors and publishers

***

### 🚀 Quick Start

#### **1. Prepare Input**

Go to Apify Console and enter:

```json
{
  "keyword": "testing framework",
  "sources": ["pypi", "npm", "vscode", "github"],
  "maxPages": 3,
  "useApifyProxy": true
}
```

#### **2. Run the Actor**

Click the **Start** button. The Actor will:

- Search PyPI, npm, VS Code, GitHub concurrently
- Enrich results with metadata
- Deduplicate across sources
- Push to Dataset

#### **3. Monitor Progress**

The run log shows output similar to:

```
Keyword: 'testing framework' | Sources: ['pypi', 'npm', 'vscode', 'github']
Proxy active: RESIDENTIAL
[PyPI] Fetching...
[npm] Fetching...
[VS Code] Fetching...
[GitHub] Fetching...
  PyPI pages scraped: 45 packages
  npm packages found: 256
  VS Code extensions found: 18
  GitHub repos found: 89
Total unique items: 392
All done!
```

#### **4. View & Download Results**

- **Results Tab**: All **Developer Tools** records
- **Export**: JSON, CSV, Excel
- **Filter**: By source or language
- **Sort**: By downloads or rating
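
The same filtering and sorting can be done on exported records in code. The function below is a minimal sketch with a hypothetical name (`top_by_downloads`); it assumes `downloads` is a string of digits or empty, as in the example records. Note that `downloads` holds a different metric per source (installs for VS Code, and a star count in the GitHub example), so ranking is most meaningful within a single source.

```python
def top_by_downloads(records, source=None, limit=10):
    """Return up to `limit` records, optionally from one source, ordered by downloads descending."""
    def downloads(record):
        raw = str(record.get("downloads") or "")
        return int(raw) if raw.isdigit() else -1  # records without a count sort last

    selected = [r for r in records if source is None or r.get("source") == source]
    return sorted(selected, key=downloads, reverse=True)[:limit]


if __name__ == "__main__":
    rows = [
        {"source": "npm", "name": "a", "downloads": "10"},
        {"source": "PyPI", "name": "b", "downloads": ""},
        {"source": "npm", "name": "c", "downloads": "200"},
    ]
    print([r["name"] for r in top_by_downloads(rows, source="npm")])
```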

***

### ⚙️ Configuration

#### **Single Source**

Python only:

```json
{
  "keyword": "async",
  "sources": ["pypi"]
}
```

#### **Multiple Sources**

Python and JavaScript:

```json
{
  "keyword": "logging",
  "sources": ["pypi", "npm"]
}
```

#### **Page Limits**

Quick search (1 page):

```json
{
  "maxPages": 1
}
```

Comprehensive (5 pages):

```json
{
  "maxPages": 5
}
```

***

### 📈 Performance

#### **Processing Speed**

- ~30-60 seconds for all 4 sources
- ~100-200 tools discovered per search
- Concurrent fetching saves significant time
- Metadata enrichment adds ~10-20 seconds

#### **Resource Usage**

- Memory: ~80-150MB
- CPU: ~30-40% during concurrent fetching
- Network: ~2-5MB per search
- API calls: ~50-100 depending on sources

#### **Concurrency**

- 4 source fetchers running in parallel
- 20 metadata fetchers per source
- ThreadPoolExecutor for efficient threading

#### **Data Quality**

- **Completeness**: Results vary by source
- **Freshness**: Real-time data from APIs
- **Accuracy**: Reflects official source data
- **Deduplication**: Removes same-name duplicates
- **Verification**: Always verify with official sources

#### **Best Practices**

- Set reasonable page limits
- Use residential proxies
- Respect API rate limits
- Verify tool quality independently
- Check licenses before use
- Review security for critical tools
- Follow tool documentation
- Monitor for deprecation

***

### 📦 Changelog

#### v1.0.0 (February 2025)

**Initial Release:**

- PyPI package search and scraping
- npm registry API integration
- VS Code Marketplace API integration
- GitHub repository API search
- Multi-threaded concurrent fetching
- Metadata enrichment for all sources
- Author and contributor extraction
- License information extraction
- Download/star/rating metric collection
- Keyword and tag extraction
- Homepage and URL capture
- Version tracking
- Creation date recording
- Deduplication across sources
- Apify proxy support
- GitHub API token support
- Real-time Dataset push
- ISO 8601 timestamp recording
- Comprehensive error handling
- Detailed progress logging
- ThreadPoolExecutor for concurrency

***

### 🧑‍💻 Support & Feedback

- **Issues:** Submit via Apify console with keyword
- **Documentation:** Check Actor details page
- **Community:** Apify forum discussions
- **Feature Requests:** Suggest new sources or features
- **Bug Reports:** Include keyword and error details

***

### 💾 Apify Integration

#### **Automatic Features**

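```python
from concurrent.futures import ThreadPoolExecutor

from apify import Actor

async def fetch_and_push(fetchers: dict, keyword: str) -> None:
    # Illustrative sketch: `fetchers` maps a source name to a fetch function.
    # Call this inside `async with Actor:`.
    # Concurrent source fetching: all 4 sources are fetched in parallel
    with ThreadPoolExecutor(max_workers=4) as ex:
        futures = {src: ex.submit(fn, keyword) for src, fn in fetchers.items()}
    for source, future in futures.items():
        results = future.result()
        # Real-time Dataset push
        for item in results:
            await Actor.push_data(item)
        # Progress logging
        Actor.log.info(f"  + {source} total: {len(results)}")
```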

#### **Output Access**

- **Results Tab**: All **Developer Tools** records
- **Export**: JSON, CSV, Excel
- **Filter**: By source or language
- **API**: Query via Apify API
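
For API access, dataset items can be listed over HTTP. The sketch below only builds the request URL for the dataset items endpoint of the Apify API; `dataset_items_url` is a hypothetical helper, and the dataset ID is a placeholder that a real run would supply as `defaultDatasetId`.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fields=None, limit=None, fmt="json"):
    """Build the URL for listing items of a dataset via the Apify API."""
    params = {"format": fmt}
    if fields:
        params["fields"] = ",".join(fields)  # return only the listed fields
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


if __name__ == "__main__":
    # "<DATASET_ID>" is a placeholder, not a real dataset.
    print(dataset_items_url("<DATASET_ID>", fields=["source", "name", "downloads"], limit=20))
```

When sending the request, pass your API token in an `Authorization: Bearer <YOUR_API_TOKEN>` header.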

***

### 📄 License & Legal

**Terms of Use:**

- Use for legitimate research and development
- Respect all source ToS and rate limits
- Respect tool authors and publishers
- Don't republish without attribution
- Comply with applicable laws
- Use data ethically and responsibly

**Disclaimer:**
Developer Tools Scraper is provided as-is for research purposes. Users are responsible for ensuring compliance with all source ToS and applicable laws. Always verify tool information with official sources.

***

### 🎉 Get Started Today

**Deploy now for Developer Tools research!**

Use for:

- 🔍 Tool Discovery
- 📊 Market Research
- 💡 Stack Planning
- 🔄 Comparison
- 📈 Trend Analysis

**Perfect for:**

- Developers
- Tech Leads
- Product Managers
- DevOps Engineers
- Data Scientists

***

**Last Updated:** February 2025\
**Version:** 1.0.0\
**Status:** Production Ready\
**Platform:** Apify Actor\
**Architecture:** Async/Await + ThreadPoolExecutor\
**Sources:** 4 (PyPI, npm, VS Code, GitHub)\
**Concurrency:** Parallel multi-source fetching

***

### 📚 Related Tools

- Website Technology Stack Scraper
- Google Keyword Finder
- Open Router Model Scraper
- Skill Curator Scraper

**Your complete Apify-powered Developer Tools discovery solution!** 🚀✨

***

### 🛠️ Developer Tools Excellence

This Actor is optimized for **Developer Tools** discovery with:

- ✅ Multi-source aggregation (4 sources)
- ✅ Concurrent API fetching
- ✅ Metadata enrichment
- ✅ Intelligent deduplication
- ✅ Comprehensive field extraction
- ✅ Real-time Dataset integration
- ✅ Error recovery
- ✅ Production-ready code

**Discover developer tools effortlessly!** 💎🚀

# Actor input Schema

## `keyword` (type: `string`):

Keyword to search across all selected sources (e.g. 'web scraping', 'machine learning').

## `sources` (type: `array`):

Select which platforms to search.

## `maxPages` (type: `integer`):

How many pages to fetch from PyPI, VS Code, and GitHub (1–10).

## `useApifyProxy` (type: `boolean`):

Recommended to avoid rate limiting.

## `apifyProxyGroups` (type: `array`):

RESIDENTIAL works best for PyPI scraping.

## Actor input object example

```json
{
  "keyword": "web scraping",
  "sources": [
    "pypi",
    "npm",
    "vscode",
    "github"
  ],
  "maxPages": 3,
  "useApifyProxy": true,
  "apifyProxyGroups": [
    "RESIDENTIAL"
  ]
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keyword": "web scraping"
};

// Run the Actor and wait for it to finish
const run = await client.actor("datapilot/developer-tools-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "keyword": "web scraping" }

# Run the Actor and wait for it to finish
run = client.actor("datapilot/developer-tools-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
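
If you prefer not to install the client library, the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification can be called with the standard library. The following is a sketch rather than an official recipe: `build_run_request` is a hypothetical helper that prepares the request without sending it. The specification lists `token` as a query parameter; this sketch instead sends it in an `Authorization` header, which the Apify API also accepts and which keeps the token out of URLs.

```python
import json
from urllib.request import Request

ENDPOINT = (
    "https://api.apify.com/v2/acts/"
    "datapilot~developer-tools-scraper/run-sync-get-dataset-items"
)


def build_run_request(token, run_input):
    """Prepare (but do not send) a POST request that runs the Actor synchronously."""
    return Request(
        ENDPOINT,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request("<YOUR_API_TOKEN>", {"keyword": "web scraping"})
    print(request.get_method(), request.full_url)
```

To execute the run, pass the prepared request to `urllib.request.urlopen` with a generous timeout and parse the JSON response body, which contains the dataset items.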

## CLI example

```bash
echo '{
  "keyword": "web scraping"
}' |
apify call datapilot/developer-tools-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=datapilot/developer-tools-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Developer Tools Scraper",
        "description": "Package & Developer Ecosystem Scraper collects package, extension, and repository data from PyPI, npm, VS Code Marketplace, and GitHub. Extracts names, versions, descriptions, authors, licenses, downloads, ratings, keywords, and URLs. Ideal for developer research, trend analysis, lead generation",
        "version": "0.0",
        "x-build-id": "XMU97Lz3Uuc5rxbiR"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/datapilot~developer-tools-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-datapilot-developer-tools-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/datapilot~developer-tools-scraper/runs": {
            "post": {
                "operationId": "runs-sync-datapilot-developer-tools-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/datapilot~developer-tools-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-datapilot-developer-tools-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "keyword"
                ],
                "properties": {
                    "keyword": {
                        "title": "Search Keyword",
                        "type": "string",
                        "description": "Keyword to search across all selected sources (e.g. 'web scraping', 'machine learning')."
                    },
                    "sources": {
                        "title": "Sources to scrape",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Select which platforms to search.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "pypi",
                                "npm",
                                "vscode",
                                "github"
                            ]
                        },
                        "default": [
                            "pypi",
                            "npm",
                            "vscode",
                            "github"
                        ]
                    },
                    "maxPages": {
                        "title": "Max pages per source",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How many pages to fetch from PyPI, VS Code, and GitHub (1–10).",
                        "default": 3
                    },
                    "useApifyProxy": {
                        "title": "Use Apify Proxy",
                        "type": "boolean",
                        "description": "Recommended to avoid rate limiting.",
                        "default": true
                    },
                    "apifyProxyGroups": {
                        "title": "Proxy Groups",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "RESIDENTIAL works best for PyPI scraping.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "RESIDENTIAL",
                                "DATACENTER",
                                "GOOGLE"
                            ]
                        },
                        "default": [
                            "RESIDENTIAL"
                        ]
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
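The constraints in `inputSchema` above (required `keyword`, the `sources` and `apifyProxyGroups` enums, `maxPages` between 1 and 10, and the listed defaults) can be checked locally before starting a run. The sketch below is a minimal illustration using only the Python standard library. The helper names `build_input` and `summarize_run` are hypothetical and not part of the Actor or the Apify client, and rejecting an empty `keyword` is an extra assumption, since the schema only requires the field to be a string.

```python
from typing import Any, Dict, List, Optional, Sequence

# Allowed values and defaults copied from the input schema above.
ALLOWED_SOURCES = ("pypi", "npm", "vscode", "github")
ALLOWED_PROXY_GROUPS = ("RESIDENTIAL", "DATACENTER", "GOOGLE")


def _check_choices(field: str, values: Sequence[str], allowed: Sequence[str]) -> None:
    """Enforce the schema's `enum` and `uniqueItems` rules for an array field."""
    if len(set(values)) != len(values):
        raise ValueError(f"{field} must not contain duplicates")
    unknown = [v for v in values if v not in allowed]
    if unknown:
        raise ValueError(f"{field} has unsupported values: {unknown}")


def build_input(
    keyword: str,
    sources: Optional[List[str]] = None,
    max_pages: int = 3,
    use_apify_proxy: bool = True,
    proxy_groups: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an input dict that satisfies the schema."""
    # Assumption: an empty keyword is treated as invalid (the schema only says "string").
    if not isinstance(keyword, str) or not keyword.strip():
        raise ValueError("keyword is required and must be a non-empty string")

    # Fall back to the schema defaults when a list is not supplied.
    sources = list(ALLOWED_SOURCES) if sources is None else list(sources)
    proxy_groups = ["RESIDENTIAL"] if proxy_groups is None else list(proxy_groups)
    _check_choices("sources", sources, ALLOWED_SOURCES)
    _check_choices("apifyProxyGroups", proxy_groups, ALLOWED_PROXY_GROUPS)

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int):
        raise ValueError("maxPages must be an integer")
    if not 1 <= max_pages <= 10:
        raise ValueError("maxPages must be between 1 and 10")

    return {
        "keyword": keyword,
        "sources": sources,
        "maxPages": max_pages,
        "useApifyProxy": bool(use_apify_proxy),
        "apifyProxyGroups": proxy_groups,
    }


def summarize_run(response: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: pick a few fields out of a `runsResponseSchema`-shaped dict."""
    data = response.get("data", {})
    return {
        "id": data.get("id"),
        "status": data.get("status"),
        "defaultDatasetId": data.get("defaultDatasetId"),
        "usageTotalUsd": data.get("usageTotalUsd"),
    }
```

The dict returned by `build_input("web scraping", sources=["pypi", "npm"], max_pages=2)` has the same keys as the schema's `properties` and can be passed as the run input. `summarize_run` uses `.get` throughout because the response schema does not mark any field as required.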
