# Hugging Face Papers Scraper (`parseforge/huggingface-papers-scraper`) Actor

Scrape AI and machine learning research papers from Hugging Face Papers. Get titles, abstracts, authors with affiliations, upvotes, publication dates, arXiv IDs, and community discussion counts. Search by keyword or browse daily papers.

- **URL**: https://apify.com/parseforge/huggingface-papers-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Education, Developer tools, Other
- **Stats:** 7 total users, 3 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $9.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
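
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates what a run might cost. It assumes that each returned paper counts as one billable result and that the listed starting rate of $9.00 per 1,000 results applies to your plan. The function name `estimate_cost` and its parameters are hypothetical, and your actual rate depends on your subscription tier.

```python
from decimal import Decimal

# Assumption: the listed starting rate of $9.00 per 1,000 results.
# Your actual rate may differ depending on your subscription plan.
RATE_PER_1000 = Decimal("9.00")


def estimate_cost(num_results: int, rate_per_1000: Decimal = RATE_PER_1000) -> Decimal:
    """Estimate the charge in USD for a run that returns `num_results` papers."""
    if num_results < 0:
        raise ValueError("num_results must be non-negative")
    # Scale the per-1,000 rate down to the requested count, rounded to cents.
    return (rate_per_1000 * num_results / 1000).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for n in (10, 50, 1000):
        print(f"{n} results -> ${estimate_cost(n)}")
```

Under these assumptions, a 50-paper run comes to about $0.45 and a 1,000-paper run to $9.00.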

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://raw.githubusercontent.com/ParseForge/apify-assets/main/banner.jpg)

## 📄 Hugging Face Papers Scraper

> 🚀 Scrape trending and keyword-searched AI/ML papers from Hugging Face with titles, abstracts, authors, upvotes, arXiv IDs, and GitHub repos. Returns structured data in seconds.

> 🕒 Last updated: 2026-04-23

Every day, Hugging Face Papers surfaces the most discussed machine learning research with community upvotes, author profiles, and links to code repositories. This Actor pulls that curated feed or runs keyword searches across the entire index, returning structured records with titles, abstracts, arXiv identifiers, author details, GitHub links, project pages, AI-generated keywords, and community engagement metrics.

Whether you run an AI newsletter, track a research subfield for your lab, or want to spot emerging trends before they go mainstream, this scraper saves you hours of manual browsing. Set it on a daily schedule and let it build a living archive of the papers that matter to your work.

| Target | Hugging Face Papers |
|--------|-------------------------------|
| Use Cases | Research newsletters, literature reviews, ML trend tracking, academic monitoring |

---

### 📋 What it does

- 📚 **Paper metadata.** Titles, abstracts, arXiv IDs, publication dates, and direct Hugging Face URLs for every paper.
- 👥 **Author details.** Full author lists with Hugging Face usernames and verification status included.
- ⭐ **Community engagement.** Upvote counts, comment totals, and thumbnails so you can gauge which papers resonate.
- 💻 **Code and project links.** GitHub repository URLs and project pages when authors have linked them.
- 🔍 **Two collection modes.** Search by keyword across indexed papers or grab today's trending daily feed.

Each record includes the arXiv ID, paper title, abstract, publication date, full author list with HF handles, upvote and comment counts, thumbnail image, GitHub repo link, project page, and AI-generated keywords.

> 💡 **Why it matters:** Manually checking Hugging Face Papers every day and copying metadata into a spreadsheet takes 30+ minutes. This Actor does it in seconds and delivers a clean, structured dataset ready for analysis.

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td>searchQuery</td><td>string</td><td>"transformer"</td><td>Keyword to match against paper titles and abstracts. Examples: "diffusion model", "LLM", "reinforcement learning".</td></tr>
<tr><td>mode</td><td>string</td><td>"search"</td><td>Collection mode. Use "search" for keyword search or "trending" for the daily curated feed.</td></tr>
<tr><td>maxItems</td><td>integer</td><td>10</td><td>Maximum number of papers to return. Free users are limited to 10. Paid users can request up to 1,000,000.</td></tr>
</tbody>
</table>

**Example: Search for diffusion model papers.**

```json
{
    "searchQuery": "diffusion model",
    "mode": "search",
    "maxItems": 50
}
````

**Example: Grab today's trending papers.**

```json
{
    "mode": "trending",
    "maxItems": 25
}
```

> ⚠️ **Good to Know:** Hugging Face Papers indexes new publications daily. Trending mode returns papers curated by the HF team and community for the current day. Search mode queries across all indexed papers. Results are limited by what the Hugging Face API exposes.
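
The input rules above can be checked before a run starts. The following is a minimal sketch, assuming that search mode needs a non-empty keyword and that trending mode can omit `searchQuery`, as the second example suggests. The helper name `build_input` is hypothetical; the bounds of 1 and 1,000,000 come from the input schema.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("search", "trending")  # the two modes listed in the input table
MAX_ITEMS_LIMIT = 1_000_000           # documented upper bound for maxItems


def build_input(mode: str = "search",
                search_query: Optional[str] = None,
                max_items: int = 10) -> Dict[str, Any]:
    """Build an Actor input dict, rejecting values outside the documented ranges."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    actor_input: Dict[str, Any] = {"mode": mode, "maxItems": max_items}
    if mode == "search":
        # Assumption: search mode needs a non-empty keyword.
        if not search_query or not search_query.strip():
            raise ValueError("searchQuery is required in search mode")
        actor_input["searchQuery"] = search_query.strip()
    return actor_input
```

Calling `build_input("trending", max_items=25)` yields the same object as the trending example above, while an unknown mode or an out-of-range `max_items` fails locally instead of spending a run.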

***

### 📊 Output

Each record contains **15+ fields**. Download as CSV, Excel, JSON, or XML.

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🆔 arxivId | string | `"2404.12345"` |
| 📋 title | string | `"Efficient Attention for Long-Context Language Models"` |
| 🔗 url | string | `"https://huggingface.co/papers/2404.12345"` |
| 🔗 arxivUrl | string | `"https://arxiv.org/abs/2404.12345"` |
| 📅 publishedAt | string | `"2026-04-09"` |
| ⬆️ upvotes | integer | `187` |
| 💬 numComments | integer | `12` |
| 🔢 numAuthors | integer | `4` |
| 👤 firstAuthor | string | `"Jane Smith"` |
| 👥 authors | array | `[{"name": "Jane Smith", "hfUser": "jsmith", "verified": true}]` |
| 📝 summary | string | `"We introduce a novel attention mechanism..."` |
| 💻 githubRepo | string | `"https://github.com/example/long-attention"` |
| 🌐 projectPage | string | `"https://example.github.io/long-attention"` |
| 🏷️ aiKeywords | array | `["attention", "long-context", "efficiency"]` |
| 🤖 aiSummary | string | `"A distributed training framework for sparse MoE models with near-linear GPU scaling."` |
| 🖼️ thumbnail | string | `"https://cdn-thumbnails.huggingface.co/..."` |
| 🕐 scrapedAt | string | `"2026-04-10T12:00:00.000Z"` |
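
If you consume records in Python, a small typed wrapper can make the schema easier to work with. This is a sketch only: the `Paper` class and `parse_paper` function are hypothetical names, it covers a subset of the fields, and it assumes `publishedAt` is a `YYYY-MM-DD` string and `scrapedAt` an ISO 8601 timestamp ending in `Z`, as in the examples in the table.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Any, Dict, List, Optional


@dataclass
class Paper:
    arxiv_id: str
    title: str
    published_at: date
    upvotes: int
    num_comments: int
    authors: List[Dict[str, Any]] = field(default_factory=list)
    github_repo: Optional[str] = None
    ai_keywords: List[str] = field(default_factory=list)
    scraped_at: Optional[datetime] = None


def parse_paper(record: Dict[str, Any]) -> Paper:
    """Convert one raw dataset record into a typed Paper."""
    scraped = record.get("scrapedAt")
    return Paper(
        arxiv_id=record["arxivId"],
        title=record["title"],
        published_at=date.fromisoformat(record["publishedAt"]),
        upvotes=int(record.get("upvotes") or 0),
        num_comments=int(record.get("numComments") or 0),
        authors=list(record.get("authors") or []),
        github_repo=record.get("githubRepo"),
        ai_keywords=list(record.get("aiKeywords") or []),
        # Swap the trailing "Z" so fromisoformat also works before Python 3.11.
        scraped_at=datetime.fromisoformat(scraped.replace("Z", "+00:00")) if scraped else None,
    )
```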

#### 📦 Sample records

<details>
<summary><strong>📚 Trending paper with high upvotes</strong></summary>

```json
{
    "arxivId": "2404.18901",
    "title": "Scaling Sparse Mixture-of-Experts to 128 GPUs",
    "url": "https://huggingface.co/papers/2404.18901",
    "arxivUrl": "https://arxiv.org/abs/2404.18901",
    "publishedAt": "2026-04-08",
    "upvotes": 342,
    "numComments": 28,
    "numAuthors": 4,
    "firstAuthor": "Wei Chen",
    "authors": [
        {"name": "Wei Chen", "hfUser": "weichen", "verified": true},
        {"name": "Alex Kim", "hfUser": "alexk", "verified": false}
    ],
    "summary": "We present a distributed training framework for sparse MoE models that achieves near-linear scaling across 128 GPUs with minimal communication overhead.",
    "thumbnail": "https://cdn-thumbnails.huggingface.co/social/papers/2404.18901.png",
    "githubRepo": "https://github.com/weichen-lab/sparse-moe-128",
    "projectPage": null,
    "aiKeywords": ["mixture-of-experts", "distributed training", "scalability"],
    "aiSummary": "A distributed training framework for sparse MoE models with near-linear GPU scaling.",
    "scrapedAt": "2026-04-10T14:30:00.000Z"
}
```

</details>

<details>
<summary><strong>🔍 Search result for "diffusion model"</strong></summary>

```json
{
    "arxivId": "2404.15677",
    "title": "Fast Sampling for Text-to-Image Diffusion with Progressive Distillation",
    "url": "https://huggingface.co/papers/2404.15677",
    "arxivUrl": "https://arxiv.org/abs/2404.15677",
    "publishedAt": "2026-04-06",
    "upvotes": 89,
    "numComments": 5,
    "numAuthors": 7,
    "firstAuthor": "Maria Lopez",
    "authors": [
        {"name": "Maria Lopez", "hfUser": "mlopez", "verified": true}
    ],
    "summary": "We propose a progressive distillation technique that reduces the number of sampling steps from 50 to 4 while maintaining image quality.",
    "thumbnail": "https://cdn-thumbnails.huggingface.co/social/papers/2404.15677.png",
    "githubRepo": "https://github.com/mlopez/fast-diffusion-distill",
    "projectPage": "https://fast-diffusion.github.io",
    "aiKeywords": ["diffusion models", "distillation", "text-to-image"],
    "aiSummary": "Progressive distillation for fast sampling in text-to-image diffusion models.",
    "scrapedAt": "2026-04-10T14:32:00.000Z"
}
```

</details>

<details>
<summary><strong>📄 Paper without linked code</strong></summary>

```json
{
    "arxivId": "2404.11234",
    "title": "A Survey of Multimodal Reasoning in Large Language Models",
    "url": "https://huggingface.co/papers/2404.11234",
    "arxivUrl": "https://arxiv.org/abs/2404.11234",
    "publishedAt": "2026-04-04",
    "upvotes": 56,
    "numComments": 3,
    "numAuthors": 2,
    "firstAuthor": "Priya Gupta",
    "authors": [
        {"name": "Priya Gupta", "hfUser": null, "verified": false},
        {"name": "Raj Patel", "hfUser": "rajpatel", "verified": true}
    ],
    "summary": "This survey categorizes 150+ papers on multimodal reasoning, covering vision-language, audio-text, and cross-modal transfer learning approaches.",
    "thumbnail": null,
    "githubRepo": null,
    "projectPage": null,
    "aiKeywords": ["survey", "multimodal", "reasoning", "LLM"],
    "aiSummary": null,
    "scrapedAt": "2026-04-10T14:35:00.000Z"
}
```

</details>
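
As the samples show, `githubRepo`, `projectPage`, `thumbnail`, `aiSummary`, and an author's `hfUser` can be `null`, so downstream code should not assume they are present. A minimal sketch of null-safe filtering follows; the helper names `papers_with_code` and `verified_handles` are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def papers_with_code(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep records that link a GitHub repo, most upvoted first."""
    linked = [r for r in records if r.get("githubRepo")]
    return sorted(linked, key=lambda r: r.get("upvotes") or 0, reverse=True)


def verified_handles(record: Dict[str, Any]) -> List[str]:
    """Return Hugging Face usernames of verified authors, skipping null handles."""
    return [
        a["hfUser"]
        for a in record.get("authors") or []
        if a.get("verified") and a.get("hfUser")
    ]
```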

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 📚 | **Two collection modes.** Search by keyword or pull the daily trending feed. |
| ⚡ | **Fast results.** Papers arrive in seconds, not minutes of manual browsing. |
| 👥 | **Author metadata.** Hugging Face usernames and verification status for every author. |
| 💻 | **Code links included.** GitHub repos and project pages extracted automatically. |
| 🏷️ | **AI keywords.** Machine-generated topic tags for easier filtering and categorization. |
| 📅 | **Schedule-ready.** Set it on a daily cron to build a rolling archive of ML research. |
| 📊 | **Multiple export formats.** Download results as CSV, Excel, JSON, or XML. |

> Hugging Face Papers features hundreds of new AI/ML papers every week, curated by the community and surfaced through upvotes. Staying current manually is a full-time job.
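
The AI keywords lend themselves to simple aggregation. As a hedged sketch of how a dataset might be summarized by topic, the hypothetical `keyword_upvotes` function below sums upvotes per keyword. Lower-casing the tags is a choice made for this example, not something the Actor does.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def keyword_upvotes(records: Iterable[Dict[str, Any]]) -> List[Tuple[str, int]]:
    """Sum upvotes per AI keyword, normalised to lower case, highest first."""
    totals: Counter = Counter()
    for record in records:
        upvotes = record.get("upvotes") or 0
        # Deduplicate within a record so a repeated tag is not counted twice.
        for keyword in {k.strip().lower() for k in record.get("aiKeywords") or []}:
            totals[keyword] += upvotes
    # Sort by total descending, then alphabetically for stable output.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
```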

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Setup |
|---|---|---|---|---|
| **⭐ Hugging Face Papers Scraper** *(this Actor)* | $5 free credit, then pay-per-use | All HF indexed papers | **Live per run** | ⚡ 2 min |
| Manual browsing | Free | Limited by time | Manual daily checks | 🕐 30 min/day |
| Official API integration | Free | Full access | Per request | 🔧 1-2 hours |
| Third-party data providers | $50-500/mo | Varies | Weekly or monthly | 📋 30 min |

Pick this Actor when you want structured, schedule-ready paper data without writing API integration code yourself.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the Hugging Face Papers Scraper page on the Apify Store.
3. 🎯 **Set input.** Choose a keyword and mode (search or trending), then set your max items.
4. 🚀 **Run it.** Click **Start** and let the Actor collect your data.
5. 📥 **Download.** Grab your results in the **Dataset** tab as CSV, Excel, JSON, or XML.

> ⏱️ Total time from signup to downloaded dataset: **3-5 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 📬 Research Newsletters

- Auto-curate weekly digests of trending ML papers
- Filter by keyword to match your audience's interests
- Include upvote counts to highlight community favorites
- Link directly to arXiv and GitHub repos

</td>
<td width="50%" valign="top">

#### 🧠 Academic Research

- Monitor new publications in your subfield daily
- Build literature review datasets without manual searches
- Track author output and collaboration patterns
- Export to spreadsheets for bibliometric analysis

</td>
</tr>
<tr>
<td width="50%" valign="top">

#### 📊 Trend Analysis

- Track which ML topics gain upvotes over time
- Spot emerging research areas before they peak
- Compare engagement across diffusion, LLM, and RL papers
- Build time-series datasets of publication volume

</td>
<td width="50%" valign="top">

#### 💼 Talent Scouting

- Identify active researchers by watching trending authors
- Find engineers who open-source their paper code
- Monitor verified Hugging Face contributors
- Build prospect lists for recruiting outreach

</td>
</tr>
</table>
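
To make the newsletter use case concrete, here is a minimal sketch that turns a list of records into a Markdown digest ranked by upvotes. The function name `format_digest` and the exact line layout are illustrative assumptions; only the field names come from the output schema.

```python
from typing import Any, Dict, Iterable, List


def format_digest(records: Iterable[Dict[str, Any]], top_n: int = 5) -> str:
    """Render the most upvoted papers as a Markdown bullet list."""
    ranked = sorted(records, key=lambda r: r.get("upvotes") or 0, reverse=True)
    lines: List[str] = []
    for paper in ranked[:top_n]:
        line = f"- [{paper['title']}]({paper['arxivUrl']}) ({paper.get('upvotes') or 0} upvotes)"
        # Only add a code link when the authors linked a repository.
        if paper.get("githubRepo"):
            line += f" | [code]({paper['githubRepo']})"
        lines.append(line)
    return "\n".join(lines)
```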

***

### 🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Empirical datasets for papers, thesis work, and coursework
- Longitudinal studies tracking changes across snapshots
- Reproducible research with cited, versioned data pulls
- Classroom exercises on data analysis and ethical scraping

</td>
<td width="50%">

#### 🎨 Personal and creative

- Side projects, portfolio demos, and indie app launches
- Data visualizations, dashboards, and infographics
- Content research for bloggers, YouTubers, and podcasters
- Hobbyist collections and personal trackers

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Transparency reporting and accountability projects
- Advocacy campaigns backed by public-interest data
- Community-run databases for local issues
- Investigative journalism on public records

</td>
<td width="50%">

#### 🧪 Experimentation

- Prototype AI and machine-learning pipelines with real data
- Validate product-market hypotheses before engineering spend
- Train small domain-specific models on niche corpora
- Test dashboard concepts with live input

</td>
</tr>
</table>

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge Actor in the AI of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Hugging%20Face%20Papers%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Hugging%20Face%20Papers%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Hugging%20Face%20Papers%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Hugging%20Face%20Papers%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

### ❓ Frequently Asked Questions

<details>
<summary><b>💳 Do I need a paid Apify plan to run this Actor?</b></summary>

No. You can start right now on the free Apify plan, which includes **$5 in free monthly credit**. That is enough to run this Actor several times and explore the output before committing to anything. Paid plans unlock higher limits, more concurrent runs, and larger datasets. [Create a free Apify account here](https://console.apify.com/sign-up?fpr=vmoqkp) to get started.

</details>

<details>
<summary><b>🚨 What happens if my run fails or returns no results?</b></summary>

Failed runs are not charged. If the source site changes, proxies get rate-limited, or a specific input matches nothing, re-run the Actor or open our [contact form](https://tally.so/r/BzdKgA) and we will investigate. You can also check the run log in the Apify console to see why the run stopped.

</details>

<details>
<summary><b>📏 How many items can I scrape per run?</b></summary>

Free users are limited to **10 items per run** so you can preview the output and confirm the Actor works for your use case. Paid users can raise `maxItems` up to **1,000,000** per run. [Upgrade here](https://console.apify.com/sign-up?fpr=vmoqkp) if you need full scale.

</details>

<details>
<summary><b>🕒 How fresh is the data?</b></summary>

Every run fetches live data at the moment of execution. There is no cache or delay: the records you get reflect what the source returned at that moment. Schedule the Actor to maintain a rolling snapshot of the data you need.

</details>

<details>
<summary><b>🧑‍💻 Can I call this Actor from my own code?</b></summary>

Yes. Apify exposes every Actor as a REST endpoint and ships official API clients for [Node.js](https://docs.apify.com/api/client/js) and [Python](https://docs.apify.com/api/client/python). You can start a run, read the dataset, and handle webhooks from your own app in a few lines. All you need is your Apify API token.
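
For a rough idea of what the REST call looks like without any client library, the sketch below uses only the Python standard library. The base URL and the `run-sync-get-dataset-items` path are taken from the OpenAPI specification in this document. Sending the token in an `Authorization: Bearer` header is an assumption of this example (the specification lists a `token` query parameter), and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Base URL and path are taken from the OpenAPI specification in this document.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~huggingface-papers-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a POST that runs the Actor and returns its items."""
    return urllib.request.Request(
        API_BASE + ACTOR_PATH,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth instead of the `token` query parameter.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_papers(token: str, actor_input: Dict[str, Any],
                 timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and decode the JSON array of dataset items."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping `build_request` separate from `fetch_papers` lets you inspect or log the request before anything is sent. The synchronous endpoint waits for the run to finish, so longer runs may be better served by starting a run and polling its dataset.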

</details>

<details>
<summary><b>📤 How do I export the data?</b></summary>

Every Apify dataset can be downloaded in one click from the console as CSV, JSON, JSONL, Excel, HTML, XML, or RSS. You can also pull results programmatically via the [Apify API](https://docs.apify.com/api/v2) or stream them into BigQuery, S3, and other destinations through built-in integrations.

</details>

<details>
<summary><b>📅 Can I schedule the Actor to run automatically?</b></summary>

Yes. Use the Apify scheduler to run the Actor on any cadence, from hourly to monthly. Results are saved to your dataset and can be delivered to webhooks, email, Slack, cloud storage, or automation tools such as Zapier and Make.

</details>

***

### 🔌 Automating Hugging Face Papers Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

- 🟢 **Node.js.** Install the `apify-client` npm package.
- 🐍 **Python.** Install the `apify-client` PyPI package.
- 📚 See the [Apify API documentation](https://docs.apify.com/api/v2) for full details.

The [Apify Schedules feature](https://docs.apify.com/platform/schedules) lets you trigger this Actor on any cron interval. Set a daily run in trending mode and never miss the papers the community is talking about.
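
When a scheduled run fires every day, the same paper can show up in several snapshots with different upvote counts. One way to fold daily results into a single archive is sketched below, keyed by `arxivId`. The `merge_snapshot` function and the `firstSeenAt` field are hypothetical additions for this example, not part of the Actor's output.

```python
from typing import Any, Dict, Iterable


def merge_snapshot(archive: Dict[str, Dict[str, Any]],
                   snapshot: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge one run's records into an archive keyed by arXiv ID.

    A paper seen before keeps its original `firstSeenAt` (a hypothetical
    field added by this sketch) while its other fields are refreshed.
    """
    merged = dict(archive)  # leave the caller's archive untouched
    for record in snapshot:
        key = record["arxivId"]
        previous = merged.get(key)
        first_seen = previous["firstSeenAt"] if previous else record.get("scrapedAt")
        merged[key] = {**record, "firstSeenAt": first_seen}
    return merged
```

Persisting the returned dictionary between runs (for example as a JSON file) gives each paper its latest engagement numbers alongside the date it first appeared.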

### 🔌 Integrate with any app

Hugging Face Papers Scraper connects to any cloud service via [Apify integrations](https://apify.com/integrations):

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate multi-step workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect with 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Get run notifications
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Pipe data into your warehouse
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger runs from commits
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes.

***

### 🔗 Recommended Actors

- [**🤖 Hugging Face Model Scraper**](https://apify.com/parseforge/hugging-face-model-scraper) - Collect AI model metadata, downloads, and tags from the HF Hub
- [**🍎 Apple App Store Scraper**](https://apify.com/parseforge/apple-app-store-iphone-scraper) - Scrape iPhone app listings, ratings, and reviews
- [**📰 PR Newswire Scraper**](https://apify.com/parseforge/pr-newswire-scraper) - Collect press releases and corporate news
- [**🏪 AWS Marketplace Scraper**](https://apify.com/parseforge/aws-marketplace-scraper) - Extract AWS product listings and pricing
- [**🔗 Stripe App Marketplace Scraper**](https://apify.com/parseforge/stripe-marketplace-scraper) - Scrape Stripe app listings and integrations

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more data scrapers and tools.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new scraper, propose a custom data project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Hugging Face or arXiv. All trademarks mentioned are the property of their respective owners. Only publicly available data is collected.

# Actor input Schema

## `maxItems` (type: `integer`):

Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000

## `searchQuery` (type: `string`):

Search papers by keyword. Example: 'transformer', 'diffusion model', 'LLM'.

## `mode` (type: `string`):

Collection mode: `search` for keyword search or `trending` for the daily curated feed. Defaults to `search`.

## Actor input object example

```json
{
  "maxItems": 10,
  "searchQuery": "transformer",
  "mode": "search"
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10,
    "searchQuery": "transformer"
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/huggingface-papers-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 10,
    "searchQuery": "transformer",
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/huggingface-papers-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10,
  "searchQuery": "transformer"
}' |
apify call parseforge/huggingface-papers-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/huggingface-papers-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hugging Face Papers Scraper",
        "description": "Scrape AI and machine learning research papers from Hugging Face Papers. Get titles, abstracts, authors with affiliations, upvotes, publication dates, ArXiv IDs, and community discussion counts. Search by keyword or browse daily papers.",
        "version": "1.0",
        "x-build-id": "a34yyp6cEa7Z6iR56"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~huggingface-papers-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-huggingface-papers-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~huggingface-papers-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-huggingface-papers-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~huggingface-papers-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-huggingface-papers-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search papers by keyword. Example: 'transformer', 'diffusion model', 'LLM'."
                    },
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "search",
                            "trending"
                        ],
                        "type": "string",
                        "description": "Search or trending.",
                        "default": "search"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
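
The definition above can be exercised without calling the API. The following is a minimal sketch, not the Actor's official client code: it builds an input object from the `searchQuery` and `mode` properties and pulls the most commonly needed fields out of a response shaped like `runsResponseSchema`. The helper names `build_input` and `summarize_run`, and every ID in `SAMPLE_RESPONSE`, are hypothetical and used only for illustration. Whether `searchQuery` is required in `search` mode is not visible in this part of the schema, so the sketch treats it as optional.

```python
# Allowed values copied from the "mode" enum in the input schema.
ALLOWED_MODES = ("search", "trending")


def build_input(search_query=None, mode="search"):
    """Build an Actor input dict (hypothetical helper, not part of any SDK)."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    run_input = {"mode": mode}
    # Assumption: searchQuery is optional, so it is only sent when provided.
    if search_query is not None:
        run_input["searchQuery"] = search_query
    return run_input


def summarize_run(response):
    """Extract a few fields from a response shaped like runsResponseSchema."""
    data = response.get("data", {})
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        "dataset_id": data.get("defaultDatasetId"),
        "usage_total_usd": data.get("usageTotalUsd"),
    }


# Made-up response for illustration; the IDs are placeholders, not real values.
SAMPLE_RESPONSE = {
    "data": {
        "id": "run-123",
        "actId": "actor-abc",
        "status": "READY",
        "defaultDatasetId": "dataset-456",
        "usageTotalUsd": 0.00005,
    }
}

if __name__ == "__main__":
    print(build_input("diffusion model"))
    print(summarize_run(SAMPLE_RESPONSE))
```

The `dataset_id` value is usually the field of interest, since `defaultDatasetId` identifies the dataset where a run stores its results. Reading fields with `.get()` keeps the sketch tolerant of properties that a given response omits.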
