# Semantic Scholar Scraper (`parseforge/semantic-scholar-scraper`) Actor

Extract detailed academic paper data from Semantic Scholar, including abstracts, citations, authors, and publication details. Ideal for researchers, academics, and analysts who need structured scholarly data for literature reviews, research workflows, and large-scale academic analysis.

- **URL**: https://apify.com/parseforge/semantic-scholar-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Automation, Developer tools, Other
- **Stats:** 36 total users, 2 monthly users, 100.0% runs succeeded, 2 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 📚 Semantic Scholar Scraper

> 🚀 **Collect academic paper data from Semantic Scholar in minutes.** Search by keyword, author, venue, or year range. Export titles, abstracts, citations, authors, and PDF links. No coding, no API key required.

> 🕒 **Last updated:** 2026-04-23 · **📊 20+ fields** per paper · **🔍 6 search filters** · **📄 PDF availability** · **🚫 No auth** required

The **Semantic Scholar Scraper** collects academic paper data from Semantic Scholar, returning **20+ fields per paper**: title, abstract, authors, citation count, reference count, year, venue, DOI, PDF URL, and paper URL. Filter by keyword, author, venue, year range, and PDF availability. Runs support up to 1,000,000 papers on a paid plan.

Semantic Scholar indexes over 200 million academic papers. This Actor queries its database with 6 filters and returns structured results ready for literature reviews, citation analysis, or research dashboards.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Academic researchers, data scientists, R&D teams, librarians, science journalists, bibliometric analysts | Literature reviews, citation analysis, research trend tracking, author profiling, venue benchmarking |

---

### 📋 What the Semantic Scholar Scraper does

Six search filters:

- 🔍 **Keyword search.** Free-text search across titles and abstracts.
- 🔗 **URL mode.** Paste a direct Semantic Scholar search URL.
- 👤 **Author filter.** Search by author name.
- 📅 **Year range.** Min and max publication year.
- 📄 **PDF filter.** Only papers with available PDFs.
- 🏛️ **Venue filter.** Conference or journal name.

Each paper record includes title, abstract, authors (with IDs), citation count, reference count, year, venue, DOI, fields of study, PDF URL, and Semantic Scholar URL.

> 💡 **Why it matters:** searching for papers one at a time on Semantic Scholar or Google Scholar is slow and doesn't support bulk export. This Actor downloads structured academic data at scale for systematic reviews, bibliometric analysis, or research intelligence.

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td>searchQuery</td><td>string</td><td>""</td><td>Keyword search across titles and abstracts. Required if startUrl is not provided.</td></tr>
<tr><td>startUrl</td><td>string</td><td>""</td><td>Direct Semantic Scholar search URL. Cannot be combined with searchQuery or the other filters.</td></tr>
<tr><td>author</td><td>string</td><td>""</td><td>Author name filter. Treated as a suggestion by the Semantic Scholar API, not a strict filter.</td></tr>
<tr><td>yearMin</td><td>integer</td><td>-</td><td>Minimum publication year (inclusive).</td></tr>
<tr><td>yearMax</td><td>integer</td><td>-</td><td>Maximum publication year (inclusive).</td></tr>
<tr><td>hasPdf</td><td>boolean</td><td>false</td><td>Only papers with an open access PDF available.</td></tr>
<tr><td>venues</td><td>string</td><td>""</td><td>Journal or conference name filter. Treated as a suggestion by the Semantic Scholar API, not a strict filter.</td></tr>
<tr><td>maxItems</td><td>integer</td><td>10</td><td>Maximum number of papers. Free: 10 per run. Paid: up to 1,000,000.</td></tr>
</tbody>
</table>

**Example: recent AI papers with PDFs available.**

```json
{
    "searchQuery": "large language models",
    "yearMin": 2024,
    "hasPdf": true,
    "maxItems": 100
}
````

**Example: papers by a specific author.**

```json
{
    "searchQuery": "deep learning",
    "author": "Yoshua Bengio",
    "maxItems": 50
}
```

***

### 📊 Output

Each paper record contains **20+ fields**. Download the dataset as CSV, Excel, JSON, or XML.

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 📝 title | string | `"Attention Is All You Need"` |
| 📄 abstract | string | `"We propose a new simple network..."` |
| 👤 authors | array | `[{ name, authorId }]` |
| 📊 citationCount | number | `95000` |
| 📚 referenceCount | number | `38` |
| 📅 year | number | `2017` |
| 🏛️ venue | string | `"NeurIPS"` |
| 🔗 doi | string \| null | `"10.5555/3295222.3295349"` |
| 📂 fieldsOfStudy | array | `["Computer Science"]` |
| 📄 pdfUrl | string \| null | `"https://arxiv.org/pdf/1706.03762"` |
| 🔗 url | string | `"https://www.semanticscholar.org/paper/..."` |
| 🕒 scrapedAt | string (ISO 8601) | `"2026-04-16T00:00:00.000Z"` |
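
When consuming records in typed code, it can help to map the camelCase dataset keys onto a small data class. The sketch below covers only the fields listed in the schema table; the `Author` and `Paper` classes and the `from_item` helper are hypothetical names for illustration and are not part of the Actor or the Apify client.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Author:
    name: str
    author_id: Optional[str] = None


@dataclass
class Paper:
    title: str
    url: str
    abstract: Optional[str] = None
    authors: list[Author] = field(default_factory=list)
    citation_count: int = 0
    reference_count: int = 0
    year: Optional[int] = None
    venue: Optional[str] = None
    doi: Optional[str] = None         # can be null in the dataset
    fields_of_study: list[str] = field(default_factory=list)
    pdf_url: Optional[str] = None     # null when no PDF is available
    scraped_at: Optional[str] = None  # ISO 8601 timestamp, kept as a string

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "Paper":
        """Map one dataset item (camelCase keys) to a Paper."""
        return cls(
            title=item.get("title") or "",
            url=item.get("url") or "",
            abstract=item.get("abstract"),
            authors=[
                Author(name=a.get("name") or "", author_id=a.get("authorId"))
                for a in item.get("authors") or []
            ],
            citation_count=item.get("citationCount") or 0,
            reference_count=item.get("referenceCount") or 0,
            year=item.get("year"),
            venue=item.get("venue"),
            doi=item.get("doi"),
            fields_of_study=list(item.get("fieldsOfStudy") or []),
            pdf_url=item.get("pdfUrl"),
            scraped_at=item.get("scrapedAt"),
        )
```

Missing or null values fall back to defaults rather than raising, which is a design choice for tolerant ingestion; stricter pipelines may prefer to fail on incomplete records.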

#### 📦 Sample records

<details>
<summary><strong>📄 Highly cited paper with PDF</strong></summary>

```json
{
    "title": "Attention Is All You Need",
    "abstract": "The dominant sequence transduction models are based on complex recurrent...",
    "authors": [{ "name": "Ashish Vaswani", "authorId": "1234" }, { "name": "Noam Shazeer", "authorId": "5678" }],
    "citationCount": 95000,
    "referenceCount": 38,
    "year": 2017,
    "venue": "NeurIPS",
    "doi": "10.5555/3295222.3295349",
    "fieldsOfStudy": ["Computer Science"],
    "pdfUrl": "https://arxiv.org/pdf/1706.03762",
    "url": "https://www.semanticscholar.org/paper/Attention-Is-All-You-Need",
    "scrapedAt": "2026-04-16T00:00:00.000Z"
}
```

</details>

<details>
<summary><strong>📚 Recent paper with sparse citations</strong></summary>

```json
{
    "title": "A Survey of LLM Reasoning",
    "abstract": "This paper surveys recent advances in reasoning...",
    "authors": [{ "name": "Jane Researcher", "authorId": "9999" }],
    "citationCount": 12,
    "referenceCount": 85,
    "year": 2026,
    "venue": "ACL",
    "doi": null,
    "fieldsOfStudy": ["Computer Science", "Linguistics"],
    "pdfUrl": "https://arxiv.org/pdf/2602.12345",
    "url": "https://www.semanticscholar.org/paper/A-Survey-of-LLM-Reasoning",
    "scrapedAt": "2026-04-16T00:00:00.000Z"
}
```

</details>
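
As a rough illustration of working with records shaped like the samples above, the following sketch computes a few descriptive statistics. The `summarize` function is a hypothetical helper, and the input is a trimmed copy of the two sample records rather than real run output.

```python
from collections import Counter
from typing import Any


def summarize(items: list[dict[str, Any]]) -> dict[str, Any]:
    """Compute a few descriptive statistics over scraped paper records."""
    ranked = sorted(items, key=lambda p: p.get("citationCount") or 0, reverse=True)
    return {
        "total": len(items),
        "with_pdf": sum(1 for p in items if p.get("pdfUrl")),
        "missing_doi": sum(1 for p in items if not p.get("doi")),
        "papers_per_year": dict(Counter(p["year"] for p in items if p.get("year"))),
        "most_cited": ranked[0].get("title") if ranked else None,
    }


# Trimmed versions of the two sample records shown above.
sample = [
    {"title": "Attention Is All You Need", "citationCount": 95000, "year": 2017,
     "doi": "10.5555/3295222.3295349", "pdfUrl": "https://arxiv.org/pdf/1706.03762"},
    {"title": "A Survey of LLM Reasoning", "citationCount": 12, "year": 2026,
     "doi": None, "pdfUrl": "https://arxiv.org/pdf/2602.12345"},
]
print(summarize(sample))
```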

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 📚 | **200M+ papers indexed.** Full Semantic Scholar database. |
| 🔍 | **6 search filters.** Keyword, author, year, venue, PDF, and URL. |
| 📊 | **Citation and reference counts.** Quantitative impact metrics. |
| 📄 | **PDF links.** Direct download URLs when available. |
| 👤 | **Author profiles.** Name and Semantic Scholar ID per author. |
| ⚡ | **Scalable.** From single-paper lookups to full topic sweeps. |
| 🚫 | **No authentication.** No API key needed. |

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| **⭐ Semantic Scholar Scraper** *(this Actor)* | $5 free credit, then pay-per-use | 200M+ papers | **Live per run** | keyword, author, year, venue, PDF | ⚡ 2 min |
| Semantic Scholar API (direct) | Free with rate limits | Full | Real-time | Many | ⏳ Hours (API setup) |
| Google Scholar | Free | Broad | Manual | Limited | 🕒 Per search |
| Paid academic databases | $1,000-50,000/year | Multi-source | Varies | Many | 🐢 Weeks |

Pick this Actor when you want academic paper metadata on demand, with filters, without writing API client code.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the Semantic Scholar Scraper page on the Apify Store.
3. 🎯 **Set input.** Enter a search query, author, or year range.
4. 🚀 **Run it.** Click **Start**.
5. 📥 **Download.** Grab results in the **Dataset** tab.

> ⏱️ Total time: **3-5 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 📊 Literature Reviews & Bibliometrics

- Build systematic review datasets
- Analyze citation networks by topic
- Track research trends over time
- Compare venue impact by field

</td>
<td width="50%" valign="top">

#### 🏢 R&D & Industry Research

- Monitor competitor publications
- Track emerging technologies by keyword
- Build prior-art search databases
- Identify expert authors by citation count

</td>
</tr>
</table>

***

### 🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Empirical datasets for papers, thesis work, and coursework
- Longitudinal studies tracking changes across snapshots
- Reproducible research with cited, versioned data pulls
- Classroom exercises on data analysis and ethical scraping

</td>
<td width="50%">

#### 🎨 Personal and creative

- Side projects, portfolio demos, and indie app launches
- Data visualizations, dashboards, and infographics
- Content research for bloggers, YouTubers, and podcasters
- Hobbyist collections and personal trackers

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Transparency reporting and accountability projects
- Advocacy campaigns backed by public-interest data
- Community-run databases for local issues
- Investigative journalism on public records

</td>
<td width="50%">

#### 🧪 Experimentation

- Prototype AI and machine-learning pipelines with real data
- Validate product-market hypotheses before engineering spend
- Train small domain-specific models on niche corpora
- Test dashboard concepts with live input

</td>
</tr>
</table>

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge Actor in the AI assistant of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Semantic%20Scholar%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Semantic%20Scholar%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Semantic%20Scholar%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Semantic%20Scholar%20Scraper%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

### ❓ Frequently Asked Questions

<details>
<summary><b>💳 Do I need a paid Apify plan to run this Actor?</b></summary>

No. You can start right now on the free Apify plan, which includes **$5 in free monthly credit**. That is enough to run this Actor several times and explore the output before committing to anything. Paid plans unlock higher limits, more concurrent runs, and larger datasets. [Create a free Apify account here](https://console.apify.com/sign-up?fpr=vmoqkp) to get started.

</details>

<details>
<summary><b>🚨 What happens if my run fails or returns no results?</b></summary>

Failed runs are not charged. If the source site changes, proxies get rate-limited, or a specific input matches nothing, re-run the Actor or open our [contact form](https://tally.so/r/BzdKgA) and we will investigate. You can also check the run log in the Apify Console to see why the run stopped.

</details>

<details>
<summary><b>📏 How many items can I scrape per run?</b></summary>

Free users are limited to **10 items per run** so you can preview the output and confirm the Actor works for your use case. Paid users can raise `maxItems` up to **1,000,000** per run. [Upgrade here](https://console.apify.com/sign-up?fpr=vmoqkp) if you need full scale.

</details>

<details>
<summary><b>🕒 How fresh is the data?</b></summary>

Every run fetches live data at the moment of execution. There is no cache or delay: the records you get reflect what the source returned at that moment. Schedule the Actor to maintain a rolling snapshot of the data you need.
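
One way to keep a rolling snapshot is to merge the items from successive runs and retain only the most recently scraped record per paper. The sketch below assumes that `url` identifies a paper uniquely and that `scrapedAt` is always present; both are assumptions, and `merge_snapshots` is a hypothetical helper.

```python
from datetime import datetime
from typing import Any, Iterable


def _scraped_at(item: dict[str, Any]) -> datetime:
    """Parse scrapedAt; the trailing 'Z' is rewritten so older Pythons accept it."""
    return datetime.fromisoformat(item["scrapedAt"].replace("Z", "+00:00"))


def merge_snapshots(*snapshots: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge runs, keeping the most recently scraped record per paper URL."""
    latest: dict[str, dict[str, Any]] = {}
    for snapshot in snapshots:
        for item in snapshot:
            key = item["url"]  # assumption: the paper URL is a stable identifier
            if key not in latest or _scraped_at(item) >= _scraped_at(latest[key]):
                latest[key] = item
    return list(latest.values())
```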

</details>

<details>
<summary><b>🧑‍💻 Can I call this Actor from my own code?</b></summary>

Yes. Apify exposes every Actor as a REST endpoint and ships official API clients for [Node.js](https://docs.apify.com/api/client/js) and [Python](https://docs.apify.com/api/client/python). You can start a run, read the dataset, and handle webhooks from your own app in a few lines. All you need is your Apify API token.
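
For projects that cannot add a dependency, the REST endpoint can be called with the Python standard library alone. This is a minimal sketch that targets the `run-sync-get-dataset-items` path listed in this Actor's OpenAPI specification; `build_run_request` and `run_actor` are hypothetical helper names, and error handling and timeouts are omitted.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "parseforge~semantic-scholar-scraper"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, run_input: dict) -> list:
    """Send the request and decode the returned dataset items (a JSON array)."""
    with urllib.request.urlopen(build_run_request(token, run_input)) as response:
        return json.load(response)
```

Calling `run_actor("<YOUR_API_TOKEN>", {"searchQuery": "machine learning", "maxItems": 10})` blocks until the run finishes, so long runs are usually better served by the asynchronous `/runs` endpoint or the official clients.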

</details>

<details>
<summary><b>📤 How do I export the data?</b></summary>

Every Apify dataset can be downloaded in one click from the console as CSV, JSON, JSONL, Excel, HTML, XML, or RSS. You can also pull results programmatically via the [Apify API](https://docs.apify.com/api/v2) or stream them into BigQuery, S3, and other destinations through built-in integrations.
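
Programmatic export amounts to requesting the dataset items endpoint with a `format` query parameter. The sketch below only builds the download URL; `dataset_export_url` is a hypothetical helper, and the dataset ID would come from a finished run (for example `defaultDatasetId`).

```python
from typing import Optional
from urllib.parse import urlencode

# Export formats accepted by the dataset items endpoint.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "csv", token: Optional[str] = None) -> str:
    """Return the URL that downloads a dataset's items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    params = {"format": fmt}
    if token:
        params["token"] = token  # needed only when the dataset is not publicly readable
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```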

</details>

<details>
<summary><b>📅 Can I schedule the Actor to run automatically?</b></summary>

Yes. Use the Apify scheduler to run the Actor on any cadence, from hourly to monthly. Results are saved to your dataset and can be delivered to webhooks, email, Slack, cloud storage, or automation tools such as Zapier and Make.

</details>

***

### 🔌 Automating Semantic Scholar Scraper

- 🟢 **Node.js.** Install the apify-client NPM package.
- 🐍 **Python.** Use the apify-client PyPI package.
- 📚 See the [Apify API documentation](https://docs.apify.com/api/v2) for full details.

### 🔌 Integrate with any app

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Get notifications
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Data pipelines
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger from commits
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export to Sheets

***

### 🔗 Recommended Actors

- [**📚 Rate My Professors Scraper**](https://apify.com/parseforge/rate-my-professors-scraper) - Professor ratings
- [**🏥 ClinicalTrials.gov Scraper**](https://apify.com/parseforge/clinicaltrials-scraper) - Clinical trial data
- [**📰 PR Newswire Scraper**](https://apify.com/parseforge/pr-newswire-scraper) - Press releases
- [**📊 Indexmundi Scraper**](https://apify.com/parseforge/indexmundi-scraper) - Global indicators
- [**🔗 Broken Link Checker**](https://apify.com/parseforge/broken-link-checker) - URL validation

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more research and data scrapers.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new scraper, propose a custom data project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Semantic Scholar or the Allen Institute for AI. All trademarks mentioned are the property of their respective owners. Only publicly available academic metadata is collected.

# Actor input Schema

## `startUrl` (type: `string`):

Semantic Scholar search URL to start scraping from. Use this for custom searches with specific filters. Cannot be used together with searchQuery or other API filters. Example: `https://www.semanticscholar.org/search?q=machine+learning&sort=relevance`

## `searchQuery` (type: `string`):

Search query to find papers on Semantic Scholar. Examples: 'machine learning', 'neural networks', 'quantum computing'. Required if startUrl is not provided.

## `yearMin` (type: `integer`):

Filter papers published on or after this year. Example: 2020

## `yearMax` (type: `integer`):

Filter papers published on or before this year. Example: 2024

## `hasPdf` (type: `boolean`):

Only include papers that have an open access PDF available.

## `maxItems` (type: `integer`):

Maximum number of papers to collect (up to 1,000,000). Leave empty for unlimited.

## `author` (type: `string`):

Filter papers by author name. Note: This is treated as a suggestion by the Semantic Scholar API, not a strict filter. Results may include papers that match the search query but may not have the specified author. Example: 'John Smith'

## `venues` (type: `string`):

Filter papers by publication venue (journal or conference). Note: This is treated as a suggestion by the Semantic Scholar API, not a strict filter. Results may include papers that match the search query but may not match the specified venue. Example: 'Nature', 'IEEE'
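
Because `author` and `venues` are treated as suggestions, results that must match strictly can be post-filtered on the client side. The following is a minimal sketch using case-insensitive substring matching, which is an assumption about what "matching" should mean; `strict_filter` is a hypothetical helper.

```python
from typing import Any, Iterable, Optional


def _contains(haystack: Optional[str], needle: str) -> bool:
    """Case-insensitive substring test that tolerates missing values."""
    return needle.casefold() in (haystack or "").casefold()


def strict_filter(
    items: Iterable[dict[str, Any]],
    author: Optional[str] = None,
    venues: Optional[list[str]] = None,
) -> list[dict[str, Any]]:
    """Keep only items whose authors and venue really match the requested filters."""
    kept = []
    for item in items:
        names = [a.get("name") for a in item.get("authors") or []]
        if author and not any(_contains(name, author) for name in names):
            continue
        if venues and not any(_contains(item.get("venue"), v) for v in venues):
            continue
        kept.append(item)
    return kept
```

Substring matching keeps `"Bengio"` matching `"Yoshua Bengio"`, but it also lets `"IEEE"` match any venue containing that string; exact comparison against author IDs would be stricter where IDs are known.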

## Actor input object example

```json
{
  "searchQuery": "machine learning",
  "hasPdf": false,
  "maxItems": 10
}
```
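
The constraints stated in the input schema (either `searchQuery` or `startUrl` is required, `startUrl` cannot be combined with other filters, years lie between 1900 and 2100, `maxItems` lies between 1 and 1,000,000) can be checked before a run is started. The sketch below is one possible client-side check; `validate_input` is a hypothetical helper, and the `yearMin <= yearMax` rule is an added assumption rather than something the schema states.

```python
from typing import Any

# Filters that the input schema says cannot be combined with startUrl.
_API_FILTERS = ("searchQuery", "author", "venues", "yearMin", "yearMax", "hasPdf")


def validate_input(run_input: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an Actor input dict (empty if none)."""
    problems: list[str] = []
    has_url = bool(run_input.get("startUrl"))
    has_query = bool(run_input.get("searchQuery"))

    if not has_url and not has_query:
        problems.append("Provide either searchQuery or startUrl.")
    if has_url:
        clashing = [key for key in _API_FILTERS if run_input.get(key)]
        if clashing:
            problems.append(f"startUrl cannot be combined with: {', '.join(clashing)}")

    for key in ("yearMin", "yearMax"):
        year = run_input.get(key)
        if year is not None and not 1900 <= year <= 2100:
            problems.append(f"{key} must be between 1900 and 2100.")

    # Assumption: an inverted year range is treated as an error.
    year_min, year_max = run_input.get("yearMin"), run_input.get("yearMax")
    if year_min is not None and year_max is not None and year_min > year_max:
        problems.append("yearMin must not exceed yearMax.")

    max_items = run_input.get("maxItems")
    if max_items is not None and not 1 <= max_items <= 1_000_000:
        problems.append("maxItems must be between 1 and 1,000,000.")
    return problems
```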

# Actor output Schema

## `papers` (type: `string`):

Complete dataset with all scraped academic papers including full details

## `overview` (type: `string`):

Overview view of papers with key fields displayed in a table format

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQuery": "machine learning",
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/semantic-scholar-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQuery": "machine learning",
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/semantic-scholar-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
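
Items returned by `iterate_items()` are plain dictionaries, so they can be flattened into CSV locally when a custom column layout is needed. This is a sketch using only the standard library; `items_to_csv` and the chosen column list are assumptions for illustration, and nested authors are joined into one semicolon-separated cell.

```python
import csv
import io
from typing import Any, Iterable

COLUMNS = ["title", "year", "venue", "citationCount", "referenceCount", "doi", "pdfUrl", "url"]


def items_to_csv(items: Iterable[dict[str, Any]]) -> str:
    """Serialize paper records to CSV text, flattening the authors array."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS + ["authors"])
    writer.writeheader()
    for item in items:
        row = {column: item.get(column) for column in COLUMNS}
        row["authors"] = "; ".join(a.get("name", "") for a in item.get("authors") or [])
        writer.writerow(row)  # None values are written as empty cells
    return buffer.getvalue()
```

For standard layouts, downloading the dataset as CSV from the platform avoids this step entirely.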

## CLI example

```bash
echo '{
  "searchQuery": "machine learning",
  "maxItems": 10
}' |
apify call parseforge/semantic-scholar-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/semantic-scholar-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Semantic Scholar Scraper",
        "description": "Extract detailed academic paper data from Semantic Scholar, including abstracts, citations, authors, and publication details. Ideal for researchers, academics, and analysts who need structured scholarly data for literature reviews, research workflows, and large-scale academic analysis.",
        "version": "1.0",
        "x-build-id": "BvKU6LcuLALaDyYoE"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~semantic-scholar-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-semantic-scholar-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~semantic-scholar-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-semantic-scholar-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~semantic-scholar-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-semantic-scholar-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrl": {
                        "title": "Start URL",
                        "type": "string",
                        "description": "Semantic Scholar search URL to start scraping from. Use this for custom searches with specific filters. Cannot be used together with searchQuery or other API filters. Example: https://www.semanticscholar.org/search?q=machine+learning&sort=relevance"
                    },
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search query to find papers on Semantic Scholar. Examples: 'machine learning', 'neural networks', 'quantum computing'. Required if startUrl is not provided."
                    },
                    "yearMin": {
                        "title": "Minimum Year",
                        "minimum": 1900,
                        "maximum": 2100,
                        "type": "integer",
                        "description": "Filter papers published on or after this year. Example: 2020"
                    },
                    "yearMax": {
                        "title": "Maximum Year",
                        "minimum": 1900,
                        "maximum": 2100,
                        "type": "integer",
                        "description": "Filter papers published on or before this year. Example: 2024"
                    },
                    "hasPdf": {
                        "title": "Has PDF",
                        "type": "boolean",
                        "description": "Only include papers that have an open access PDF available.",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Maximum number of papers to collect (up to 1,000,000). Leave empty for unlimited."
                    },
                    "author": {
                        "title": "Author",
                        "type": "string",
                        "description": "Filter papers by author name. Note: This is treated as a suggestion by the Semantic Scholar API, not a strict filter. Results may include papers that match the search query but may not have the specified author. Example: 'John Smith'"
                    },
                    "venues": {
                        "title": "Venues",
                        "type": "string",
                        "description": "Filter papers by publication venue (journal or conference). Note: This is treated as a suggestion by the Semantic Scholar API, not a strict filter. Results may include papers that match the search query but may not match the specified venue. Example: 'Nature', 'IEEE'"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
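
The `runsResponseSchema` in the definition above describes the run object returned under `data`. As a minimal sketch, assuming a response body shaped like that schema, the fields a caller typically needs can be read as follows. The sample payload, its placeholder IDs, and the `summarize_run` helper are hypothetical illustrations and are not part of any Apify client.

```python
import json
from typing import Any

# Hypothetical sample payload shaped like runsResponseSchema.
# The ID values are placeholders, not real Apify identifiers.
SAMPLE_RESPONSE = """
{
  "data": {
    "id": "RUN_ID_PLACEHOLDER",
    "status": "READY",
    "startedAt": "2025-01-08T00:00:00.000Z",
    "defaultDatasetId": "DATASET_ID_PLACEHOLDER",
    "options": {"build": "latest", "timeoutSecs": 300, "memoryMbytes": 1024},
    "usageTotalUsd": 0.00005
  }
}
"""


def summarize_run(body: dict[str, Any]) -> dict[str, Any]:
    """Extract a few fields from a run response, tolerating missing keys."""
    run = body.get("data", {})
    options = run.get("options", {})
    return {
        "run_id": run.get("id"),
        "status": run.get("status"),
        "dataset_id": run.get("defaultDatasetId"),
        "memory_mbytes": options.get("memoryMbytes"),
        "cost_usd": run.get("usageTotalUsd", 0.0),
    }


summary = summarize_run(json.loads(SAMPLE_RESPONSE))
print(summary["status"], summary["dataset_id"])
```

The helper reads every key with `.get` because this part of the schema does not mark any property as required, so a partially populated run object should not raise a `KeyError`.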
