Substack Leaderboard Scraper

📊 Scrape public Substack leaderboards for ranked newsletters, author details, subscriber labels, and publication URLs.

Pricing

Pay per event

Developer

Stas Persiianenko

What does Substack Leaderboard Scraper do?

Substack Leaderboard Scraper collects public rows from Substack category leaderboards such as Technology, Business, Culture, Finance, Food & Drink, News, and more.

It uses public Substack leaderboard data and saves one dataset row per ranked publication.

Typical results include:

🏆 leaderboard rank
🗂️ category name and slug
📈 ranking tab: Top Bestsellers or Rising
📰 publication name and URL
👤 author name and profile URL
👥 subscriber labels such as "Thousands of paid subscribers"
🔗 Substack hostname and subdomain
🧭 source leaderboard URL

Who is it for?

Sponsorship and growth teams

Use the dataset to discover newsletters that already have audience traction in a niche.

Creator partnership teams

Find creators by category and collect publication metadata before outreach.

Newsletter operators

Monitor adjacent categories to understand who is rising and how top publications position themselves.

Market researchers

Build a structured view of the Substack creator market by category.

Agencies and media buyers

Export publication URLs, authors, subscriber labels, and descriptions for campaign planning.

Why use this actor?

Substack leaderboards are useful, but they are built for browsing, not analysis. This actor turns those public pages into structured rows that can be filtered, joined, deduplicated, and exported.

Benefits:

⚡ HTTP-only scraping for fast, low-cost runs
🎯 category slug input instead of internal category IDs
📊 bestseller and rising ranking tabs
🧾 dataset rows ready for CSV, JSON, Excel, Airtable, or CRM imports
🔁 repeatable monitoring of the same categories over time
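
As an illustration of the deduplication mentioned above, here is a minimal sketch that keeps one row per publication when the same newsletter appears in several categories or ranking tabs. The helper name `dedupe_by_publication` is hypothetical and not part of the actor. The sketch assumes each row carries the `publicationUrl` and `rank` fields shown in the example output.

```python
from typing import Any, Dict, List


def dedupe_by_publication(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the best-ranked (lowest rank number) row for each publicationUrl."""
    best: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        key = row.get("publicationUrl")
        # Replace the stored row only when this one ranks higher.
        if key not in best or row["rank"] < best[key]["rank"]:
            best[key] = row
    return list(best.values())


rows = [
    {"publicationUrl": "https://a.example", "rank": 5, "rankingType": "paid"},
    {"publicationUrl": "https://a.example", "rank": 2, "rankingType": "rising"},
    {"publicationUrl": "https://b.example", "rank": 1, "rankingType": "paid"},
]
unique = dedupe_by_publication(rows)
print(len(unique))
```

Whether to keep the best-ranked row or merge the rows is a design choice. Merging would preserve both ranking tabs but needs a wider schema.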

What data can you extract?

| Field | Description |
| --- | --- |
| `categoryName` | Human-readable leaderboard category |
| `rankingLabel` | Top Bestsellers or Rising |
| `rank` | Rank within that category/ranking page |
| `publicationName` | Substack publication name |
| `publicationUrl` | Public publication URL |
| `description` | Public publication description or hero text |
| `authorName` | Public author name when available |
| `authorUrl` | Public Substack profile URL |
| `paidSubscriberLabel` | Paid subscriber range label from Substack |
| `subscriberLabel` | Broader subscriber label when available |
| `freeSubscriberCount` | Free subscriber count text when exposed |
| `hasPodcast` | Whether the publication has podcast support |
| `twitterScreenName` | Twitter/X screen name when exposed |
| `sourceUrl` | Leaderboard URL that produced the row |

How much does it cost to scrape Substack leaderboard rows?

Pricing is pay per event:

Start event: $0.005 per run
Leaderboard row event: starts at about $0.00018 per saved row on the BRONZE tier, with lower per-row prices on higher Apify tiers

That means 1,000 saved leaderboard rows cost about $0.18 on the BRONZE tier, plus the $0.005 run start fee, for roughly $0.185 in total before any Apify platform charges or plan-specific adjustments.
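
The arithmetic above can be sketched as a small estimator. The function name `estimate_cost_usd` is hypothetical. The default prices are the approximate BRONZE-tier figures quoted in this section, and real charges may differ by tier or plan.

```python
def estimate_cost_usd(
    rows: int,
    per_row_usd: float = 0.00018,  # approximate BRONZE-tier price per saved row
    start_usd: float = 0.005,      # start event, charged once per run
) -> float:
    """Estimate the pay-per-event cost of one run, excluding platform charges."""
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return round(start_usd + rows * per_row_usd, 6)


print(estimate_cost_usd(1000))  # 0.185
```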

Quick start

1. Open the actor on Apify.
2. Enter one or more category slugs, for example `technology` and `business`.
3. Choose `paid`, `rising`, or both ranking tabs.
4. Set a small `maxItems` for your first run.
5. Start the actor.
6. Export the dataset as CSV, JSON, or Excel.

Input options

`categorySlugs`

List of Substack leaderboard category slugs.

Examples:

technology
business
culture
finance
news
food

`startUrls`

Optional direct leaderboard URLs.

Examples:

https://substack.com/leaderboard/technology
https://substack.com/leaderboard/technology/rising
https://substack.com/leaderboard/business/paid
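
If you prefer to generate `startUrls` from slugs, the sketch below builds them programmatically. It assumes the `https://substack.com/leaderboard/<slug>/<ranking>` pattern inferred from the examples above. The helper name `leaderboard_url` is hypothetical and not part of the actor.

```python
VALID_RANKINGS = ("paid", "rising")


def leaderboard_url(slug: str, ranking: str) -> str:
    """Build a leaderboard URL from a category slug and a ranking tab."""
    slug = slug.strip().lower()
    if not slug:
        raise ValueError("slug must not be empty")
    if ranking not in VALID_RANKINGS:
        raise ValueError(f"ranking must be one of {VALID_RANKINGS}")
    return f"https://substack.com/leaderboard/{slug}/{ranking}"


start_urls = [
    leaderboard_url(slug, ranking)
    for slug in ("technology", "business")
    for ranking in VALID_RANKINGS
]
print(start_urls)
```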

`rankings`

Choose one or both:

`paid` for Top Bestsellers
`rising` for Rising publications

`maxItems`

Maximum rows saved across all selected categories and ranking tabs.

`includeAllCategories`

Set this to true to scrape every public category returned by Substack's leaderboard category API. Keep maxItems modest for the first run.

Example input

{
  "categorySlugs": ["technology", "business"],
  "rankings": ["paid", "rising"],
  "maxItems": 100,
  "includeAllCategories": false
}
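
When the input is generated from code, a small builder can catch typos before a run is started. This is a minimal sketch. The function `build_input` is hypothetical, and its checks only reflect what this page documents (the two ranking values and a positive `maxItems`), not the actor's actual validation.

```python
import json
from typing import Any, Dict, Iterable

ALLOWED_RANKINGS = {"paid", "rising"}


def build_input(
    category_slugs: Iterable[str],
    rankings: Iterable[str] = ("paid",),
    max_items: int = 100,
    include_all_categories: bool = False,
) -> Dict[str, Any]:
    """Assemble an actor input dict using the field names documented above."""
    rankings = list(rankings)
    unknown = set(rankings) - ALLOWED_RANKINGS
    if unknown:
        raise ValueError(f"unknown rankings: {sorted(unknown)}")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return {
        "categorySlugs": list(category_slugs),
        "rankings": rankings,
        "maxItems": max_items,
        "includeAllCategories": include_all_categories,
    }


actor_input = build_input(["technology", "business"], rankings=["paid", "rising"])
print(json.dumps(actor_input, indent=2))
```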

Example output

{
  "category": "technology",
  "categoryName": "Technology",
  "categoryId": 4,
  "rankingType": "paid",
  "rankingLabel": "Top Bestsellers",
  "rank": 1,
  "publicationId": 6349492,
  "publicationName": "SemiAnalysis",
  "publicationUrl": "https://newsletter.semianalysis.com",
  "description": "Bridging the gap between the world's most important industry, semiconductors, and business.",
  "authorName": "Dylan Patel",
  "authorHandle": "semianalysis",
  "authorUrl": "https://substack.com/@semianalysis",
  "paidSubscriberLabel": "Thousands of paid subscribers",
  "subscriberLabel": "Hundreds of thousands of subscribers",
  "freeSubscriberCount": "287,000",
  "hasPodcast": false,
  "sourceUrl": "https://substack.com/leaderboard/technology/paid"
}
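
Because `freeSubscriberCount` arrives as text such as "287,000", numeric sorting or filtering needs a conversion step. The sketch below shows one way to do that. The helper `parse_count` is hypothetical, and it returns `None` for missing values or for label-style text that has no exact number.

```python
from typing import Optional


def parse_count(text: Optional[str]) -> Optional[int]:
    """Convert a count string like '287,000' to an int, or None if not numeric."""
    if text is None:
        return None
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


row = {"publicationName": "SemiAnalysis", "freeSubscriberCount": "287,000"}
print(parse_count(row["freeSubscriberCount"]))  # 287000
```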

Tips for better results

Start with one or two categories.
Use both paid and rising when you want mature and emerging publications.
Use maxItems to control cost and dataset size.
Run the same input weekly to monitor ranking changes.
Combine with your CRM or spreadsheet to track outreach status.
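
For the weekly monitoring tip, two dataset snapshots can be compared to see which publications moved. This is a minimal sketch under the assumption that rows carry the `category`, `rankingType`, `publicationId`, `publicationName`, and `rank` fields from the example output. The function name `rank_changes` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def _key(row: Dict[str, Any]) -> Tuple[Any, Any, Any]:
    # A publication is tracked per category and ranking tab.
    return (row["category"], row["rankingType"], row["publicationId"])


def rank_changes(
    previous: List[Dict[str, Any]], current: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Compare two snapshots; a positive delta means the publication moved up."""
    old_ranks = {_key(row): row["rank"] for row in previous}
    changes = []
    for row in current:
        before = old_ranks.get(_key(row))
        changes.append({
            "publicationName": row["publicationName"],
            "rank": row["rank"],
            "previousRank": before,  # None means new to this leaderboard
            "delta": None if before is None else before - row["rank"],
        })
    return changes


last_week = [
    {"category": "technology", "rankingType": "paid", "publicationId": 1,
     "publicationName": "A", "rank": 3},
]
this_week = [
    {"category": "technology", "rankingType": "paid", "publicationId": 1,
     "publicationName": "A", "rank": 1},
    {"category": "technology", "rankingType": "paid", "publicationId": 2,
     "publicationName": "B", "rank": 2},
]
for change in rank_changes(last_week, this_week):
    print(change)
```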

Integrations

Google Sheets

Export the dataset as CSV and import it into Google Sheets for review and tagging.
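
Apify can export the full dataset as CSV directly. If you only want a few columns for a review sheet, a sketch like the following writes a trimmed CSV from rows you already have in memory. The column selection and the helper `rows_to_csv` are illustrative assumptions.

```python
import csv
import io
from typing import Any, Dict, List

COLUMNS = [
    "rank", "categoryName", "rankingLabel", "publicationName",
    "publicationUrl", "authorName", "paidSubscriberLabel",
]


def rows_to_csv(rows: List[Dict[str, Any]]) -> str:
    """Serialize selected fields to CSV text; missing or null values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({column: row.get(column) for column in COLUMNS})
    return buffer.getvalue()


sample = [{
    "rank": 1,
    "categoryName": "Technology",
    "rankingLabel": "Top Bestsellers",
    "publicationName": "SemiAnalysis",
    "publicationUrl": "https://newsletter.semianalysis.com",
    "authorName": "Dylan Patel",
    "paidSubscriberLabel": None,
    "hasPodcast": False,  # not in COLUMNS, so it is left out of the CSV
}]
csv_text = rows_to_csv(sample)
print(csv_text)
```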

Airtable

Use the Apify integration to sync publication rows into an Airtable base.

CRM systems

Use publication URLs, author names, and profile URLs as enrichment inputs for sponsorship outreach.

BI dashboards

Track category rank, subscriber labels, and rising publications over time.

API usage

Node.js

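import { ApifyClient } from 'apify-client';

// Read the API token from the environment rather than hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/substack-leaderboard-scraper').call({
  categorySlugs: ['technology'],
  rankings: ['paid', 'rising'],
  maxItems: 50,
});

// The run's default dataset holds the saved leaderboard rows.
console.log(run.defaultDatasetId);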

Python

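import os

from apify_client import ApifyClient

# Read the API token from the environment rather than hard-coding it.
client = ApifyClient(os.environ['APIFY_TOKEN'])

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/substack-leaderboard-scraper').call(run_input={
    'categorySlugs': ['technology'],
    'rankings': ['paid', 'rising'],
    'maxItems': 50,
})

# The run's default dataset holds the saved leaderboard rows.
print(run['defaultDatasetId'])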
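
The snippet above only prints the dataset ID. To read the rows themselves, a sketch like the following could be used. It relies on the `dataset(...).iterate_items()` call from the `apify-client` Python package. The helper `fetch_rows` and its `limit` parameter are hypothetical. The client is passed in as an argument so the function can be exercised without network access.

```python
from typing import Any, Dict, List


def fetch_rows(client: Any, dataset_id: str, limit: int = 1000) -> List[Dict[str, Any]]:
    """Collect up to `limit` items from a dataset via the given Apify client."""
    rows: List[Dict[str, Any]] = []
    for item in client.dataset(dataset_id).iterate_items():
        rows.append(item)
        if len(rows) >= limit:
            break
    return rows


# Usage with the real client, assuming `client` and `run` from the snippet above:
# rows = fetch_rows(client, run['defaultDatasetId'])
# for row in rows:
#     print(row['rank'], row['publicationName'])
```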

cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~substack-leaderboard-scraper/runs?token=$APIFY_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"categorySlugs":["technology"],"rankings":["paid"],"maxItems":25}'

MCP usage

You can use this actor through Apify MCP tools in Claude Desktop or Claude Code.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/substack-leaderboard-scraper

Claude Code quick add:

claude mcp add --transport http apify-substack-leaderboard "https://mcp.apify.com/?tools=automation-lab/substack-leaderboard-scraper"

Claude Desktop / JSON MCP config:

{
  "mcpServers": {
    "apify-substack-leaderboard": {
      "url": "https://mcp.apify.com/?tools=automation-lab/substack-leaderboard-scraper"
    }
  }
}

Example prompts:

"Scrape the Technology and Business Substack leaderboards and summarize top sponsorship targets."
"Find rising Substack newsletters in Finance and return publication URLs with subscriber labels."
"Export top Culture newsletters and group them by author details."

Data quality notes

Substack exposes subscriber counts as labels and rounded text, not always exact numbers. The actor preserves those public labels and adds magnitude fields when Substack provides them.

Some publications may not expose a Twitter/X handle, author bio, or podcast flag. Those fields are returned as null when unavailable.

FAQ

Troubleshooting

Why did I get fewer rows than `maxItems`?

The selected category/ranking combination may have fewer public rows than your limit, or the actor reached the end of available leaderboard pages.

Why are subscriber counts rounded?

Substack leaderboards typically show public ranges or rounded counts. The actor does not infer private exact subscriber totals.

Why was a category skipped?

Use the category slug from the public leaderboard URL. If Substack does not return that slug in its leaderboard category API, the actor skips it and logs a warning.

Legality

This actor collects publicly available information from public Substack leaderboard endpoints. You are responsible for using the data lawfully, respecting applicable terms, privacy rules, and outreach regulations.

Is scraping Substack leaderboards legal?

Yes, the actor is designed for public leaderboard data only. It does not access private dashboards, subscriber lists, paid posts, or account-only content.

Changelog

0.1

Initial version with public Substack category leaderboards, bestseller and rising ranking tabs, subscriber labels, author details, and publication URLs.

Limitations

The actor focuses on leaderboard rows. It does not scrape individual posts, paid content, private subscriber lists, or account-only dashboards.

Support

If a public Substack leaderboard category stops working, include the category slug, input JSON, run ID, and expected output when reporting the issue.
