# 🐘 Mastodon Scraper - Hashtags, Posts & Trends (`benthepythondev/mastodon-scraper`) Actor

Scrape Mastodon (any instance) via the public REST API — no login needed. Get hashtag posts, a user's posts, the public/federated timeline, trending posts, or profile data. Clean JSON with engagement counts, media & hashtags.

- **URL**: https://apify.com/benthepythondev/mastodon-scraper.md
- **Developed by:** [ben](https://apify.com/benthepythondev) (community)
- **Categories:** Social media
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
"Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🐘 Mastodon Scraper — Posts, Hashtags, Accounts & Trends

Extract **Mastodon** data from **any instance** (mastodon.social, mas.to,
fosstodon.org and thousands more) through the public REST API — hashtag feeds, a
user's posts, the public/federated timeline, trending posts, or profile data.
Mastodon is the largest open, federated social network, and its data is wide open:
pull clean, structured JSON with engagement counts, media, hashtags and author info,
**no login required** for public content. Export to JSON/CSV/Excel, run on a schedule,
call via API, or connect to Make, Zapier or n8n.

### 🐘 What is the Mastodon Scraper?

It turns any Mastodon instance into a structured dataset. Point it at a server, pick a
mode — hashtag posts, an account's posts, the public/federated timeline, trends, or a
profile — and it returns every matching record with full engagement metrics straight
from Mastodon's own REST API. HTML is stripped from post bodies for clean text, and
because it reads a JSON API instead of driving a headless browser, it's fast and cheap,
with no residential proxy required.
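
To illustrate what "HTML stripped" means for a post body, here is a minimal sketch using only Python's standard library. It is not the Actor's actual implementation; the `strip_html` helper and its line-break handling are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and turns <br> and </p> into line breaks."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(content: str) -> str:
    """Return the plain text of a Mastodon status `content` HTML fragment."""
    parser = _TextExtractor()
    parser.feed(content)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()
```

Mastodon returns post bodies as HTML (paragraphs, links, hashtag anchors), so a step like this is what produces a plain `text` value.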

#### What data does it extract?

- **Post text (HTML stripped), created date and language**
- **Engagement counts** — favourites, boosts (reblogs) and replies
- **Boost detection** (`is_reblog`) and the original author of a boosted post
- **Author info** — username, full `acct` handle, display name, follower count and URL
- **Hashtags** on each post and the matched search tag
- **Media URLs and media types** (images, video, gifv) attached to each post
- **Profile data** — bio, followers, following, status count, avatar, join date, bot flag
- **Post and account URLs**, plus a `scraped_at` timestamp

### ⬇️ Input

Choose an `instance` and a `mode`, then supply `hashtags` or `accounts` as needed:

| Field | Description |
|-------|-------------|
| `mode` | `hashtag`, `account`, `public`, `trends`, `profile` or `search` |
| `instance` | Mastodon server to query, e.g. `mastodon.social`, `mas.to`, `fosstodon.org` |
| `hashtags` | Tags to scrape without the # (hashtag mode), e.g. `news`, `art`, `bitcoin` |
| `accounts` | Handles to scrape, e.g. `Gargron` or `Gargron@mastodon.social` |
| `query` | Keyword to search statuses for (search mode — needs a token) |
| `localOnly` | In public mode, only this instance's posts vs the full federated timeline |
| `maxItems` | Max records to return, split across hashtags/accounts (1–50000) |
| `accessToken` | Optional Mastodon API token, only needed for keyword search or private data |
| `proxyConfiguration` | Optional Apify Proxy for IP rotation on large runs |

#### Example input

```json
{
  "mode": "hashtag",
  "instance": "mastodon.social",
  "hashtags": ["bitcoin", "ai"],
  "maxItems": 500
}
````
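
The field rules in the table above can be checked before a run is started. The sketch below is one hypothetical way to do that in Python: `build_input` is not part of the Actor or the Apify client, and the "required per mode" rules are inferred from the table rather than confirmed by the Actor's source. In particular, it insists on a token for `search`, although the schema says only most instances need one.

```python
VALID_MODES = {"hashtag", "account", "public", "trends", "profile", "search"}


def build_input(mode, instance="mastodon.social", *, hashtags=None, accounts=None,
                query=None, access_token=None, local_only=False, max_items=100):
    """Build an Actor input dict, rejecting combinations the table rules out."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 50000:
        raise ValueError("maxItems must be between 1 and 50000")

    run_input = {"mode": mode, "instance": instance, "maxItems": max_items}
    if mode == "hashtag":
        if not hashtags:
            raise ValueError("hashtag mode needs at least one hashtag")
        # Tags are given without the leading '#'.
        run_input["hashtags"] = [tag.lstrip("#") for tag in hashtags]
    elif mode in ("account", "profile"):
        if not accounts:
            raise ValueError(f"{mode} mode needs at least one account")
        run_input["accounts"] = list(accounts)
    elif mode == "search":
        # Assumption: treat the token as mandatory for keyword search.
        if not query or not access_token:
            raise ValueError("search mode needs both query and accessToken")
        run_input["query"] = query
        run_input["accessToken"] = access_token
    elif mode == "public":
        run_input["localOnly"] = bool(local_only)
    return run_input
```

The returned dict can be passed as `run_input` to the Python client call shown in the API section.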

### ⬆️ Output

Every post (or profile) is one clean row — view it as a **table**, or export
**JSON / CSV / Excel**:

```json
{
  "id": "123456789",
  "url": "https://mastodon.social/@Gargron/123456789",
  "created_at": "2026-06-26T09:00:00.000Z",
  "text": "Mastodon keeps growing!",
  "language": "en",
  "replies_count": 33,
  "reblogs_count": 210,
  "favourites_count": 540,
  "is_reblog": false,
  "in_reply_to_id": null,
  "author_username": "Gargron",
  "author_acct": "Gargron",
  "author_display_name": "Eugen",
  "author_followers": 280000,
  "author_url": "https://mastodon.social/@Gargron",
  "original_author_acct": null,
  "hashtags": ["mastodon"],
  "media_urls": [],
  "media_types": [],
  "scraped_at": "2026-06-26T15:30:00.000Z"
}
```
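
As a small example of working with rows shaped like the one above, this sketch ranks original posts (boosts excluded) by total engagement. The helper names are hypothetical, and the field names are taken from the sample record.

```python
def engagement(post: dict) -> int:
    """Sum favourites, boosts and replies for one post row."""
    return (
        post.get("favourites_count", 0)
        + post.get("reblogs_count", 0)
        + post.get("replies_count", 0)
    )


def top_original_posts(items, limit=10):
    """Return the most engaged non-boost posts, highest first."""
    originals = [post for post in items if not post.get("is_reblog")]
    return sorted(originals, key=engagement, reverse=True)[:limit]
```

Here `items` would be the list of dataset rows fetched from a run.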

### 💡 Use cases

- **👂 Social listening & brand monitoring:** track a topic, product or brand across the whole fediverse.
- **📈 Trend & sentiment analysis:** feed hashtag and trending streams straight into an LLM.
- **🔬 Open-social research:** study communities, federation and how content spreads between servers.
- **🔍 Influencer & audience research:** profile stats and posting activity to find who leads a niche.

### ❓ FAQ

**How do I scrape Mastodon hashtags?** Set `mode: hashtag`, choose an `instance`, add
one or more `hashtags` (without the #), and run the Actor. You get recent posts for
those tags, up to `maxItems`, with text, engagement counts, media and author info.

**Do I need an API key or login?** No — hashtags, accounts, public timelines, trends
and profiles all work with no login, straight from each instance's public REST API.
Only keyword `search` mode needs an access token.

**Does it work on any instance?** Yes. Point `instance` at any Mastodon server. It
defaults to `mastodon.social` (the largest, which sees most of the federated
timeline), but smaller servers such as `fosstodon.org` work well for niche communities.

**Can I scrape remote accounts on another server?** Yes — use the full
`user@instance` handle (e.g. `Gargron@mastodon.social`) and the queried instance
resolves it for you.
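
The accepted handle forms (`Gargron`, `Gargron@mastodon.social`, with an optional leading `@`) can be normalized with a few lines of Python. This is only a sketch of the idea: `parse_handle` is a hypothetical helper, and how the Actor parses handles internally is not documented here.

```python
def parse_handle(handle: str, default_instance: str = "mastodon.social"):
    """Split a Mastodon handle into (username, instance)."""
    cleaned = handle.strip().lstrip("@")  # the leading @ is optional
    username, sep, domain = cleaned.partition("@")
    if not username:
        raise ValueError(f"not a valid handle: {handle!r}")
    # A bare username is assumed to live on the queried instance.
    return username, (domain if sep and domain else default_instance)
```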

**How do I get an access token for search?** In your Mastodon account go to
Preferences → Development → New application, create one, and paste its access token
into `accessToken`. It's only required for keyword `search`; every other mode is
login-free.

**What's the difference between local and federated in public mode?** With
`localOnly: true` you get only posts from the chosen instance; with it off you get the
whole federated timeline that instance can see.
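
In Mastodon's REST API, this distinction corresponds to the `local` query parameter of the public timeline endpoint. The sketch below builds such a URL. The endpoint and parameters come from Mastodon's public API documentation, but whether the Actor issues exactly this request is an assumption, and `public_timeline_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode


def public_timeline_url(instance: str, local_only: bool = False,
                        limit: int = 40, max_id: Optional[str] = None) -> str:
    """Build a GET URL for an instance's public timeline."""
    # Mastodon caps `limit` at 40 statuses per page for this endpoint.
    params = {"limit": min(max(limit, 1), 40)}
    if local_only:
        params["local"] = "true"  # only statuses that originate on this instance
    if max_id:
        params["max_id"] = max_id  # page backwards from this status ID
    return f"https://{instance}/api/v1/timelines/public?{urlencode(params)}"
```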

**How many records can it return?** Up to your `maxItems` cap, which can be set from 1
to 50,000. It paginates automatically and splits the cap across the hashtags or
accounts you give it.
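
The README does not say how the cap is divided, so the following is only one plausible reading: an even split, with any remainder going to the first targets. `split_cap` is a hypothetical helper, not the Actor's code.

```python
def split_cap(max_items: int, targets: list) -> dict:
    """Divide a total item cap as evenly as possible across targets."""
    unique = list(dict.fromkeys(targets))  # drop duplicates, keep order
    if not unique:
        return {}
    base, extra = divmod(max_items, len(unique))
    # The first `extra` targets each get one additional item.
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(unique)}
```

Under this assumption, `maxItems: 500` with two hashtags would allow up to 250 posts per tag.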

**Can I run it on a schedule or via API?** Yes — schedule recurring runs in Apify,
call it via the API/SDK, or connect it to Make, Zapier or n8n.

**Is scraping Mastodon legal?** It reads publicly available data via Mastodon's own
public API. Use it responsibly for research and monitoring, and follow applicable
laws and each instance's terms.

### 🔗 You might also like

- **[Bluesky Scraper](https://apify.com/benthepythondev/bluesky-scraper)** — posts, profiles, followers & search
- **[Lemmy Scraper](https://apify.com/benthepythondev/lemmy-scraper)** — the federated Reddit alternative
- **[Reddit Scraper](https://apify.com/benthepythondev/reddit-scraper)** — posts, comments & communities
- **[Instagram Scraper](https://apify.com/benthepythondev/instagram-scraper)** — posts, profiles & hashtags

***

**Keywords:** Mastodon scraper, Mastodon API, fediverse scraper, ActivityPub, Mastodon posts, Mastodon hashtag scraper, Mastodon trends, Mastodon profile scraper, mastodon.social scraper, social media scraper, social listening, sentiment analysis, open social data, fediverse data export, Twitter alternative data.

# Actor input Schema

## `mode` (type: `string`):

hashtag = posts for a #tag; account = a user's posts; public = the public/federated timeline; trends = trending posts; profile = account data; search = keyword search of statuses (needs an access token).

## `instance` (type: `string`):

The Mastodon server to query, e.g. 'mastodon.social', 'mas.to', 'fosstodon.org'. Public data needs no login.

## `hashtags` (type: `array`):

Tags to scrape (without the #), e.g. 'news', 'art', 'bitcoin'.

## `accounts` (type: `array`):

Handles to scrape, e.g. 'Gargron' or 'Gargron@mastodon.social'. The leading @ is optional.

## `query` (type: `string`):

Keyword to search statuses for. Requires an access token (most instances).

## `localOnly` (type: `boolean`):

In public mode, only posts from the chosen instance (otherwise the whole federated timeline).

## `maxItems` (type: `integer`):

Maximum number of records to return (split across hashtags/accounts).

## `accessToken` (type: `string`):

Optional Mastodon API access token (Preferences > Development > New application). Only needed for keyword search or private data.

## `proxyConfiguration` (type: `object`):

Optional. Mastodon's API is public, so a proxy is not required; Apify Proxy (auto) is fine for IP rotation on large runs.

## Actor input object example

```json
{
  "mode": "hashtag",
  "instance": "mastodon.social",
  "hashtags": [
    "news"
  ],
  "localOnly": false,
  "maxItems": 100,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "hashtag",
    "instance": "mastodon.social",
    "hashtags": [
        "news"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("benthepythondev/mastodon-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "hashtag",
    "instance": "mastodon.social",
    "hashtags": ["news"],
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("benthepythondev/mastodon-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "hashtag",
  "instance": "mastodon.social",
  "hashtags": [
    "news"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call benthepythondev/mastodon-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=benthepythondev/mastodon-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "🐘 Mastodon Scraper - Hashtags, Posts & Trends",
        "description": "Scrape Mastodon (any instance) via the public REST API — no login needed. Get hashtag posts, a user's posts, the public/federated timeline, trending posts, or profile data. Clean JSON with engagement counts, media & hashtags.",
        "version": "0.1",
        "x-build-id": "V6b0S4cJdEWTQfBqT"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/benthepythondev~mastodon-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-benthepythondev-mastodon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/benthepythondev~mastodon-scraper/runs": {
            "post": {
                "operationId": "runs-sync-benthepythondev-mastodon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/benthepythondev~mastodon-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-benthepythondev-mastodon-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "mode": {
                        "title": "What to scrape",
                        "enum": [
                            "hashtag",
                            "account",
                            "public",
                            "trends",
                            "profile",
                            "search"
                        ],
                        "type": "string",
                        "description": "hashtag = posts for a #tag; account = a user's posts; public = the public/federated timeline; trends = trending posts; profile = account data.",
                        "default": "hashtag"
                    },
                    "instance": {
                        "title": "Mastodon instance",
                        "type": "string",
                        "description": "The Mastodon server to query, e.g. 'mastodon.social', 'mas.to', 'fosstodon.org'. Public data needs no login.",
                        "default": "mastodon.social"
                    },
                    "hashtags": {
                        "title": "Hashtags (for hashtag mode)",
                        "type": "array",
                        "description": "Tags to scrape (without the #), e.g. 'news', 'art', 'bitcoin'.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "accounts": {
                        "title": "Accounts (for account / profile mode)",
                        "type": "array",
                        "description": "Handles to scrape, e.g. 'Gargron' or 'Gargron@mastodon.social'. The leading @ is optional.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "query": {
                        "title": "Search query (search mode)",
                        "type": "string",
                        "description": "Keyword to search statuses for. Requires an access token (most instances)."
                    },
                    "localOnly": {
                        "title": "Local only (public timeline)",
                        "type": "boolean",
                        "description": "In public mode, only posts from the chosen instance (otherwise the whole federated timeline).",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 50000,
                        "type": "integer",
                        "description": "Maximum number of records to return (split across hashtags/accounts).",
                        "default": 100
                    },
                    "accessToken": {
                        "title": "Access token (optional)",
                        "type": "string",
                        "description": "Optional Mastodon API access token (Preferences > Development > New application). Only needed for keyword search or private data."
                    },
                    "proxyConfiguration": {
                        "title": "Proxy",
                        "type": "object",
                        "description": "Optional. Mastodon's API is public, so a proxy is not required; Apify Proxy (auto) is fine for IP rotation on large runs.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
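
For projects that do not use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library. The helper names are hypothetical, and it assumes the response body is JSON, which the specification above does not state.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "benthepythondev~mastodon-scraper"


def build_sync_request(token: str, run_input: dict) -> Request:
    """Prepare the POST request without sending it."""
    query = urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: dict, timeout: float = 300):
    """Run the Actor, wait for it to finish, and return the parsed response."""
    with urlopen(build_sync_request(token, run_input), timeout=timeout) as resp:
        return json.load(resp)  # assumption: the body is JSON
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"mode": "trends", "instance": "mastodon.social"})` blocks until the run completes, so long runs may be better served by the asynchronous `/runs` path.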
