# Discord Message Crawler (`lighthouse_keeper/discrawl`) Actor

Scrape entire Discord servers, specific channels, or DMs with full metadata: message content, embeds, timestamps, reactions, mentions, polls, and more. If it's there, you will have it.

- **URL**: https://apify.com/lighthouse_keeper/discrawl.md
- **Developed by:** [r. mann](https://apify.com/lighthouse_keeper) (community)
- **Categories:** Automation, Social media
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, 1 bookmarks
- **User rating**: No ratings yet

## Pricing

from $0.20 / 1,000 messages

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
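
As a rough illustration of the arithmetic, the sketch below estimates a lower bound on cost from the listed "from $0.20 / 1,000 messages" price. The function name is hypothetical, and it assumes messages are the only charged event and that the lowest listed rate applies, which may not hold for your plan.

```python
def estimate_min_cost_usd(message_count: int, usd_per_1000: float = 0.20) -> float:
    """Lower-bound cost at the listed 'from' price (assumption: only messages are billed)."""
    if message_count < 0:
        raise ValueError("message_count must be non-negative")
    return message_count / 1000 * usd_per_1000


if __name__ == "__main__":
    # Print a few sample sizes to get a feel for the scale.
    for count in (1_000, 250_000, 2_000_000):
        print(f"{count:>9,} messages -> at least ${estimate_min_cost_usd(count):,.2f}")
```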

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Discord Message Crawler

Back up and archive Discord servers, channels, and direct messages with full metadata.

### Features

- Crawl entire servers automatically, or target specific channels
- Incremental "catch-up" runs plus full historical backfill, with resumable per-channel state
- Optional DM and group-DM export
- Time-bounded exports via snowflake ID or ISO 8601 timestamps
- Configurable request cooldown to stay within Discord rate limits
- Output is plain JSON, one item per message, ready for analysis or downstream processing

### Setup

Provide a Discord bot or user token for an account that already has access to the target servers/channels.
Bot tokens cover server/channel backups; DM and "include all DMs" features require a user token.
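
A small pre-flight check can catch a mismatch between token type and requested features before a run is started. The sketch below is only an illustration: the function names and the `is_user_token` flag are hypothetical (the Actor input has no such field), so you would supply that knowledge yourself.

```python
from typing import Any, Dict


def uses_dm_features(run_input: Dict[str, Any]) -> bool:
    """True if the input asks for DM or group-DM export."""
    return bool(run_input.get("dmChannels")) or bool(run_input.get("includeAllDms"))


def check_token_scope(run_input: Dict[str, Any], is_user_token: bool) -> None:
    """Raise early if DM features are requested with a bot token.

    `is_user_token` is a hypothetical flag supplied by the caller; it is not
    part of the Actor input.
    """
    if uses_dm_features(run_input) and not is_user_token:
        raise ValueError("dmChannels / includeAllDms require a user token")
```

For example, `check_token_scope({"includeAllDms": True}, is_user_token=False)` raises, while a servers-only input passes with either token type.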

### Input schema

```jsonc
{
  "token": "",                  // Your Discord token. The only required field.

  "servers": [                  // Servers to back up. Each object represents one server.
                                // Duplicate the object as needed, or drop the array into your agent to populate.
    {
      "id": "",                 // Server (guild) ID
      "name": "",               // Optional cosmetic label, used only for logging/output. Has no effect on crawling.

      "catchup": true,          // Forward sync: fetch messages newer than the last message seen on a previous run.
                                 // Has no effect on the first run for a channel (there's nothing to catch up from yet).

      "backfill": true,          // Reverse sync: crawl historical messages backwards from the oldest seen message
                                  // towards the start of the channel, until the channel start is reached.

      "channels": [               // Optional. If omitted or empty, every accessible text channel in the
                                  // server is discovered and crawled automatically.
                                  // Each entry is either a channel ID, or { "id": "", "name"?: "" }.
        {
          "id": "",               // Channel ID
          "name": ""              // Optional cosmetic label, used only for logging/output
        }
      ]
    }
  ],
  "dmChannels": [                // Direct message / group-DM channels to back up (user tokens only).
    {
      "id": "",                  // Channel ID
      "name": ""                 // Optional cosmetic label, used only for logging/output
    }
  ],
  "includeAllDms": false,        // If true, discover and back up every open DM and group-DM on the account
                                 // (user tokens only).

  "backfillLimit": null,         // Optional cap on the number of messages backfilled per channel.
                                 // Leave null/empty for no limit (backfill runs to the channel start).

  "after": "",                   // Optional global lower bound. Only messages newer than this are crawled.
                                 // Accepts a Discord snowflake ID or an ISO 8601 timestamp.

  "before": "",                  // Optional global upper bound. Only messages older than this are crawled.
                                 // Accepts a Discord snowflake ID or an ISO 8601 timestamp.

  "requestCooldown": 3           // Minimum delay, in seconds, between Discord requests. Higher is gentler
                                 // on rate limits. The recommended value is 3, which balances efficiency
                                 // against rate-limit avoidance.
}
````
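
The `after` and `before` fields accept either form, so conversion is not required, but it can help to see which instant a snowflake corresponds to. The sketch below relies on Discord's snowflake layout (milliseconds since 2015-01-01T00:00:00Z stored above the low 22 bits); the function names are hypothetical helpers, not part of the Actor.

```python
from datetime import datetime, timedelta, timezone
from typing import Union

# Discord's snowflake epoch: the first millisecond of 2015 (UTC).
DISCORD_EPOCH = datetime(2015, 1, 1, tzinfo=timezone.utc)


def snowflake_to_datetime(snowflake: Union[str, int]) -> datetime:
    """Return the UTC creation time encoded in a snowflake ID."""
    ms_since_epoch = int(snowflake) >> 22  # drop worker, process and sequence bits
    return DISCORD_EPOCH + timedelta(milliseconds=ms_since_epoch)


def datetime_to_snowflake(moment: datetime) -> str:
    """Return the smallest snowflake (as a string) for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    ms_since_epoch = (moment - DISCORD_EPOCH) // timedelta(milliseconds=1)
    if ms_since_epoch < 0:
        raise ValueError("snowflakes cannot predate 2015-01-01T00:00:00Z")
    return str(ms_since_epoch << 22)


def iso_to_snowflake(text: str) -> str:
    """Convert an ISO 8601 timestamp such as 2026-06-11T18:45:00Z to a snowflake string."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime_to_snowflake(datetime.fromisoformat(text.replace("Z", "+00:00")))
```

The snowflake is returned as a string on purpose, so it can be placed into the input without losing precision.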

#### Resumable state

Catch-up and backfill progress is tracked per-channel in a named key-value store
(`discord-crawler-state`), which is saved to your Apify account automatically and persists across runs.
Scheduled/incremental runs automatically continue from where the previous run left off.
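
To make the idea of per-channel cursors concrete, here is a minimal sketch of how catch-up and backfill positions could be tracked. It is an illustration under assumptions, not the Actor's actual implementation: `ChannelState` and the helper functions are hypothetical, and the returned dictionaries only mirror the `after`, `before`, and `limit` query parameters of Discord's channel messages endpoint.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ChannelState:
    """Hypothetical per-channel cursor: newest and oldest message IDs seen so far."""
    newest_seen: Optional[str] = None
    oldest_seen: Optional[str] = None


def update_state(state: ChannelState, batch_ids: Iterable[str]) -> None:
    """Widen the seen range with the IDs from one fetched batch."""
    ids = [int(i) for i in batch_ids]
    if not ids:
        return
    if state.newest_seen is None or max(ids) > int(state.newest_seen):
        state.newest_seen = str(max(ids))
    if state.oldest_seen is None or min(ids) < int(state.oldest_seen):
        state.oldest_seen = str(min(ids))


def catchup_params(state: ChannelState, limit: int = 100) -> Optional[Dict[str, object]]:
    """Forward sync: nothing to catch up from until a previous run has seen a message."""
    if state.newest_seen is None:
        return None
    return {"after": state.newest_seen, "limit": limit}


def backfill_params(state: ChannelState, limit: int = 100) -> Dict[str, object]:
    """Reverse sync: continue backwards from the oldest message seen so far."""
    if state.oldest_seen is None:
        return {"limit": limit}
    return {"before": state.oldest_seen, "limit": limit}
```

IDs are compared as integers but stored as strings, which keeps the saved state safe to round-trip through JSON.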

### FAQ

**Where do I find my Discord token?**

For a user token, you can grab it straight from your browser:

1. Log into Discord in Chrome (or another Chromium-based browser).
2. Open Developer Tools (3-dot menu > More tools > Developer tools) or press CTRL + SHIFT + I.
3. Switch to the "Console" tab, paste the snippet below, and press Enter. Your token will be copied to the clipboard automatically, and you'll see a "Worked!" message confirming it grabbed it successfully.

```js
(()=>{const c=window.webpackChunkdiscord_app;c.push([[Symbol()],{},r=>{if(!r.c)return;for(const mod of Object.values(r.c)){try{if(!mod.exports||mod.exports===window)continue;if(mod.exports?.getToken)return copy(mod.exports.getToken());for(const k in mod.exports){const v=mod.exports[k];if(v?.getToken&&v[Symbol.toStringTag]!=='IntlMessagesProxy')return copy(v.getToken());}}catch(e){}}}]);c.pop();console.log('%cWorked!','font-size:64px');console.log('%cYour token is now on the clipboard!','font-size:24px');})();
```

**What is catchup? What is backfill?**

See the `catchup` and `backfill` comments in the input schema and the "Resumable state" section above.
Note that these values are mutually exclusive: from a single point in time,
you can either catch up or backfill, so only one can be true.

**What does this Actor actually crawl?**

Everything accessible to the token you provide.
Every message in every targeted channel, including attachment metadata,
embeds, reactions, and other fields Discord returns, is crawled and pushed to the dataset as-is.
If you only want specific channels or a specific time range,
use the `channels`, `after`, and `before` fields to scope the run accordingly.
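
As an illustration, a scoped input can be assembled like this. The helper name and its arguments are hypothetical; only the resulting field names (`token`, `servers`, `channels`, `after`, `before`) come from the input schema above.

```python
from typing import Any, Dict, Iterable, Optional, Union


def scoped_input(
    token: str,
    server_id: Union[str, int],
    channel_ids: Iterable[Union[str, int]],
    after: Optional[str] = None,
    before: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a run input limited to specific channels and an optional time window."""
    run_input: Dict[str, Any] = {
        "token": token,
        "servers": [
            {
                # IDs are always emitted as strings to avoid precision loss.
                "id": str(server_id),
                "channels": [{"id": str(c)} for c in channel_ids],
            }
        ],
    }
    if after:
        run_input["after"] = after    # snowflake ID or ISO 8601 timestamp
    if before:
        run_input["before"] = before  # snowflake ID or ISO 8601 timestamp
    return run_input
```

Calling `scoped_input("<YOUR_DISCORD_TOKEN>", "000000000000000000", ["111111111111111111"], after="2026-01-01T00:00:00Z")` yields an input that crawls one channel from the start of 2026 onward.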

**The "API" tab's example code doesn't include my Discord token - is something missing?**

No. Apify's auto-generated code samples build the `run_input` from each input field's
prefill value, but deliberately omit fields marked as secret (like `token`) so a
placeholder token isn't baked into example code. Add it manually:

```python
run_input = {
    "token": "<YOUR_DISCORD_TOKEN>",
    "servers": [...]
}
```

**I'm getting `TypeError: 'int' object is not a mapping`, what do I do?**

TL;DR - make sure your server/channel IDs are strings wrapped in double quotes.

This error means a server/guild ID or channel ID was entered as a bare number rather
than a string. Discord IDs are 64-bit integers, and most exceed 2^53, the point beyond which
JavaScript numbers can no longer represent every integer exactly. "Copy Server ID" in Discord copies a bare number,
and when you paste it into Apify unquoted, the JS-based parser rounds it to a nearby number.
Fix: wrap every server, channel, and DM ID in quotes, e.g. `"id": "877994882399080558"`
instead of `"id": 877994882399080558`.
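
The rounding can be reproduced outside the browser, because JavaScript numbers are 64-bit floats and Python's `float` uses the same representation. The sketch below shows the loss and adds a hypothetical validator (`find_non_string_ids` is not part of the Actor) that lists any IDs in a run input that are not strings.

```python
from typing import Any, Dict, List


def survives_double_roundtrip(discord_id: int) -> bool:
    """True if a 64-bit float can hold this integer exactly."""
    return int(float(discord_id)) == discord_id


def _check_entry(entry: Any, path: str, problems: List[str]) -> None:
    # An entry is either a bare ID or an object with an "id" key.
    if isinstance(entry, dict):
        if not isinstance(entry.get("id"), str):
            problems.append(path + ".id")
    elif not isinstance(entry, str):
        problems.append(path)


def find_non_string_ids(run_input: Dict[str, Any]) -> List[str]:
    """Return the paths of server, channel and DM IDs that are not strings."""
    problems: List[str] = []
    for i, server in enumerate(run_input.get("servers") or []):
        _check_entry(server, f"servers[{i}]", problems)
        if isinstance(server, dict):
            for j, channel in enumerate(server.get("channels") or []):
                _check_entry(channel, f"servers[{i}].channels[{j}]", problems)
    for i, dm in enumerate(run_input.get("dmChannels") or []):
        _check_entry(dm, f"dmChannels[{i}]", problems)
    return problems


if __name__ == "__main__":
    # The example ID from this FAQ does not survive conversion to a float.
    print(survives_double_roundtrip(877994882399080558))
```

Running the validator before calling the Actor gives a clearer message than the `TypeError` above, since it names the exact field that needs quoting.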

# Actor input Schema

## `token` (type: `string`):

Discord user/bot token. The account must already be a member of the servers you list below.

## `servers` (type: `array`):

List of servers to back up. Each entry: { "id": <guild id>, "name"?: <label>, "catchup"?: true, "backfill"?: true, "channels"?: \[...] }. Each channels entry is a channel id, or { "id": <channel id>, "name"?: <label> }. If "channels" is omitted or empty, every accessible text channel in the guild is crawled automatically.

## `dmChannels` (type: `array`):

Direct-message or group channels to back up, by channel id. Each item is a channel id string, or { "id": <channel id>, "name"?: <label> }. Only available for user tokens.

## `includeAllDms` (type: `boolean`):

Enumerate and back up every open DM and group-DM on the account. Only available for user tokens.

## `backfillLimit` (type: `integer`):

Maximum number of messages to backfill per channel. Leave empty for no cap.

## `after` (type: `string`):

Only messages newer than this. Accepts a Discord snowflake id or an ISO 8601 timestamp (example: 2026-06-11T18:45:00Z).

## `before` (type: `string`):

Only messages older than this. Accepts a Discord snowflake id or an ISO 8601 timestamp (example: 2026-06-11T18:45:00Z).

## `requestCooldown` (type: `integer`):

Minimum delay between Discord requests. Higher is gentler on rate limits. Default is 3 seconds.
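
For readers wondering what a request cooldown amounts to, the following is a minimal sketch of the general technique: enforce a minimum gap between consecutive requests. It is not the Actor's code; the `Cooldown` class is hypothetical, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class Cooldown:
    """Enforce a minimum gap, in seconds, between consecutive calls to wait()."""

    def __init__(
        self,
        seconds: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if seconds < 0:
            raise ValueError("seconds must be non-negative")
        self._seconds = seconds
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time the previous request was released

    def wait(self) -> float:
        """Block until the cooldown has elapsed; return how long was slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._seconds - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `wait()` immediately before each request keeps requests at least `seconds` apart, while time already spent processing the previous response counts toward the gap.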

## Actor input object example

```json
{
  "servers": [
    {
      "id": "000000000000000000",
      "name": "Example Server",
      "catchup": true,
      "backfill": true,
      "channels": []
    }
  ],
  "includeAllDms": false,
  "requestCooldown": 3
}
```

# Actor output Schema

## `messages` (type: `string`):

Crawled messages from the configured servers, channels, and DMs.
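
Once the dataset items are downloaded, they can be processed as plain dictionaries. The sketch below assumes each item mirrors Discord's message object and therefore carries `id` and `channel_id` fields; that follows from the README's statement that messages are pushed as-is, but it is worth verifying against your own dataset. The helper names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def messages_per_channel(items: Iterable[Dict[str, Any]]) -> Counter:
    """Count messages per channel, assuming Discord's `channel_id` field is present."""
    return Counter(item.get("channel_id", "<unknown>") for item in items)


def newest_first(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort messages by snowflake ID, which orders them by creation time."""
    return sorted(items, key=lambda message: int(message["id"]), reverse=True)


if __name__ == "__main__":
    sample = [
        {"id": "3", "channel_id": "100", "content": "third"},
        {"id": "1", "channel_id": "100", "content": "first"},
        {"id": "2", "channel_id": "200", "content": "second"},
    ]
    print(messages_per_channel(sample))
    print([message["content"] for message in newest_first(sample)])
```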

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
// The required Discord `token` field is omitted from generated samples; add it before running.
const input = {
    "servers": [
        {
            "id": "000000000000000000",
            "name": "Example Server",
            "catchup": true,
            "backfill": true,
            "channels": []
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("lighthouse_keeper/discrawl").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
# The required Discord "token" field is omitted from generated samples; add it before running.
run_input = {
    "servers": [
        {
            "id": "000000000000000000",
            "name": "Example Server",
            "catchup": True,
            "backfill": True,
            "channels": [],
        }
    ]
}

# Run the Actor and wait for it to finish
run = client.actor("lighthouse_keeper/discrawl").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "servers": [
    {
      "id": "000000000000000000",
      "name": "Example Server",
      "catchup": true,
      "backfill": true,
      "channels": []
    }
  ]
}' |
apify call lighthouse_keeper/discrawl --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=lighthouse_keeper/discrawl",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Discord Message Crawler",
        "description": "Scrape entire Discord servers, specific channels, or DMs with full metadata: message content, embeds, timestamps, reactions, mentions, polls, and more. If it's there, you will have it.",
        "version": "1.0",
        "x-build-id": "MlPP5vi0fdkf3bd9O"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/lighthouse_keeper~discrawl/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-lighthouse_keeper-discrawl",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/lighthouse_keeper~discrawl/runs": {
            "post": {
                "operationId": "runs-sync-lighthouse_keeper-discrawl",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/lighthouse_keeper~discrawl/run-sync": {
            "post": {
                "operationId": "run-sync-lighthouse_keeper-discrawl",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "token"
                ],
                "properties": {
                    "token": {
                        "title": "Discord token",
                        "type": "string",
                        "description": "Discord user/bot token. The account must already be a member of the servers you list below."
                    },
                    "servers": {
                        "title": "Servers / channels",
                        "type": "array",
                        "description": "List of servers to back up. Each entry: { \"id\": <guild id>, \"name\"?: <label>, \"catchup\"?: true, \"backfill\"?: true, \"channels\"?: [...] }. Each channels entry is a channel id, or { \"id\": <channel id>, \"name\"?: <label> }. If \"channels\" is omitted or empty, every accessible text channel in the guild is crawled automatically."
                    },
                    "dmChannels": {
                        "title": "DM / group-DM channels",
                        "type": "array",
                        "description": "Direct-message or group channels to back up, by channel id. Each item is a channel id string, or { \"id\": <channel id>, \"name\"?: <label> }. Only available for user tokens."
                    },
                    "includeAllDms": {
                        "title": "Include all DMs",
                        "type": "boolean",
                        "description": "Enumerate and back up every open DM and group-DM on the account. Only available for user tokens.",
                        "default": false
                    },
                    "backfillLimit": {
                        "title": "Backfill limit (per channel)",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Backfill limit - how many messages back you want to crawl. Leave empty for no cap."
                    },
                    "after": {
                        "title": "After (lower bound)",
                        "type": "string",
                        "description": "Only messages newer than this. Accepts a Discord snowflake id or an ISO 8601 timestamp. (example: 2026-06-11T18:45:00Z)"
                    },
                    "before": {
                        "title": "Before (upper bound)",
                        "type": "string",
                        "description": "Only messages older than this. Accepts a Discord snowflake id or an ISO 8601 timestamp (example: 2026-06-11T18:45:00Z)."
                    },
                    "requestCooldown": {
                        "title": "Request cooldown (seconds)",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Minimum delay between Discord requests. Higher is gentler on rate limits. Default is 3 seconds.",
                        "default": 3
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
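
If you prefer to call the REST API directly rather than through a client library, the specification above translates into a plain HTTP request. The sketch below only builds the request for the `run-sync-get-dataset-items` path using the standard library; `build_sync_request` is a hypothetical helper, and sending the request is left to the caller.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "lighthouse_keeper~discrawl"  # "~" replaces "/" in API paths, as in the spec


def build_sync_request(apify_token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": apify_token})  # Apify token as query parameter
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")  # Actor input, including the Discord token
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Pass the result to `urllib.request.urlopen()` to execute it. Note that two different tokens are involved: the Apify token in the query string and the Discord `token` inside the JSON body. For long crawls, starting the run through the `/runs` path and reading the dataset afterwards may be a better fit than waiting on a synchronous call.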
