# Email Extractor (`gordian/email-extractor`) Actor

Find and extract email addresses from any website in seconds. This Actor crawls entire websites and returns all emails after validation. Easy to use and extremely fast.

- **URL**: https://apify.com/gordian/email-extractor.md
- **Developed by:** [Gordian](https://apify.com/gordian) (community)
- **Categories:** Lead generation, Automation, Developer tools
- **Stats:** 542 total users, 66 monthly users, 90.2% runs succeeded, 5 bookmarks
- **User rating**: 2.05 out of 5 stars

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

Email Extractor is an Apify Actor that finds and collects email addresses from web pages you provide, with an option to follow links and discover more emails.

### 🎯 Why extract emails?

Use cases include:

- Lead generation and outreach lists
- Finding contact emails for support, press, or careers pages
- Verifying public contact details across a site
- Compliance and due‑diligence checks

### ✨ What can Email Extractor do?

This Actor:

- Crawls your provided URLs and optionally follows links discovered on those pages
- Extracts emails from full page HTML (not just visible text)
- Validates the email domain TLD against the official IANA list for better accuracy
- Deduplicates results across the whole run
- Records both the page where an email was found and the original source URL you submitted
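The extraction, validation, and deduplication steps listed above can be pictured with a minimal sketch. This is not the Actor's actual implementation: the regular expression, the `SAMPLE_TLDS` set (a tiny stand-in for the full IANA list), and the `extract_emails` helper are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for the IANA TLD list; a real check would load the
# full list published by IANA instead of this small sample.
SAMPLE_TLDS = {"com", "org", "net", "io", "dev"}

# Deliberately simple pattern; it does not cover every syntactically valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(html: str, page_url: str, source_url: str, seen: set) -> list:
    """Return dataset-style items for emails in raw HTML that are not in `seen` yet."""
    items = []
    for match in EMAIL_RE.finditer(html):
        email = match.group(0).lower()
        tld = email.rsplit(".", 1)[-1]
        if tld not in SAMPLE_TLDS:
            continue  # e.g. "logo@2x.png" is dropped because "png" is not a TLD
        if email in seen:
            continue  # deduplicate across the whole run
        seen.add(email)
        items.append({"email": email, "url": page_url, "sourceUrl": source_url})
    return items
```

Passing the same `seen` set to every call is what makes deduplication work across pages rather than per page.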

### 🚀 How to run

Run on Apify Console:

1. Create a free Apify account: https://console.apify.com/sign-up
2. Click “Try for free” on this Actor
3. Fill in the input (see example below)
4. Click Start and wait for the run to finish
5. Download results (JSON, CSV, Excel) from the Dataset tab

### 💡 Output data

Each dataset item contains:

| Field | Description |
|-------|-------------|
| email | The extracted email address |
| url | Page URL where the email was found |
| sourceUrl | One of your input URLs from which this crawl originated |

#### Output example

```json
{
    "email": "hello@apify.com",
    "url": "https://apify.com/resources/nonprofits",
    "sourceUrl": "https://apify.com"
}
````
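Because every item carries both `url` and `sourceUrl`, results from a multi-URL run can be split back out per input URL. The sketch below assumes items shaped exactly like the example above; the `EmailItem` type and the `emails_by_source` helper are hypothetical names, not part of the Actor.

```python
from collections import defaultdict
from typing import TypedDict


class EmailItem(TypedDict):
    """One dataset item, using the three fields from the output table."""
    email: str
    url: str
    sourceUrl: str


def emails_by_source(items: list) -> dict:
    """Group email addresses by the input URL their crawl started from."""
    grouped = defaultdict(set)
    for item in items:
        grouped[item["sourceUrl"]].add(item["email"])
    # Sort each group so the result is stable between runs.
    return {source: sorted(emails) for source, emails in grouped.items()}
```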

### 📥 Input

The Actor accepts these input parameters:

- `urls` (array, required): One or more page URLs to scan
- `crawl` (boolean, optional, default: true): Follow links discovered on the provided pages to find more emails
- `email` (string, optional): Your email address

#### Input example

```json
{
    "urls": [
        "https://apify.com"
    ],
    "crawl": true
}
```

### 💰 How much does it cost to extract emails?

This Actor is extremely cost-effective. Check the "Pricing" tab for more details.

With Apify's free tier, you get $5 of platform credits every month, which you can use to test this Actor.

Do you need to scrape more? [Upgrade to a paid plan](https://apify.com/pricing?fpr=7p4wu) which includes more platform credits and discounted pricing.

Tip 1: Provide multiple URLs in a single input so that you pay the Actor start cost only once.

Tip 2: For a large run, consider increasing the RAM allocated to the run to scrape faster. For a small run, decrease the RAM to reduce the Actor start cost.

Tip 3: Upgrade to a higher plan to get discounted pricing. Link: [https://apify.com/pricing](https://apify.com/pricing?fpr=7p4wu)
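Tips 1 and 2 can be combined in one request: merge all URLs into a single input and set the memory for that run. The sketch below only builds the HTTP request with the standard library and does not send it. The endpoint path comes from the OpenAPI specification further down; the `memory` query parameter (in megabytes) is my reading of the Apify API and should be checked against the API reference, and `build_batch_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/gordian~email-extractor/run-sync-get-dataset-items"


def build_batch_request(urls, token, memory_mbytes=None, crawl=True) -> Request:
    """Build one POST request that scans all URLs in a single Actor run."""
    unique_urls = list(dict.fromkeys(urls))  # drop duplicates, keep order
    query = {"token": token}
    if memory_mbytes is not None:
        # Assumption: the run endpoints accept a `memory` parameter in MB.
        query["memory"] = memory_mbytes
    body = json.dumps({"urls": unique_urls, "crawl": crawl}).encode("utf-8")
    return Request(
        f"{API_BASE}{ACTOR_PATH}?{urlencode(query)}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually start the run, the returned request could be passed to `urllib.request.urlopen`, with `token` set to a real Apify API token.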

### 🔗 Integrations

This Actor integrates seamlessly with:

- **Automation platforms** - Build no-code workflows with [Make.com](https://www.make.com/en/register?pc=louisdeconinck), n8n, and Zapier
- **Webhooks** - Trigger actions when scraping completes through [webhooks](https://docs.apify.com/platform/integrations/webhooks?fpr=7p4wu)
- **Schedulers** - Run daily/weekly to pick up newly published email addresses with Apify's [Scheduler](https://docs.apify.com/schedules?fpr=7p4wu)
- **API** - Start runs and access data programmatically with the [Apify API](https://docs.apify.com/api/v2#/reference/actors/run-collection/run-actor?fpr=7p4wu)
- **Google Sheets** - Export directly to spreadsheets
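For the webhook integration, a receiving service typically needs to find out which dataset holds the results of the finished run. The sketch below assumes Apify's default webhook payload, with an `eventType` field and a `resource` object containing the run; if you configure a custom payload template, the field names will differ. `dataset_id_from_webhook` is a hypothetical helper.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the run's default dataset ID for a succeeded run, else None."""
    # Assumption: the default payload template is used, where `eventType`
    # names the event and `resource` holds the Actor run object.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```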

### 👥 Who made this Actor?

Gordian is a specialised Apify web scraping agency founded by Louis Deconinck.

Louis is a top 1% Apify developer, Oxford University IT graduate, and creator of 70+ scrapers used by 1,000+ data professionals every month. He has scraped 10,000,000+ pages bypassing the most advanced anti-scraping protections.

- Apify AI Agent Hackathon Winner
- 300+ contributions in Apify Discord
- Former senior data engineer in EU banking

Looking for a custom data solution? Get in touch.

### ❓ FAQ

#### Do you validate emails?

We validate the top‑level domain against the official IANA list, deduplicate results and verify correct syntax.

#### Is it legal to scrape emails?

Yes, web scraping publicly available data is legal. This scraper only extracts information that is publicly visible.

For more information on web scraping legality, read this blog post: [Is web scraping legal?](https://blog.apify.com/is-web-scraping-legal?fpr=7p4wu)

#### Can I export data to CSV or Excel?

Yes, Apify supports exporting dataset results in multiple formats: JSON, CSV, Excel (XLSX), HTML, XML and RSS.
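The same formats can be requested programmatically. As a small sketch, the helper below builds a download URL for the dataset items endpoint of the Apify API using its `format` query parameter. The `dataset_export_url` name is hypothetical, and a private dataset would additionally need an API token, which is left out here.

```python
from urllib.parse import urlencode

# Formats named in the answer above, as accepted by the `format` parameter.
EXPORT_FORMATS = {"json", "csv", "xlsx", "html", "xml", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the URL that returns a dataset's items in the given format."""
    fmt = fmt.lower()
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"
```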

#### How do I get started?

[Make a free Apify account](https://console.apify.com/sign-up?fpr=7p4wu) to claim your free $5 usage and start scraping today by clicking "Try for free".

# Actor input Schema

## `urls` (type: `array`):

List of URLs to start crawling from.

## `crawl` (type: `boolean`):

If true, the actor will enqueue links found on the pages.

## `email` (type: `string`):

Your email address

## Actor input object example

```json
{
  "urls": [
    "https://apify.com"
  ],
  "crawl": true
}
```
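Checking the input locally before starting a run avoids paying for a run that fails on invalid input. The sketch below mirrors the constraints stated in this document: `urls` is required with at least one string, `crawl` is a boolean defaulting to `true`, and `email` must match the pattern given in the OpenAPI specification. `build_input` is a hypothetical helper, not part of any Apify client.

```python
import re

# Pattern copied from the `email` property in the OpenAPI specification.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def build_input(urls, crawl=True, email=None) -> dict:
    """Validate the arguments and return an Actor input object."""
    if not urls:
        raise ValueError("urls must contain at least one URL")
    if not all(isinstance(url, str) for url in urls):
        raise TypeError("every entry in urls must be a string")
    if not isinstance(crawl, bool):
        raise TypeError("crawl must be a boolean")
    run_input = {"urls": list(urls), "crawl": crawl}
    if email is not None:
        if not EMAIL_PATTERN.fullmatch(email):
            raise ValueError("email does not match the schema pattern")
        run_input["email"] = email
    return run_input
```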

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "urls": [
        "https://apify.com"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("gordian/email-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "urls": ["https://apify.com"] }

# Run the Actor and wait for it to finish
run = client.actor("gordian/email-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "urls": [
    "https://apify.com"
  ]
}' |
apify call gordian/email-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=gordian/email-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Email Extractor",
        "description": "Find and extract email addresses from any website in seconds. This actor will crawl entire websites and return all emails after validation. Easy to use and extremely fast.",
        "version": "0.0",
        "x-build-id": "ftbifh1tzpqce9all"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/gordian~email-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-gordian-email-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/gordian~email-extractor/runs": {
            "post": {
                "operationId": "runs-sync-gordian-email-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/gordian~email-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-gordian-email-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "urls"
                ],
                "properties": {
                    "urls": {
                        "title": "Start URLs",
                        "minItems": 1,
                        "type": "array",
                        "description": "List of URLs to start crawling from.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "crawl": {
                        "title": "Crawl",
                        "type": "boolean",
                        "description": "If true, the actor will enqueue links found on the pages.",
                        "default": true
                    },
                    "email": {
                        "title": "Email",
                        "pattern": "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$",
                        "type": "string",
                        "description": "Your email address"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
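When calling the `/runs` endpoint directly, the response follows `runsResponseSchema` above, and most callers only need a few of its fields. A minimal sketch of pulling them out of the already-decoded JSON body follows; the `RunSummary` class and `parse_run_response` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """The subset of run fields needed to follow up on a started run."""
    id: str
    status: str
    default_dataset_id: str
    usage_total_usd: float


def parse_run_response(body: dict) -> RunSummary:
    """Extract key fields from a decoded `/runs` response body."""
    data = body["data"]  # the schema nests the run object under "data"
    return RunSummary(
        id=data["id"],
        status=data["status"],
        default_dataset_id=data["defaultDatasetId"],
        usage_total_usd=float(data.get("usageTotalUsd", 0.0)),
    )
```

The `default_dataset_id` value is what you would then use to fetch the extracted emails once the run has finished.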
