# Website Contact & Email Scraper (`foo121/website-contact-scraper`) Actor

Extract emails, phone numbers and social links from any list of websites — with contact-page crawling and email validation. Bulk B2B lead enrichment, pay per result.

- **URL**: https://apify.com/foo121/website-contact-scraper.md
- **Developed by:** [ziv shay](https://apify.com/foo121) (community)
- **Categories:** Lead generation, Marketing
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$3.00 / 1,000 result items

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
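As a rough illustration of the arithmetic, the sketch below estimates the charge for a run from the listed price of $3.00 per 1,000 result items. It assumes one result item per input website (as the README describes) and that no other billable events apply; `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Listed price from the Pricing section: $3.00 per 1,000 result items.
PRICE_PER_1000_ITEMS = Decimal("3.00")


def estimate_cost_usd(result_items: int) -> Decimal:
    """Estimate the charge in USD for a number of result items (hypothetical helper)."""
    if result_items < 0:
        raise ValueError("result_items must be non-negative")
    cost = PRICE_PER_1000_ITEMS * result_items / 1000
    # Round to a tenth of a cent for display purposes.
    return cost.quantize(Decimal("0.001"))


print(estimate_cost_usd(250))   # 0.750
print(estimate_cost_usd(1000))  # 3.000
```

Treat the result as an estimate only; the actual charge is determined by the events the platform records for the run.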

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Website Contact & Email Scraper

Extract **emails, phone numbers and social links** from any list of websites — with
contact-page crawling, a **JS-render fallback** for SPA sites, and **email
validation**. Built for bulk B2B lead enrichment. Pay per result.

### What it does

Give it a list of websites (plain URLs, bare domains, or objects with a
`url`/`website` field). For each site it:

1. Fetches the homepage.
2. Optionally crawls common contact pages (`/contact`, `/contact-us`, `/about`,
   `/about-us`).
3. Extracts every email, phone number and social profile — **LinkedIn, Facebook,
   Twitter/X, Instagram, YouTube, TikTok, Telegram, WhatsApp, GitHub and
   Pinterest** — from `mailto:`/`tel:` links, visible text, **and inline
   JSON/JSON-LD** (many sites only ship their socials inside a Next.js data blob,
   which an anchor-only scraper misses).
4. **Validates** each email (syntax), tags role (`info@`, `sales@`) and
   `no-reply` addresses, and can drop them on request.
5. **JS-render fallback**: when a homepage is an empty single-page-app shell, it
   re-fetches with a full browser profile to recover contacts the plain crawler
   would miss.

One row is pushed per website, so the output maps directly to pay-per-result billing and can be exported to CSV, JSON, or Excel.
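The extraction in step 3 can be sketched with the Python standard library alone. This is a minimal illustration, not the Actor's actual implementation: the `ContactParser` class, the `extract_contacts` helper, and the email regex are all assumptions made for the example. It reads `mailto:`/`tel:` links, scans text nodes, and walks JSON-LD blobs for emails and URLs.

```python
import json
import re
from html.parser import HTMLParser

# Deliberately simple email pattern (an assumption, not the Actor's regex).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContactParser(HTMLParser):
    """Collects emails, phones and URLs from anchors, text and JSON-LD."""

    def __init__(self):
        super().__init__()
        self.emails, self.phones, self.links = set(), set(), set()
        self._in_jsonld = False
        self._jsonld_chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = (attrs.get("href") or "").strip()
            if href.lower().startswith("mailto:"):
                # Drop any "?subject=..." query part of the mailto link.
                self.emails.add(href[7:].split("?")[0].strip().lower())
            elif href.lower().startswith("tel:"):
                self.phones.add(href[4:].strip())
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._jsonld_chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._jsonld_chunks.append(data)
        else:
            # Any other text node, including the bodies of other inline scripts.
            self.emails.update(m.lower() for m in EMAIL_RE.findall(data))

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self._walk(json.loads("".join(self._jsonld_chunks)))
            except json.JSONDecodeError:
                pass  # Malformed JSON-LD is skipped rather than failing the page.

    def _walk(self, node):
        # Recursively visit every string inside the parsed JSON structure.
        if isinstance(node, dict):
            for value in node.values():
                self._walk(value)
        elif isinstance(node, list):
            for value in node:
                self._walk(value)
        elif isinstance(node, str):
            self.emails.update(m.lower() for m in EMAIL_RE.findall(node))
            if node.startswith(("http://", "https://")):
                self.links.add(node)


def extract_contacts(html: str) -> dict:
    parser = ContactParser()
    parser.feed(html)
    parser.close()
    return {
        "emails": sorted(parser.emails),
        "phones": sorted(parser.phones),
        "links": sorted(parser.links),
    }
```

The collected `links` would still need to be classified per platform; this sketch only shows how a profile URL that never appears in an anchor tag can be recovered from a JSON-LD blob.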

### Input

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `startUrls` | array (required) | — | Websites to scrape. URLs, bare domains, or `{ url }` objects. |
| `crawlContactPages` | boolean | `true` | Also fetch contact/about pages. |
| `contactPaths` | array | `/contact`, `/contact-us`, `/about`, `/about-us` | Relative paths probed on each site when `crawlContactPages` is enabled. |
| `dropRoleEmails` | boolean | `false` | Exclude role + no-reply addresses, keep personal only. |
| `renderJs` | boolean | `true` | Browser-profile retry for JS-rendered/SPA homepages. |
| `maxConcurrency` | integer | `10` | Sites crawled in parallel. |
| `requestTimeoutSecs` | integer | `20` | Per-page fetch timeout. |
| `proxyConfiguration` | object | Apify Proxy | Outbound proxy. |

### Output (one row per site)

```json
{
  "website": "https://acme.com",
  "domain": "acme.com",
  "emails": ["jane@acme.com", "info@acme.com"],
  "emailDetails": [
    { "email": "jane@acme.com", "type": "personal", "valid": true },
    { "email": "info@acme.com", "type": "role", "valid": true }
  ],
  "phones": ["+15125550123"],
  "linkedin": "https://linkedin.com/company/acme",
  "facebook": null, "twitter": "https://x.com/acme", "instagram": null,
  "youtube": "https://www.youtube.com/acme", "tiktok": null,
  "telegram": null, "whatsapp": null, "github": null, "pinterest": null,
  "hasEmail": true, "hasPhone": true,
  "emailCount": 2, "phoneCount": 1,
  "pagesScanned": 3, "jsFallbackUsed": false,
  "status": "scraped", "error": null,
  "scrapedAt": "2026-06-20T00:00:00.000Z"
}
````

`status` is one of `scraped`, `no_contacts_found`, `unreachable`, `error` — so a
single bad site never breaks the run.
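Because every row carries a `status`, downstream code can separate usable rows from failures without special-casing. The sketch below, using only field names from the sample output above, counts rows per status and collects valid personal emails; `summarize_rows` is a hypothetical helper.

```python
from collections import Counter


def summarize_rows(rows: list[dict]) -> tuple[dict, list[str]]:
    """Return (rows per status, sorted valid personal emails from scraped rows)."""
    status_counts = Counter(row.get("status") for row in rows)
    personal_emails = sorted({
        detail["email"]
        for row in rows
        if row.get("status") == "scraped"
        for detail in row.get("emailDetails") or []
        if detail.get("type") == "personal" and detail.get("valid")
    })
    return dict(status_counts), personal_emails
```

The `rows` argument would typically be the dataset items returned by one of the API examples in this document.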

### Notes on accuracy

Social detection is **host-anchored**: it parses each link's host rather than relying on
naive substring matching, so a URL like `measure-ux.com` is never mis-tagged as Twitter/X.
Simple `.includes("x.com")` scrapers produce exactly this class of false positive.
The biggest entrenched incumbent in this category is very strong (15+ platforms,
optional paid email verification); this Actor is the cheaper, no-add-on option with
solid 10-platform coverage, the inline JSON/JSON-LD extraction described above, and
per-email role/no-reply typing for clean segmentation.

# Actor input Schema

## `startUrls` (type: `array`):

List of websites to scrape for contact details. You can paste plain URLs or bare domains (e.g. "https://acme.com" or "acme.com"), or objects that contain a `url`/`website` field (e.g. the output of another scraper). Each entry produces ONE row in the dataset.

## `crawlContactPages` (type: `boolean`):

Also fetch common contact pages (/contact, /contact-us, /about, /about-us) in addition to the homepage to find more emails and phones.

## `contactPaths` (type: `array`):

Relative paths probed on each site (in addition to the homepage) when contact-page crawling is enabled.
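One way to picture how `contactPaths` combines with the homepage is to resolve each path against the site's base URL. The sketch below does this with `urllib.parse.urljoin`; `candidate_urls` is a hypothetical helper, and the Actor may order or deduplicate pages differently.

```python
from urllib.parse import urljoin


def candidate_urls(homepage: str, contact_paths: list[str],
                   crawl_contact_pages: bool = True) -> list[str]:
    """Return the homepage followed by each contact path resolved against it."""
    urls = [homepage]
    if crawl_contact_pages:
        for path in contact_paths:
            url = urljoin(homepage, path)
            if url not in urls:  # Skip duplicates while keeping order.
                urls.append(url)
    return urls


print(candidate_urls("https://acme.com", ["/contact", "/about"]))
# ['https://acme.com', 'https://acme.com/contact', 'https://acme.com/about']
```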

## `dropRoleEmails` (type: `boolean`):

If enabled, role-based addresses (info@, sales@, support@…) and no-reply addresses are excluded, keeping only personal-looking emails. Default keeps everything and flags the type.
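Role tagging of this kind usually comes down to inspecting the local part of the address. The sketch below is an assumption-laden illustration: the `ROLE_PREFIXES` set, both regular expressions, and the `"no-reply"` type label are hypothetical (the sample output only shows `personal` and `role`).

```python
import re

# Hypothetical role prefixes; the Actor's real list is not documented here.
ROLE_PREFIXES = {"info", "sales", "support", "contact", "admin", "hello", "office"}
NO_REPLY_RE = re.compile(r"^(no|do-?not)[-_.]?reply", re.IGNORECASE)
SYNTAX_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def describe_email(email: str) -> dict:
    """Syntax-check an address and tag it as personal, role or no-reply."""
    email = email.strip().lower()
    local = email.split("@", 1)[0]
    if NO_REPLY_RE.match(local):
        kind = "no-reply"  # Assumed label, not confirmed by the sample output.
    elif local in ROLE_PREFIXES:
        kind = "role"
    else:
        kind = "personal"
    return {"email": email, "type": kind, "valid": bool(SYNTAX_RE.match(email))}


def filter_emails(emails: list[str], drop_role_emails: bool = False) -> list[dict]:
    """Describe every email; optionally keep only personal-looking ones."""
    details = [describe_email(e) for e in emails]
    if drop_role_emails:
        details = [d for d in details if d["type"] == "personal"]
    return details
```

With `drop_role_emails=False` every address is kept and merely labeled, which mirrors the default described above.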

## `renderJs` (type: `boolean`):

If a static fetch returns an empty/app-shell page (JS-rendered SPA) with no contacts, retry with a browser-like fetch profile to recover emails the plain crawler would miss.
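How the Actor decides that a page is an app shell is not documented. Purely as an illustration, the heuristic below treats a page as a shell when it loads scripts but has very little visible text; the 200-character threshold, the `_TextCounter` class, and both function names are assumptions.

```python
from html.parser import HTMLParser

NON_VISIBLE_TAGS = ("script", "style", "noscript")


class _TextCounter(HTMLParser):
    """Counts visible text characters and <script> tags in a page."""

    def __init__(self):
        super().__init__()
        self.visible_chars = 0
        self.script_tags = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in NON_VISIBLE_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in NON_VISIBLE_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def looks_like_app_shell(html: str, min_visible_chars: int = 200) -> bool:
    """Heuristic: scripts are present but almost no text is rendered."""
    counter = _TextCounter()
    counter.feed(html)
    counter.close()
    return counter.script_tags > 0 and counter.visible_chars < min_visible_chars


def should_retry(html: str, contacts_found: int, render_js: bool = True) -> bool:
    """Retry only when enabled, nothing was found, and the page looks empty."""
    return render_js and contacts_found == 0 and looks_like_app_shell(html)
```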

## `maxConcurrency` (type: `integer`):

How many sites to crawl in parallel.

## `requestTimeoutSecs` (type: `integer`):

Abort a single page fetch after this many seconds. A slow or dead site never stalls the run.
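The interaction between `maxConcurrency` and `requestTimeoutSecs` can be sketched with `asyncio`: a semaphore caps parallel sites, and a per-fetch timeout turns a stalled site into a result row instead of a hung run. The `crawl_all` function and the caller-supplied `fetch` coroutine are hypothetical, and mapping a timeout to the `unreachable` status is an assumption.

```python
import asyncio


async def crawl_all(sites, fetch, max_concurrency=10, request_timeout_secs=20):
    """Crawl sites in parallel with a concurrency cap and per-fetch timeout."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def crawl_one(site):
        async with semaphore:  # At most max_concurrency fetches run at once.
            try:
                await asyncio.wait_for(fetch(site), timeout=request_timeout_secs)
                return {"website": site, "status": "scraped", "error": None}
            except asyncio.TimeoutError:
                # Assumption: a timed-out fetch is reported as "unreachable".
                return {"website": site, "status": "unreachable", "error": "timeout"}
            except Exception as exc:
                return {"website": site, "status": "error", "error": str(exc)}

    # gather() returns results in the same order as the input sites.
    return await asyncio.gather(*(crawl_one(site) for site in sites))
```

Because each failure is converted into a row, one slow or broken site does not prevent the remaining sites from completing.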

## `proxyConfiguration` (type: `object`):

Proxy used for outbound requests. Apify Residential or Datacenter proxy is recommended for scale and to avoid rate limits.

## Actor input object example

```json
{
  "startUrls": [
    "https://www.apify.com",
    "example.com",
    {
      "url": "https://example.org"
    }
  ],
  "crawlContactPages": true,
  "contactPaths": [
    "/contact",
    "/contact-us",
    "/about",
    "/about-us"
  ],
  "dropRoleEmails": false,
  "renderJs": true,
  "maxConcurrency": 10,
  "requestTimeoutSecs": 20,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```
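Before starting a run, it can be useful to check an input object against the types and bounds stated in the input schema (`maxConcurrency` 1 to 100, `requestTimeoutSecs` 5 to 120). The sketch below is a hypothetical client-side check; `validate_input` is not part of the Apify client, and the platform performs its own validation.

```python
# Integer bounds taken from the OpenAPI input schema in this document.
BOUNDS = {"maxConcurrency": (1, 100), "requestTimeoutSecs": (5, 120)}
BOOLEAN_FIELDS = ("crawlContactPages", "dropRoleEmails", "renderJs")


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not isinstance(actor_input.get("startUrls"), list):
        problems.append("startUrls is required and must be an array")
    for field, (low, high) in BOUNDS.items():
        if field not in actor_input:
            continue  # Optional field: the schema default applies.
        value = actor_input[field]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{field} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{field} must be between {low} and {high}")
    for field in BOOLEAN_FIELDS:
        if field in actor_input and not isinstance(actor_input[field], bool):
            problems.append(f"{field} must be a boolean")
    return problems
```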

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://www.apify.com",
        "example.com",
        {
            "url": "https://example.org"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("foo121/website-contact-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [
        "https://www.apify.com",
        "example.com",
        { "url": "https://example.org" },
    ] }

# Run the Actor and wait for it to finish
run = client.actor("foo121/website-contact-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://www.apify.com",
    "example.com",
    {
      "url": "https://example.org"
    }
  ]
}' |
apify call foo121/website-contact-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=foo121/website-contact-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website Contact & Email Scraper",
        "description": "Extract emails, phone numbers and social links from any list of websites — with contact-page crawling and email validation. Bulk B2B lead enrichment, pay per result.",
        "version": "1.0",
        "x-build-id": "rL6tNA0td7DjIXtAF"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/foo121~website-contact-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-foo121-website-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/foo121~website-contact-scraper/runs": {
            "post": {
                "operationId": "runs-sync-foo121-website-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/foo121~website-contact-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-foo121-website-contact-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Websites to scrape",
                        "type": "array",
                        "description": "List of websites to scrape for contact details. You can paste plain URLs or bare domains (e.g. \"https://acme.com\" or \"acme.com\"), or objects that contain a `url`/`website` field (e.g. the output of another scraper). Each entry produces ONE row in the dataset."
                    },
                    "crawlContactPages": {
                        "title": "Crawl contact pages",
                        "type": "boolean",
                        "description": "Also fetch common contact pages (/contact, /contact-us, /about, /about-us) in addition to the homepage to find more emails and phones.",
                        "default": true
                    },
                    "contactPaths": {
                        "title": "Extra contact paths to crawl",
                        "type": "array",
                        "description": "Relative paths probed on each site (in addition to the homepage) when contact-page crawling is enabled.",
                        "default": [
                            "/contact",
                            "/contact-us",
                            "/about",
                            "/about-us"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "dropRoleEmails": {
                        "title": "Drop role / no-reply emails",
                        "type": "boolean",
                        "description": "If enabled, role-based addresses (info@, sales@, support@…) and no-reply addresses are excluded, keeping only personal-looking emails. Default keeps everything and flags the type.",
                        "default": false
                    },
                    "renderJs": {
                        "title": "JS-render fallback",
                        "type": "boolean",
                        "description": "If a static fetch returns an empty/app-shell page (JS-rendered SPA) with no contacts, retry with a browser-like fetch profile to recover emails the plain crawler would miss.",
                        "default": true
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "How many sites to crawl in parallel.",
                        "default": 10
                    },
                    "requestTimeoutSecs": {
                        "title": "Per-request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Abort a single page fetch after this many seconds. A slow or dead site never stalls the run.",
                        "default": 20
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy used for outbound requests. Apify Residential or Datacenter proxy is recommended for scale and to avoid rate limits.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
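For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library; `build_request` and `run_sync` are hypothetical helpers, and the code assumes the response body is JSON, which the specification itself does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/foo121~website-contact-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300):
    """Send the request and decode the body (assumed to be JSON)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"startUrls": ["example.com"]})` blocks until the run finishes, so the `timeout` should be chosen with the size of the input list in mind.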
