# Website Contact Data Extractor (`techionik9993/website-contact-data-extractor`) Actor

Extract public business contact data from websites, including validated emails, phone numbers, contact/about pages, and social profiles. Delivers clean, deduplicated JSON output for CRM enrichment, lead generation, prospecting, research, and automation workflows.

- **URL**: https://apify.com/techionik9993/website-contact-data-extractor.md
- **Developed by:** [Techionik](https://apify.com/techionik9993) (community)
- **Categories:** Lead generation, SEO tools, Automation
- **Stats:** 14 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 website results

This Actor uses pay-per-event pricing plus usage: you are charged a fixed price for specific events and, in addition, for the Apify platform usage the run consumes.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
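As a rough illustration of the per-result part of the bill, the arithmetic can be sketched as below. The function name is hypothetical, the default rate is the listed "from" price, and platform usage is charged on top, so the figure is a lower bound rather than a quote.

```python
def estimate_event_cost_usd(website_results: int, price_per_1000_usd: float = 3.00) -> float:
    """Lower-bound estimate of per-result charges; platform usage is billed separately."""
    if website_results < 0:
        raise ValueError("website_results must be non-negative")
    return round(website_results * price_per_1000_usd / 1000, 2)


# 2,500 processed websites at $3.00 per 1,000 results
print(estimate_event_cost_usd(2500))  # 7.5
```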

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

Actors can be integrated into projects in most stacks. The recommended approach depends on the language or tooling in use, as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation, available as a [Markdown index](https://docs.apify.com/llms.txt) and as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Website Contact Data Extractor

Extract public contact details and social profiles from company websites in a clean, structured format.

Website Contact Data Extractor is built for lead generation, CRM enrichment, sales research, market research, and automation workflows. Provide one or more business website URLs and the Actor scans a small set of high-signal pages such as home, contact, about, company, team, location, legal, imprint, and impressum pages.

### Features

- Extracts public email addresses from visible text, `mailto:` links, and JSON-LD structured data
- Extracts public phone numbers from `tel:` links, page contact sections, footers, and JSON-LD
- Detects contact and about page URLs
- Extracts social profile links from Facebook, Instagram, LinkedIn, Twitter/X, YouTube, and TikTok
- Crawls only same-website pages to keep results relevant
- Deduplicates and validates emails and phone numbers
- Returns one clean structured result per website
- Includes run summary output for quick quality checks
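
The email sources listed above (visible text, `mailto:` links, and JSON-LD) can be illustrated with a minimal standard-library sketch. This is not the Actor's implementation, which is described as using Crawlee and Cheerio; the class name, function name, and regular expression are assumptions chosen for illustration.

```python
import json
import re
from html.parser import HTMLParser

# Deliberately simple pattern (an assumption, not the Actor's validator).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailCollector(HTMLParser):
    """Collects raw email candidates from mailto: links, text, and JSON-LD."""

    def __init__(self):
        super().__init__()
        self.candidates = []
        self._raw_mode = None  # "jsonld", "skip", or None for normal text
        self._jsonld_parts = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        if tag == "a" and href.lower().startswith("mailto:"):
            # Drop the scheme and any ?subject=... query.
            self.candidates.append(href[len("mailto:"):].split("?")[0])
        elif tag == "script":
            is_jsonld = attributes.get("type") == "application/ld+json"
            self._raw_mode = "jsonld" if is_jsonld else "skip"
            self._jsonld_parts = []
        elif tag == "style":
            self._raw_mode = "skip"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._raw_mode == "jsonld":
                self._read_jsonld("".join(self._jsonld_parts))
            self._raw_mode = None

    def handle_data(self, data):
        if self._raw_mode == "jsonld":
            self._jsonld_parts.append(data)
        elif self._raw_mode is None:
            self.candidates.extend(EMAIL_RE.findall(data))

    def _read_jsonld(self, raw):
        try:
            pending = [json.loads(raw)]
        except json.JSONDecodeError:
            return
        # Walk the JSON-LD tree and pick up every string-valued "email" key.
        while pending:
            node = pending.pop()
            if isinstance(node, dict):
                if isinstance(node.get("email"), str):
                    self.candidates.append(node["email"])
                pending.extend(node.values())
            elif isinstance(node, list):
                pending.extend(node)


def extract_emails(html: str) -> list:
    """Return validated, lowercased emails in order of first appearance."""
    collector = EmailCollector()
    collector.feed(html)
    collector.close()
    unique = {}
    for raw in collector.candidates:
        email = raw.strip().lower()
        if email.startswith("mailto:"):
            email = email[len("mailto:"):]
        if EMAIL_RE.fullmatch(email):
            unique.setdefault(email, None)
    return list(unique)


SAMPLE = """
<a href="mailto:Sales@Example.com?subject=Hi">Email us</a>
<p>Write to info@example.com or sales@example.com.</p>
<script type="application/ld+json">{"@type": "Organization", "email": "info@example.com"}</script>
<script>var pixel = "tracker@ads.example";</script>
"""
print(extract_emails(SAMPLE))  # ['sales@example.com', 'info@example.com']
```

Ordinary scripts are skipped so that addresses embedded in tracking code are not treated as visible text, while JSON-LD scripts are parsed as data.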

### Best For

- Sales prospecting
- CRM enrichment
- Lead list cleanup
- Company research
- Market research
- Agency prospecting
- Directory enrichment
- Automation pipelines with Apify, Make, n8n, Zapier, Google Sheets, Airtable, or custom APIs

### Input

#### `startUrls`

Add one or more public company or business website URLs. URLs without `https://` are accepted and normalized automatically.

The Actor automatically scans the homepage plus selected high-signal pages such as contact, about, company, team, location, legal, imprint, and impressum pages.
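
The README does not spell out the normalization rules, so the following is only a sketch of what accepting URLs without `https://` might look like. The function names are hypothetical; the behaviour is chosen to agree with the `websiteUrl` and `domain` values in the example output.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_start_url(raw: str) -> str:
    """Add a missing scheme, lowercase the host, and ensure a path."""
    candidate = raw.strip()
    if "://" not in candidate:
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable website URL: {raw!r}")
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, ""))


def domain_of(url: str) -> str:
    """Hostname without a leading 'www.', as in the `domain` output field."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


url = normalize_start_url("www.Example.com")
print(url)             # https://www.example.com/
print(domain_of(url))  # example.com
```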

### Example Input

```json
{
    "startUrls": [
        { "url": "https://www.apify.com" },
        { "url": "https://stripe.com" },
        { "url": "https://www.shopify.com" }
    ]
}
````

### Output

Each dataset item represents one processed website.

```json
{
    "status": "ok",
    "websiteUrl": "https://www.example.com/",
    "domain": "example.com",
    "emails": ["info@example.com", "sales@example.com"],
    "phones": ["+1 800 123 4567"],
    "contactPage": "https://example.com/contact",
    "aboutPage": "https://example.com/about",
    "facebook": "https://facebook.com/example",
    "instagram": "https://instagram.com/example",
    "linkedin": "https://linkedin.com/company/example",
    "twitter": "https://x.com/example",
    "youtube": "https://youtube.com/@example",
    "tiktok": "https://www.tiktok.com/@example"
}
```

#### Status Values

- `ok`: Contact data, social links, or important pages were found.
- `empty`: The website was scanned successfully, but no useful contact data was found.
- `failed`: The website could not be loaded after retries.
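
Downstream code will usually want to branch on `status`. The sketch below groups dataset items by status and builds a follow-up input for failed websites; both helper names are hypothetical, and it assumes that `failed` items still carry a `websiteUrl`, which the README does not state explicitly.

```python
from collections import defaultdict


def group_by_status(items):
    """Bucket dataset items under 'ok', 'empty', 'failed' (or 'unknown')."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("status", "unknown")].append(item)
    return dict(groups)


def retry_input(items):
    """Build a new Actor input containing only the websites that failed."""
    return {
        "startUrls": [
            {"url": item["websiteUrl"]}
            for item in items
            if item.get("status") == "failed" and item.get("websiteUrl")
        ]
    }


items = [
    {"status": "ok", "websiteUrl": "https://www.example.com/", "emails": ["info@example.com"]},
    {"status": "empty", "websiteUrl": "https://example.org/"},
    {"status": "failed", "websiteUrl": "https://example.net/"},
]
print({status: len(group) for status, group in group_by_status(items).items()})
print(retry_input(items))
```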

### Output Fields

| Field         | Description                                   |
| ------------- | --------------------------------------------- |
| `status`      | Processing result: `ok`, `empty`, or `failed` |
| `websiteUrl`  | Input website URL after normalization         |
| `domain`      | Website hostname without `www`                |
| `emails`      | Deduplicated public email addresses           |
| `phones`      | Deduplicated public phone numbers             |
| `contactPage` | Detected contact page URL                     |
| `aboutPage`   | Detected about/company/team page URL          |
| `facebook`    | Facebook page/profile URL                     |
| `instagram`   | Instagram profile URL                         |
| `linkedin`    | LinkedIn company/profile URL                  |
| `twitter`     | Twitter/X profile URL                         |
| `youtube`     | YouTube channel/profile URL                   |
| `tiktok`      | TikTok profile URL                            |

### How It Works

1. The Actor normalizes and validates input URLs.
2. It loads each website with Crawlee and Cheerio for fast, low-cost scraping.
3. It extracts emails, phone numbers, contact pages, about pages, and social links.
4. It parses JSON-LD structured data before scripts are removed.
5. It discovers and scans high-signal same-domain pages.
6. It deduplicates and validates extracted data.
7. It saves one result per website to the default dataset and stores run statistics in `SUMMARY`.
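
To inspect the run statistics mentioned in step 7, something like the following may work with the Python client. It assumes that `SUMMARY` is a record key in the run's default key-value store; the helper name is hypothetical, and the fields inside the summary are not documented here, so none are assumed.

```python
def read_run_summary(client, run):
    """Return the value of the SUMMARY record, or None if it does not exist.

    `client` is expected to be an apify_client.ApifyClient and `run` the dict
    returned by client.actor(...).call(...).
    """
    store_id = run["defaultKeyValueStoreId"]
    record = client.key_value_store(store_id).get_record("SUMMARY")
    return record["value"] if record else None


# Usage, continuing from the Python example in the API section:
#   summary = read_run_summary(client, run)
#   print(summary)
```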

### Notes And Limitations

- Only publicly available website data is extracted.
- Login-only pages, CAPTCHA-protected pages, and contact forms without visible contact details are not supported.
- Heavily JavaScript-rendered websites may return fewer results because this Actor uses fast HTML scraping instead of a browser.
- Some websites intentionally do not publish emails or phone numbers.
- Phone extraction is conservative to avoid collecting dates, IDs, and tracking numbers.
- Always use extracted contact data responsibly and follow applicable privacy, anti-spam, and data protection laws.
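
What "conservative" phone extraction could mean in practice is sketched below. The Actor's actual rules are not published, so every threshold here is an assumption: the 15-digit ceiling follows E.164, while the 7-digit floor, the date pattern, and the separator requirement are illustrative heuristics.

```python
import re

# Matches date-like strings such as 2024-01-15 or 12.05.2023.
DATE_LIKE = re.compile(r"^\d{1,4}[./-]\d{1,2}[./-]\d{1,4}$")


def plausible_phone(candidate: str) -> bool:
    """Heuristic filter that prefers missing a number over keeping a false one."""
    text = candidate.strip()
    if DATE_LIKE.match(text):
        return False
    if re.search(r"[^\d\s().+-]", text):
        return False  # letters or unexpected symbols
    if "+" in text[1:]:
        return False  # a plus sign is only valid as the first character
    digits = re.sub(r"\D", "", text)
    if not 7 <= len(digits) <= 15:
        return False
    # Bare digit runs look like order IDs or tracking numbers, so require
    # either an international prefix or at least one separator.
    return text.startswith("+") or bool(re.search(r"[\s().-]", text))


for sample in ["+1 800 123 4567", "(030) 1234-5678", "2024-01-15", "20250108123"]:
    print(sample, plausible_phone(sample))
```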

### Pricing Recommendation

Recommended marketplace pricing: **paid per result**.

Suggested starting price: **$2.00 to $4.00 per 1,000 website results**, with a small free trial allowance if available. This pricing is easy for lead generation users to understand because they pay per processed website, while the Actor keeps compute cost low by using Cheerio instead of browser automation.

### Search Keywords

website contact extractor, email scraper, phone number scraper, company contact scraper, lead generation, CRM enrichment, business email finder, website social links, LinkedIn company finder, sales prospecting, contact data, public emails, Apify contact scraper

# Actor input Schema

## `startUrls` (type: `array`):

Add one or more company or business website URLs. The Actor will find public emails, phone numbers, contact pages, about pages, and social profiles.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.apify.com"
    },
    {
      "url": "https://stripe.com"
    },
    {
      "url": "https://www.shopify.com"
    }
  ]
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

## `summary` (type: `string`):

No description

## `json` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.apify.com"
        },
        {
            "url": "https://stripe.com"
        },
        {
            "url": "https://www.shopify.com"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("techionik9993/website-contact-data-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        {"url": "https://www.apify.com"},
        {"url": "https://stripe.com"},
        {"url": "https://www.shopify.com"},
    ]
}

# Run the Actor and wait for it to finish
run = client.actor("techionik9993/website-contact-data-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.apify.com"
    },
    {
      "url": "https://stripe.com"
    },
    {
      "url": "https://www.shopify.com"
    }
  ]
}' |
apify call techionik9993/website-contact-data-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=techionik9993/website-contact-data-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website Contact Data Extractor",
        "description": "Extract public business contact data from websites, including validated emails, phone numbers, contact/about pages, and social profiles. Delivers clean, deduplicated JSON output for CRM enrichment, lead generation, prospecting, research, and automation workflows.",
        "version": "1.1",
        "x-build-id": "yjEYTJdjg1KwMJPgY"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/techionik9993~website-contact-data-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-techionik9993-website-contact-data-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/techionik9993~website-contact-data-extractor/runs": {
            "post": {
                "operationId": "runs-sync-techionik9993-website-contact-data-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/techionik9993~website-contact-data-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-techionik9993-website-contact-data-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Website URLs",
                        "type": "array",
                        "description": "Add one or more company or business website URLs. The Actor will find public emails, phone numbers, contact pages, about pages, and social profiles.",
                        "items": {
                            "type": "object",
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "Website URL",
                                    "description": "Example: https://www.example.com"
                                }
                            },
                            "required": [
                                "url"
                            ]
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
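
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly over HTTP. The sketch below uses only the Python standard library; the helper names and the 300-second timeout are assumptions, and the token is passed as a query parameter as the specification describes.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "techionik9993~website-contact-data-extractor"


def build_sync_request(token: str, urls: list) -> Request:
    """Prepare a POST request that runs the Actor and returns dataset items."""
    query = urlencode({"token": token})
    endpoint = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps({"startUrls": [{"url": url} for url in urls]}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, urls: list, timeout: float = 300):
    """Send the request and decode the JSON response (a list of dataset items)."""
    with urlopen(build_sync_request(token, urls), timeout=timeout) as response:
        return json.load(response)


# Usage (performs a real, billable run):
#   items = run_sync("<YOUR_API_TOKEN>", ["https://stripe.com"])
#   for item in items:
#       print(item["status"], item["domain"], item["emails"])
```

Building the request separately from sending it keeps the URL and payload easy to inspect or log before any charge is incurred.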
