# Deep Email, Phone, & Social Media Scraper Search (`peterasorensen/snacci`) Actor

A powerful tool that searches for emails, phone numbers, and social media profiles on any website. It navigates intelligently, prioritizing pages likely to have contact info, even deep in the site. Perfect for lead generation, market research, competitive analysis, and building contact databases.

- **URL**: https://apify.com/peterasorensen/snacci.md
- **Developed by:** [peterasorensen](https://apify.com/peterasorensen) (community)
- **Categories:** Automation, Lead generation, Developer tools
- **Stats:** 3,828 total users, 162 monthly users, 100.0% runs succeeded, 176 bookmarks
- **User rating**: 4.53 out of 5 stars

## Pricing

From $9.00 / 1,000 emails

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
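
As a rough illustration of the per-event arithmetic, the sketch below estimates a bill from the listed starting rate. It assumes that extracted emails are the only billed event and that the $9.00 rate applies to your plan; both are assumptions, and `estimate_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Starting rate quoted in the Pricing section (USD per 1,000 emails).
PRICE_PER_1000_EMAILS = Decimal("9.00")


def estimate_cost(email_count: int,
                  price_per_1000: Decimal = PRICE_PER_1000_EMAILS) -> Decimal:
    """Estimate the charge in USD for a given number of extracted emails.

    Assumes emails are the only billed event (an assumption, see above).
    """
    if email_count < 0:
        raise ValueError("email_count must not be negative")
    cost = Decimal(email_count) / Decimal(1000) * price_per_1000
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_cost(2500))
```

Your actual rate may differ with plan discounts, so treat the result as an estimate rather than a quote.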

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## Deep Web Scraper: Email, Phone, and Social Media Extractor

A powerful web scraping tool built on the Apify platform that extracts contact information including email addresses, phone numbers, and social media profiles from websites.

This Actor crawls websites to extract valuable contact information, making it ideal for lead generation, market research, competitive analysis, and building contact databases. It not only crawls main pages but also navigates intelligently through websites, prioritizing pages likely to contain contact information.

Now supporting DACH-region (Germany, Austria, Switzerland) phone detection!

**Durability, durability, durability: v4.0 Update!** 🛡️
- **Memory leak fixes for large payloads** 🔧 – Resolved critical out-of-memory crashes that occurred when processing large input payloads. All memory leaks have been identified and fixed, ensuring stable performance even with extensive website lists.
- **DACH-region (Germany, Austria, Switzerland) phone detection** 📞 – Improved detection of German, Austrian, and Swiss phone numbers with leading zeros, ensuring compatibility with sales tools in the DACH region. Nordics (SE, DK, FI, NO, IS) also get support by extension! 🔥
- **JavaScript-heavy website support** 🚀 – Fixed a critical bug that prevented proper crawling of JavaScript-heavy websites in the Playwright fallback crawler. Links from these sites are now properly queued and processed. Please retry any requests that failed before June 4th, 2025.
- **Long-running tasks** – 3-hour tasks reduced to minutes 🔥. Some extremely long-running tasks were caused by an edge case in which enqueued links did not adhere to the "same-domain" policy.
- **Email detection improvements** 📧 – Fixed issues with fake Wix-generated emails and resolved a bug that incorrectly logged .png files as email addresses.
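
To show why leading zeros matter for DACH numbers, here is a minimal sketch that normalizes national and international notations to a single `+<country code>` form. It is an illustration under stated assumptions, not the Actor's detection logic, which is not documented here; `to_e164` and `DACH_CODES` are hypothetical names.

```python
import re

# Country calling codes for the DACH region.
DACH_CODES = {"DE": "49", "AT": "43", "CH": "41"}


def to_e164(raw: str, country: str) -> str:
    """Normalize a DACH phone number to '+<country code><number>' form.

    Handles three common notations (assumed for illustration):
      - national with a leading trunk zero:      030 1234567
      - international with '+' and optional (0): +49 (0)30 1234567
      - international with a '00' prefix:        0049 30 1234567
    """
    code = DACH_CODES[country]
    # "(0)" marks the trunk zero, which is dropped in international form.
    cleaned = raw.replace("(0)", "")
    digits = re.sub(r"\D", "", cleaned)
    if raw.strip().startswith("+"):
        return "+" + digits
    if digits.startswith("00"):
        return "+" + digits[2:]
    if digits.startswith("0"):
        # Replace the national trunk zero with the country code.
        return "+" + code + digits[1:]
    return "+" + code + digits


if __name__ == "__main__":
    for sample in ("030 1234567", "+49 (0)30 1234567", "0049 30 1234567"):
        print(sample, "->", to_e164(sample, "DE"))
```

A sales tool that expects one canonical format can then compare numbers regardless of how the website wrote them.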

___


### Key Features:

1. **Bulk Website Processing**: Scrape multiple websites in a single run by providing a list of URLs.
2. **Multiple Contact Types**: Extract emails, phone numbers, and social media handles all in one run.
3. **Intelligent Crawling**: Automatically explores contact, about, and team pages for thorough information extraction.
4. **Dynamic Content Extraction with Playwright**: Automatically switches to a JavaScript-enabled browser (Playwright) for websites that hide contact info behind scripts or dynamic elements.
5. **Advanced Detection Patterns**:
   - Email: Decodes Cloudflare-obfuscated emails, uses regex patterns to find standard email formats, and extracts emails from custom data attributes.
   - Phone: Supports international formats, US formats, UK formats, and more.
   - Social Media: Detects handles from 15+ popular platforms including Twitter/X, Facebook, Instagram, LinkedIn, etc.
6. **Duplicate Removal**: Ensures only unique contact information is collected from each website.
7. **Proxy Support**: Integrates with Apify's proxy services for reliable scraping.
8. **Detailed Logging**: Provides comprehensive console output for monitoring the scraping process.
9. **Error Handling**: Gracefully manages failed requests and continues scraping.
10. **Structured Output**: Saves results in a clean format, categorizing by contact type and associating found information with their source URLs.
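
The Cloudflare decoding mentioned under Advanced Detection Patterns can be sketched as follows. Cloudflare's email obfuscation stores the address in a `data-cfemail` attribute as a hex string whose first byte is an XOR key for the remaining bytes. This is a minimal illustration of that scheme, not the Actor's own code; the function names are hypothetical.

```python
import re

# Matches the hex payload of Cloudflare's obfuscated-email attribute.
CFEMAIL_ATTR = re.compile(r'data-cfemail="([0-9a-fA-F]+)"')


def decode_cfemail(encoded: str) -> str:
    """Decode one data-cfemail value: the first byte is the XOR key."""
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def extract_cf_emails(html: str) -> list[str]:
    """Return the unique decoded emails in an HTML string, in page order."""
    emails: list[str] = []
    for encoded in CFEMAIL_ATTR.findall(html):
        email = decode_cfemail(encoded)
        if email not in emails:  # simple duplicate removal
            emails.append(email)
    return emails


if __name__ == "__main__":
    page = '<a class="__cf_email__" data-cfemail="107150723e737f">[email&#160;protected]</a>'
    print(extract_cf_emails(page))
```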

### Use Cases:

- Lead generation for sales and marketing teams
- Building contact databases for outreach campaigns
- Competitive analysis and market research
- Verifying contact information for existing databases
- Social media influencer research and outreach
- Building comprehensive business directories

### How to Use:

1. Input a list of website URLs you want to scrape.
2. Select which types of contact information you want to extract (emails, phone numbers, social media handles, or all).
3. Run the Actor and wait for completion.
4. Retrieve the collected contact information along with its source URLs from the Actor's output.

#### Social Media Platforms
The scraper detects profiles from the following platforms:

1. **Twitter/X** - Microblogging and real-time updates
2. **Facebook** - Personal and business pages
3. **Instagram** - Visual social networking
4. **LinkedIn** - Professional networking platform
5. **YouTube** - Video sharing platform
6. **TikTok** - Short-form video platform
7. **Pinterest** - Visual discovery engine
8. **GitHub** - Software development platform
9. **Reddit** - Discussion forums and communities
10. **Snapchat** - Ephemeral messaging app
11. **WhatsApp** - Messaging application
12. **Telegram** - Messaging and content platform
13. **Medium** - Publishing platform
14. **Discord** - Communication platform
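
As an illustration of how profile links for platforms like these might be recognized, the sketch below matches profile URLs for three of the platforms with regular expressions. The patterns and the `find_handles` helper are assumptions made for this example; the Actor's actual patterns are not published here and likely cover more URL shapes.

```python
import re

# Hypothetical URL patterns for a few of the platforms listed above.
PATTERNS = {
    "twitter": re.compile(
        r"https?://(?:www\.)?(?:twitter|x)\.com/([A-Za-z0-9_]{1,15})\b"),
    "instagram": re.compile(
        r"https?://(?:www\.)?instagram\.com/([A-Za-z0-9_.]+)"),
    "github": re.compile(
        r"https?://(?:www\.)?github\.com/([A-Za-z0-9-]+)"),
}


def find_handles(html: str) -> dict[str, list[str]]:
    """Map each platform to the unique handles found in the HTML."""
    found: dict[str, list[str]] = {}
    for platform, pattern in PATTERNS.items():
        handles: list[str] = []
        for handle in pattern.findall(html):
            if handle not in handles:  # keep first occurrence only
                handles.append(handle)
        if handles:
            found[platform] = handles
    return found


if __name__ == "__main__":
    sample = (
        '<a href="https://x.com/apify">X</a>'
        '<a href="https://twitter.com/apify">Twitter</a>'
        '<a href="https://github.com/apify/crawlee">GitHub</a>'
    )
    print(find_handles(sample))
```

A production detector would also need to filter out non-profile paths such as share or login links, which this sketch does not attempt.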

___

#### Additional Keywords & Search Terms
This Apify Actor serves as a **bulk email, phone number, social media handle, and contact scraper**, **website information extractor**, and **lead generation tool**. It is suited to the following scenarios:

#### Lead Generation and Sales Prospecting
Find valuable contact information for potential customers and clients. Ideal for sales teams looking to build targeted prospect lists with verified contact methods including business emails and professional social media profiles like LinkedIn.

#### Competitive Analysis
Monitor competitors by extracting their contact information and social media presence. Track their online footprint across platforms like Twitter, Facebook, Instagram, and more to analyze their digital strategy.

#### Market Research
Gather contact information from companies in your target market. Extract emails, phone numbers, and social profiles like LinkedIn from industry-specific websites to build comprehensive market databases.

#### Recruitment
Find candidate contact information including professional networking profiles from LinkedIn, GitHub, or other industry-specific platforms. Build recruitment databases with direct contact methods.

#### Academic Research
Collect contact information for researchers, academics, or institutions. Extract emails and professional profiles from university websites, research papers, and academic social networks.

#### Content Creation and Marketing
Find influencers and content creators across platforms like Instagram, TikTok, YouTube, and Twitter. Build outreach lists with verified contact information for influencer marketing campaigns.

#### Additional Use Cases  

- **Real estate lead generation** – Extract contact details from realtor websites, property listings, and agencies.  
- **E-commerce supplier outreach** – Find and collect supplier contact information from directories and marketplaces.  
- **B2B networking and partnerships** – Scrape contact details of potential business partners and vendors.  
- **Freelancer prospecting** – Gather contact information from company websites for pitching services.  
- **Tech startup investor outreach** – Extract contact details from accelerator, VC, and funding directories.  
- **Educational research** – Collect contact information from university and institution websites.  
- **Legal and compliance investigations** – Gather business contacts for verification and due diligence.  
- **Government and non-profit outreach** – Find contact details for officials, organizations, and community groups.  
- **Recruitment and HR research** – Extract hiring manager and company contact information for job prospecting.  
- **Social media marketing** – Collect social media handles for targeted marketing campaigns.

This tool is perfect for businesses, sales teams, researchers, and marketers looking to automate contact information collection and scale their outreach efforts efficiently. Related search terms it covers:

- **Extract contact information from website lists**  
- **Find emails and phone numbers on contact pages automatically**  
- **Scrape business contact details from multiple websites**  
- **Best contact scraper for sales leads**  
- **Automated web crawler for contact discovery**  
- **Collect contact information for cold outreach**  
- **Extract emails, phones, and social media from HTML pages**  
- **Lead generation scraper for marketing**  
- **Find hidden contact information on websites**  
- **B2B contact scraping tool**  
- **Apify Actor for scraping contact details**  
- **Sales prospecting contact extractor**  
- **Marketing outreach scraper**  
- **Business contact finder**  
- **Social media handle extractor**  
- **Phone number scraper for websites**

# Actor input Schema

## `websites` (type: `array`):

Provide an array of websites to scrape for emails, phone numbers, and social media handles.

## `scrapeTypes` (type: `array`):

Select what types of contact information to scrape from the websites.

## `proxyConfiguration` (type: `object`):

Select proxies to be used by your crawler. Residential proxies in the target country (US by default) are recommended for better success rates.

## `removeDuplicates` (type: `boolean`):

If enabled, removes duplicate contact information even if found on separate webpages. If disabled, outputs all found information including duplicates.

## `maxDepth` (type: `integer`):

Maximum depth of pages to crawl from the starting URL. A depth of 0 means only the initial page, 1 means also crawl linked pages, 2 means crawl linked pages and their linked pages, etc.

## `maxLinksPerPage` (type: `integer`):

Maximum number of links to follow from each page. Set much higher if extracting a people directory or similar.
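
How `maxDepth` and `maxLinksPerPage` bound a crawl can be sketched with a breadth-first traversal over an in-memory link map. This is an illustration only: `plan_crawl` and the `links_by_page` mapping are hypothetical, the same-domain check and the choice to count only followed links toward the per-page cap are assumptions, and the Actor's real crawler additionally prioritizes contact-like pages.

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(start_url, links_by_page, max_depth=2, max_links_per_page=200):
    """Return (url, depth) pairs in the order a breadth-first crawl visits them.

    links_by_page maps a URL to the links found on that page; it stands in
    for real HTTP fetching so the example runs offline.
    """
    start_host = urlparse(start_url).hostname
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from here
        followed = 0
        for link in links_by_page.get(url, []):
            if followed >= max_links_per_page:
                break  # per-page link cap reached
            # Skip other domains and pages that are already queued.
            if urlparse(link).hostname != start_host or link in seen:
                continue
            seen.add(link)
            queue.append((link, depth + 1))
            followed += 1
    return visited


if __name__ == "__main__":
    site = {
        "https://example.com/": [
            "https://example.com/about",
            "https://other.org/",
            "https://example.com/contact",
        ],
        "https://example.com/about": ["https://example.com/team"],
        "https://example.com/team": ["https://example.com/deep"],
    }
    for page, depth in plan_crawl("https://example.com/", site, max_depth=2):
        print(depth, page)
```

With `max_depth=0` only the start page is visited, which matches the description of depth 0 above.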

## Actor input object example

```json
{
  "websites": [
    "https://www.southampton.ac.uk/people"
  ],
  "scrapeTypes": [
    "emails",
    "phoneNumbers",
    "socialMedia"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  },
  "removeDuplicates": true,
  "maxDepth": 2,
  "maxLinksPerPage": 200
}
````
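
An input object like the one above can also be assembled and checked in code before starting a run. The sketch below is a hypothetical helper (`build_input` is not part of the Apify client); it only mirrors the field names, allowed `scrapeTypes` values, and defaults listed in this document.

```python
# Allowed values and defaults as listed in the input schema in this document.
ALLOWED_SCRAPE_TYPES = ("emails", "phoneNumbers", "socialMedia")


def build_input(websites, scrape_types=ALLOWED_SCRAPE_TYPES, max_depth=2,
                max_links_per_page=200, remove_duplicates=True,
                proxy_configuration=None):
    """Build an Actor input dict, rejecting obviously invalid values early."""
    if not websites:
        raise ValueError("websites must contain at least one URL")
    unknown = [t for t in scrape_types if t not in ALLOWED_SCRAPE_TYPES]
    if unknown:
        raise ValueError(f"unknown scrapeTypes: {unknown}")
    if max_depth < 0 or max_links_per_page < 1:
        raise ValueError("max_depth must be >= 0 and max_links_per_page >= 1")
    run_input = {
        "websites": list(websites),
        "scrapeTypes": list(scrape_types),
        "removeDuplicates": remove_duplicates,
        "maxDepth": max_depth,
        "maxLinksPerPage": max_links_per_page,
    }
    # The proxy settings are optional, so include them only when provided.
    if proxy_configuration is not None:
        run_input["proxyConfiguration"] = proxy_configuration
    return run_input


if __name__ == "__main__":
    print(build_input(["https://www.southampton.ac.uk/people"],
                      scrape_types=["emails"]))
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.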

# Actor output Schema

## `results` (type: `string`):

Extracted contacts (emails, phone numbers) and social media links for each source URL.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "websites": [
        "https://www.southampton.ac.uk/people"
    ],
    // Listed as required in the input schema; these are its default values
    "scrapeTypes": [
        "emails",
        "phoneNumbers",
        "socialMedia"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ],
        "apifyProxyCountry": "US"
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("peterasorensen/snacci").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
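run_input = {
    "websites": ["https://www.southampton.ac.uk/people"],
    # Listed as required in the input schema; these are its default values
    "scrapeTypes": ["emails", "phoneNumbers", "socialMedia"],
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "US",
    },
}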

# Run the Actor and wait for it to finish
run = client.actor("peterasorensen/snacci").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "websites": [
    "https://www.southampton.ac.uk/people"
  ],
  "scrapeTypes": [
    "emails",
    "phoneNumbers",
    "socialMedia"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  }
}' |
apify call peterasorensen/snacci --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=peterasorensen/snacci",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Deep Email, Phone, & Social Media Scraper Search",
        "description": "A powerful tool that searches emails, phone numbers, and social media profiles from any website. It intelligently navigates, prioritizing pages likely to have contact info - even deep in the site. Perfect for lead generation, market research, competitive analysis, and building contact databases.",
        "version": "0.0",
        "x-build-id": "GM48NIx162nve1AGz"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/peterasorensen~snacci/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-peterasorensen-snacci",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/peterasorensen~snacci/runs": {
            "post": {
                "operationId": "runs-sync-peterasorensen-snacci",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/peterasorensen~snacci/run-sync": {
            "post": {
                "operationId": "run-sync-peterasorensen-snacci",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "websites",
                    "scrapeTypes"
                ],
                "properties": {
                    "websites": {
                        "title": "List of Websites",
                        "type": "array",
                        "description": "Provide an array of websites to scrape for emails, phone numbers, and social media handles.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "scrapeTypes": {
                        "title": "Information to Scrape",
                        "type": "array",
                        "description": "Select what types of contact information to scrape from the websites.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "emails",
                                "phoneNumbers",
                                "socialMedia"
                            ],
                            "enumTitles": [
                                "Email Addresses",
                                "Phone Numbers",
                                "Social Media Handles"
                            ]
                        },
                        "default": [
                            "emails",
                            "phoneNumbers",
                            "socialMedia"
                        ]
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Select proxies to be used by your crawler. Residential proxies in the target country (US by default) are recommended for better success rates."
                    },
                    "removeDuplicates": {
                        "title": "Remove Duplicates",
                        "type": "boolean",
                        "description": "If enabled, removes duplicate contact information even if found on separate webpages. If disabled, outputs all found information including duplicates.",
                        "default": true
                    },
                    "maxDepth": {
                        "title": "Maximum Crawl Depth",
                        "type": "integer",
                        "description": "Maximum depth of pages to crawl from the starting URL. A depth of 0 means only the initial page, 1 means also crawl linked pages, 2 means crawl linked pages and their linked pages, etc.",
                        "default": 2
                    },
                    "maxLinksPerPage": {
                        "title": "Maximum Links per Page",
                        "type": "integer",
                        "description": "Maximum number of links to follow from each page. Set much higher if extracting a people directory or similar.",
                        "default": 200
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
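
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be exercised with the Python standard library alone. This is a minimal sketch: `build_request` and `run_sync` are hypothetical helper names, the token is passed as a query parameter as the specification describes, and the client-side timeout value is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/peterasorensen~snacci/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: dict, timeout: float = 300.0):
    """Send the request and return the parsed JSON response body."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Building the request needs no network access or real token.
    req = build_request("<YOUR_API_TOKEN>", {
        "websites": ["https://www.southampton.ac.uk/people"],
        "scrapeTypes": ["emails"],
    })
    print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the URL and payload easy to inspect or unit-test before any paid run is started.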
