# thebluebook.com Scraper (`fayoussef/thebluebook-scraper`) Actor

Scrape verified contractor and subcontractor profiles from thebluebook.com — the largest US construction directory — into clean JSON, CSV or Excel. Get company names, phones, emails, addresses, trade categories, certifications (MBE/WBE/DBE), project history and key contacts.

- **URL**: https://apify.com/fayoussef/thebluebook-scraper.md
- **Developed by:** [youssef farhan](https://apify.com/fayoussef) (community)
- **Categories:** Automation, Lead generation, Developer tools
- **Stats:** 14 total users, 6 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

From $1.00 / 1,000 results

This Actor is priced per event plus usage: you pay a fixed price for specific events and, separately, for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
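
As a rough sketch of the per-event part of the bill, the snippet below assumes the listed starting rate of $1.00 per 1,000 results applies to every result. Platform usage is charged on top and is not modeled here. `estimate_event_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: the "from $1.00 / 1,000 results" starting rate applies to each result.
PRICE_PER_1000_RESULTS = Decimal("1.00")


def estimate_event_cost(result_count: int) -> Decimal:
    """Estimate the per-result charge in USD; platform usage is billed separately."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return PRICE_PER_1000_RESULTS * result_count / 1000
```

Under that assumption, a run capped at 500 profiles would cost about half a dollar in result events, before platform usage.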

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Blue Book Scraper – Extract Contractor Leads, Emails & Project Data from thebluebook.com

Scrape verified contractor and subcontractor profiles from thebluebook.com — the largest US construction directory — into clean JSON, CSV or Excel. Get company names, phones, emails, addresses, trade categories, certifications (MBE/WBE/DBE), project history and key contacts. Pay $0.001 per profile.

### What this Actor does

This is a **production-grade Blue Book scraper** for thebluebook.com that turns construction directory pages into structured datasets. Paste a search URL (electricians in Texas, plumbers in California, GCs in Florida; any trade, any state) or a list of direct profile URLs, and the Actor returns one clean record per company.

Built for **B2B lead generation, construction market research, supplier prospecting, recruiter sourcing, and procurement compliance** — anywhere you need verified contractor data at scale.

### Output fields

Each scraped company returns:

**Identity**
- `name`, `company_id`, `profile_link`, `city_state`, `full_address`, `website`, `scrape_date`

**Contact data**
- `phone` — main office phone
- `email` — best contact email discovered for the company (domain-matched emails ranked above generic inboxes)
- `contact_name`, `contact_role`, `contact_phone` — primary key contact
- `contacts[]` — all key contacts with name, role, direct phone

**Business profile**
- `trade[]` — trade categories (Electrical, Plumbing, HVAC, Concrete, etc.)
- `service_area[]` — counties / regions serviced
- `certifications[]` — diversity certifications (MBE, WBE, DBE, SBE, VOSB, HUB)
- `established`, `company_size`, `annual_volume`, `listed_since`

**Project history**
- `projects[]` — past projects with name, type, location, status, date, general contractor

### Sample output

```json
{
  "profile_link": "https://www.thebluebook.com/iProView/1424972",
  "company_id": "1424972",
  "name": "ABC Electrical Contractors",
  "phone": "(713) 555-0100",
  "email": "info@abcelectrical.com",
  "website": "https://www.abcelectrical.com",
  "city_state": "Houston, TX",
  "full_address": "1234 Main St, Houston, TX 77002",
  "trade": ["Electrical Contractors", "Lighting Contractors"],
  "certifications": ["MBE", "DBE"],
  "service_area": ["Harris County", "Fort Bend County", "Montgomery County"],
  "established": "1998",
  "company_size": "10-24 Employees",
  "annual_volume": "$1M - $5M",
  "listed_since": "2005",
  "scrape_date": "2026-04-10",
  "contact_name": "John Smith",
  "contact_role": "Owners, Principals & Senior Executives",
  "contact_phone": "(713) 555-0101",
  "contacts": [
    { "name": "John Smith", "role": "Owners, Principals & Senior Executives", "phone": "(713) 555-0101" }
  ],
  "projects": [
    {
      "project_name": "Downtown Office Tower",
      "project_location": "Houston, TX",
      "project_type": "Commercial",
      "project_status": "Completed",
      "project_date": "Mar 2023",
      "gc_role": "General Contractor",
      "gc_name": "Turner Construction"
    }
  ]
}
````
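
The nested record above does not map directly onto spreadsheet columns. The sketch below shows one possible way to flatten it yourself, assuming records shaped like the sample; `flatten_record` and `to_csv` are hypothetical helpers, and the `"; "` separator and the `contact_count` / `project_count` columns are illustrative choices rather than anything the Actor produces.

```python
import csv
import io

# List-valued fields from the sample record that are joined into one cell.
LIST_FIELDS = ("trade", "certifications", "service_area")


def flatten_record(record: dict) -> dict:
    """Keep scalar fields, join simple lists, and count nested contacts/projects."""
    flat = {k: v for k, v in record.items() if not isinstance(v, (list, dict))}
    for field in LIST_FIELDS:
        flat[field] = "; ".join(record.get(field) or [])
    flat["contact_count"] = len(record.get("contacts") or [])
    flat["project_count"] = len(record.get("projects") or [])
    return flat


def to_csv(records: list[dict]) -> str:
    """Serialize flattened records to CSV text with alphabetically sorted columns."""
    rows = [flatten_record(r) for r in records]
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Missing list fields become empty strings, so companies without certifications or a service area still produce a complete row.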

### Input

```json
{
  "startUrls": [
    { "url": "https://www.thebluebook.com/iSearch/results/tx/houston/electrical-contractors/sc/261/" },
    { "url": "https://www.thebluebook.com/iSearch/results/ca/los-angeles/general-contractors/sc/240/" }
  ],
  "maxItems": 500
}
```

To scrape a single company, pass its profile URL directly:

```json
{
  "startUrls": [{ "url": "https://www.thebluebook.com/iProView/1424972" }]
}
```

**Inputs accepted:**

- `startUrls` — any thebluebook.com search-result URL or direct `/iProView/{id}` profile URL
- `maxItems` — optional cap on the number of profiles to return
- `disableEmailScrape` — skip website email discovery for faster, cheaper runs
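
A minimal sketch of assembling this input in code, assuming only the three fields listed above. `build_input` is a hypothetical helper; the "0 means no limit" convention for `maxItems` follows the input schema later in this document.

```python
def build_input(urls, max_items=0, disable_email_scrape=False):
    """Build an Actor input dict from plain URL strings (hypothetical helper)."""
    if not urls:
        raise ValueError("at least one start URL is required")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (no limit) or a positive integer")
    return {
        # Each start URL is wrapped in an object with a "url" key.
        "startUrls": [{"url": url} for url in urls],
        "maxItems": max_items,
        "disableEmailScrape": disable_email_scrape,
    }
```

The resulting dict can be passed as `run_input` to the Python client call shown in the API section.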

### Common use cases

- **Construction material suppliers** — build outreach lists targeted by trade and region
- **Sales & CRM teams** — enrich leads with verified phone, email, address and project history
- **Staffing & recruiting agencies** — source contractor companies filtered by headcount and revenue
- **Procurement & compliance** — find MBE / WBE / DBE-certified contractors for diversity sourcing
- **Market research & proptech** — benchmark contractor density and project volume by metro
- **Real estate developers** — identify active subcontractors by region for upcoming builds
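
For the diversity-sourcing use case, a short post-processing sketch may help. It assumes records shaped like the sample output, with a `certifications` list of short codes; `filter_by_certification` is a hypothetical helper, not something the Actor provides.

```python
def filter_by_certification(records, wanted):
    """Return records holding at least one of the wanted certification codes."""
    wanted_set = {code.upper() for code in wanted}
    return [
        record
        for record in records
        # Treat a missing or empty certifications field as "no certifications".
        if wanted_set & {c.upper() for c in (record.get("certifications") or [])}
    ]
```

For example, `filter_by_certification(items, ["MBE", "WBE", "DBE"])` keeps only companies that list at least one of those three codes.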

### FAQ

#### What is thebluebook.com?

The Blue Book Network is the largest commercial construction directory in the United States, listing hundreds of thousands of contractors, subcontractors, suppliers and service providers by trade and region.

#### What does this Blue Book scraper extract?

Company name, phone, email, address, website, trade categories, service area, diversity certifications, year established, company size, annual volume, key contacts and past projects. See the full field list above.

#### Is the data fresh?

Yes. Every run pulls live data directly from thebluebook.com — no stale cached datasets.

#### What output formats are supported?

JSON, CSV, Excel (XLSX), XML and JSONL. Download from the Apify Console or fetch via API.
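
When fetching via API, the format is typically selected with a `format` query parameter on Apify's dataset items endpoint. The sketch below only builds the URL; `dataset_export_url` is a hypothetical helper, and the parameter values should be checked against the Apify API reference before you rely on them.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Format identifiers assumed to correspond to the formats listed above.
FORMATS = {"json", "jsonl", "csv", "xlsx", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build a download URL for a run's dataset in the requested format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode({'format': fmt})}"
```

The dataset ID is the `defaultDatasetId` of a finished run. Private datasets also need your API token, for example in an `Authorization: Bearer` header.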

#### Can I scrape a specific trade or region?

Yes. Build the relevant thebluebook.com search URL for your target trade and location, then pass it in `startUrls`. The Actor handles pagination automatically.

#### How are emails discovered?

The Actor visits each company's own website and extracts emails from the page. Domain-matched addresses (e.g. `sales@companysite.com`) are returned before generic Gmail / Yahoo / Hotmail inboxes.
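
The ranking idea can be illustrated with a small sketch. This is not the Actor's actual code: `rank_emails`, the provider list, and the middle tier for other non-generic domains are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumption: these count as "generic" inboxes; the real list may differ.
GENERIC_PROVIDERS = {"gmail.com", "yahoo.com", "hotmail.com"}


def site_domain(website: str) -> str:
    """Return the website's host name without a leading 'www.'."""
    host = urlparse(website).hostname or ""
    return host[4:] if host.startswith("www.") else host


def rank_emails(emails, website):
    """Order emails: company domain first, other domains next, generic last."""
    domain = site_domain(website)

    def score(email):
        email_domain = email.rsplit("@", 1)[-1].lower()
        if domain and email_domain == domain:
            return 0
        if email_domain in GENERIC_PROVIDERS:
            return 2
        return 1

    # sorted() is stable, so emails with equal scores keep their input order.
    return sorted(emails, key=score)
```

Under this scheme the first element of the ranked list would be the natural candidate for a single `email` field.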

#### Does it handle pagination?

Yes. The Actor automatically detects the total page count and crawls every result page, so you only need to provide the first search URL.

#### Can I run it on a schedule?

Yes. Use Apify's built-in scheduler for daily, weekly or monthly runs, or trigger via webhook / API on any external event.

#### What if a company has no website or no listed contacts?

The `email`, `full_address` and `contacts` fields will be empty. The core company record is always returned.

#### Is this legal?

The Actor collects publicly accessible business directory information. You are responsible for complying with applicable laws, terms of service and data-protection regulations (GDPR, CCPA) when using the data.

### Use via API or MCP

Trigger this Actor from the [Apify API](https://docs.apify.com/api/v2) or use it as an **MCP server** for AI agents (Claude, ChatGPT, Cursor, Perplexity):

```
https://mcp.apify.com/actors/fayoussef/thebluebook-scraper
```

AI agents can launch runs, pass input and read structured output without any manual steps.

### Need a custom scraper?

Need a different site, custom fields or a managed data pipeline? Visit [automationbyexperts.com](https://automationbyexperts.com) for tailored builds, retainers and data-as-a-service.

# Actor input Schema

## `startUrls` (type: `array`):

One or more thebluebook.com URLs to scrape.

Supported formats:

- Search results page, which the actor paginates through in full: `https://www.thebluebook.com/iSearch/results/tx/houston/electrical-contractors/sc/261/`
- Legacy search URL: `https://www.thebluebook.com/search.html?region=2&class=3370&searchTerm=Plumbing+Contractors`
- Direct company profile, which scrapes a single company: `https://www.thebluebook.com/iProView/1424972`
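
To check URLs before starting a run, the three formats can be told apart by their path. The sketch below infers the patterns from the example URLs above, so treat them as assumptions; `classify_start_url` and `company_id_from_profile` are hypothetical helpers.

```python
import re
from urllib.parse import urlparse

# Path pattern inferred from the example profile URL: /iProView/<numeric id>.
PROFILE_RE = re.compile(r"^/iProView/(\d+)")
ALLOWED_HOSTS = ("thebluebook.com", "www.thebluebook.com")


def classify_start_url(url: str) -> str:
    """Return 'profile', 'search', 'legacy_search' or 'unsupported'."""
    parsed = urlparse(url)
    if (parsed.hostname or "").lower() not in ALLOWED_HOSTS:
        return "unsupported"
    if PROFILE_RE.match(parsed.path):
        return "profile"
    if parsed.path.startswith("/iSearch/results/"):
        return "search"
    if parsed.path == "/search.html":
        return "legacy_search"
    return "unsupported"


def company_id_from_profile(url: str):
    """Extract the numeric company ID from a profile URL, or None."""
    match = PROFILE_RE.match(urlparse(url).path)
    return match.group(1) if match else None
```

In the sample output, the ID extracted this way matches the `company_id` field of the record.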

## `maxItems` (type: `integer`):

Maximum number of company profiles to scrape and save. Set to 0 or leave empty for no limit — the actor will scrape all discovered profiles.

## `disableEmailScrape` (type: `boolean`):

When enabled, the actor will NOT visit each company's external website to extract a contact email. This is typically 2-3x faster, but the 'email' field will be empty. Leave unchecked to scrape emails (slower, but more complete data).

## `proxyUrl` (type: `string`):

Optional. Provide your own proxy URL to use instead of the built-in Apify residential proxy. Must be a full proxy URL including scheme and (if needed) credentials, e.g. `http://user:pass@host:port`, `https://host:port`, or `socks5://user:pass@host:port`. Leave empty to use the default Apify residential proxy.
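
A quick client-side check can catch malformed proxy URLs before a run starts. The sketch below assumes that only the three schemes named above are accepted and that a host name is mandatory; `is_valid_proxy_url` is a hypothetical helper, and the Actor's own validation may differ.

```python
from urllib.parse import urlparse

# Schemes taken from the examples in the field description.
ALLOWED_SCHEMES = {"http", "https", "socks5"}


def is_valid_proxy_url(proxy_url: str) -> bool:
    """Check for a supported scheme, a host name, and a numeric port if present."""
    try:
        parsed = urlparse(proxy_url)
        if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
            return False
        parsed.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return True
```

Note that a literal placeholder such as `http://user:pass@host:port` fails this check, because `port` is not a number.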

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.thebluebook.com/search.html?region=2&searchsrc=index&class=3370&searchTerm=Plumbing%20Contractors"
    }
  ],
  "maxItems": 0,
  "disableEmailScrape": false
}
```

# Actor output Schema

## `results` (type: `string`):

thebluebook scraper

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.thebluebook.com/search.html?region=2&searchsrc=index&class=3370&searchTerm=Plumbing%20Contractors"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("fayoussef/thebluebook-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://www.thebluebook.com/search.html?region=2&searchsrc=index&class=3370&searchTerm=Plumbing%20Contractors" }] }

# Run the Actor and wait for it to finish
run = client.actor("fayoussef/thebluebook-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.thebluebook.com/search.html?region=2&searchsrc=index&class=3370&searchTerm=Plumbing%20Contractors"
    }
  ]
}' |
apify call fayoussef/thebluebook-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=fayoussef/thebluebook-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "thebluebook.com Scraper",
        "description": "Scrape verified contractor and subcontractor profiles from thebluebook.com — the largest US construction directory — into clean JSON, CSV or Excel. Get company names, phones, emails, addresses, trade categories, certifications (MBE/WBE/DBE), project history and key contacts.",
        "version": "1.0",
        "x-build-id": "Hosli9VJNjq8drtz3"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/fayoussef~thebluebook-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-fayoussef-thebluebook-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/fayoussef~thebluebook-scraper/runs": {
            "post": {
                "operationId": "runs-sync-fayoussef-thebluebook-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/fayoussef~thebluebook-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-fayoussef-thebluebook-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "One or more thebluebook.com URLs to scrape.\n\nSupported formats:\n• Search results page — the actor will paginate through all results:\n  https://www.thebluebook.com/iSearch/results/tx/houston/electrical-contractors/sc/261/\n• Legacy search URL:\n  https://www.thebluebook.com/search.html?region=2&class=3370&searchTerm=Plumbing+Contractors\n• Direct company profile — scrapes a single company:\n  https://www.thebluebook.com/iProView/1424972",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of company profiles to scrape and save. Set to 0 or leave empty for no limit — the actor will scrape all discovered profiles.",
                        "default": 0
                    },
                    "disableEmailScrape": {
                        "title": "⚡ Disable email scraping (faster)",
                        "type": "boolean",
                        "description": "When enabled, the actor will NOT visit each company's external website to extract a contact email. This is typically 2-3x faster, but the 'email' field will be empty. Leave unchecked to scrape emails (slower, but more complete data).",
                        "default": false
                    },
                    "proxyUrl": {
                        "title": "Custom proxy URL (optional)",
                        "type": "string",
                        "description": "Optional. Provide your own proxy URL to use instead of the built-in Apify residential proxy. Must be a full proxy URL including scheme and (if needed) credentials, e.g. `http://user:pass@host:port`, `https://host:port`, or `socks5://user:pass@host:port`. Leave empty to use the default Apify residential proxy."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
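
For projects without a client library, the `run-sync-get-dataset-items` path from the specification can be called directly. The sketch below uses only the Python standard library and assumes the response body is a JSON array of dataset items; `build_run_request` and `fetch_items` are hypothetical helpers.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
RUN_SYNC_PATH = "/acts/fayoussef~thebluebook-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request described in the spec."""
    url = f"{API_BASE}{RUN_SYNC_PATH}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_items(request: urllib.request.Request) -> list:
    """Send the request and decode the response; blocks until the run finishes."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because the synchronous endpoint waits for the run to complete, long crawls are usually better served by the `/runs` path followed by a separate dataset fetch.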
