PagesJaunes Scraper 🇫🇷 Email, SIRET, GPS — France Leads
Pricing
from $0.96 / 1,000 business records
Scrape PagesJaunes (French Yellow Pages) by city + category. Returns business name, phone, email (website-enriched), website, full address, GPS, SIRET/SIREN, rating, opening hours + a data-quality score. The richest French B2B lead dataset on the Store. Pay per result.
Developer
Vitalii Bondarev
Maintained by Community
French B2B leads from PagesJaunes — phone, email, SIRET/SIREN, GPS, website, opening hours. Pay per result, no monthly fee.
PagesJaunes is the authoritative French business directory, with 3M+ listings covering every city and trade. This scraper turns it into clean, structured B2B lead records with the richest field set on the Store for this source, including fields that other scrapers often omit: registration numbers (SIRET/SIREN), GPS coordinates, and website-enriched email. Build a targeted list of plumbers in Paris, real-estate agencies in Lyon, or lawyers in Marseille in minutes.
What it does
For each search (city + category) the actor:
- Fetches the `/annuaire/<city>/<category>` listing pages (~20 results/page, up to ~480 per search).
- Fetches each business detail page (`/pros/<id>`) for phone, SIRET/SIREN, GPS, rating, hours, website.
- When a website is present, crawls it for a public B2B email (see How to scrape email from PagesJaunes).
- Returns a flat, normalized record per business with a data-quality score.
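The page counts above translate into a simple estimate of how many listing pages a run touches. This is a rough sketch based only on the approximate figures quoted here (~20 results per page, ~480 per search). The constant and function names are illustrative and are not part of the Actor.

```python
import math

PAGE_SIZE = 20        # approximate results per listing page (figure quoted above)
MAX_PER_SEARCH = 480  # approximate cap per city + category search (figure quoted above)


def listing_pages_needed(max_items: int) -> int:
    """Estimate how many listing pages cover max_items results (0 means all available)."""
    if max_items < 0:
        raise ValueError("max_items must be >= 0")
    wanted = MAX_PER_SEARCH if max_items == 0 else min(max_items, MAX_PER_SEARCH)
    return math.ceil(wanted / PAGE_SIZE)


if __name__ == "__main__":
    for n in (40, 41, 0):
        print(n, "->", listing_pages_needed(n), "listing pages")
```

Each result also needs one detail-page fetch, so a run's request count is roughly the listing pages plus the number of businesses returned.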
Input
| Field | Type | Required | Description |
|---|---|---|---|
| `city` | string | yes | City slug (e.g. paris-75, lyon-69, marseille-13) |
| `category` | string | yes | Category slug (e.g. plombiers, restaurants, agences-immobilieres) |
| `maxItems` | integer | no | Max results (0 = all available). Default: 40 |
| `enrichEmail` | boolean | no | Crawl business websites for a public email. Default: on |
City slugs = lowercase city name + hyphen + department number (paris-75, lyon-69, marseille-13, bordeaux-33).
Category slugs = the French plural noun as in the URL (plombiers, restaurants, avocats, agences-immobilieres).
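The slug rules can be checked before a run is started. The sketch below builds an input object with the four fields from the table. The regular expressions are an assumption inferred from the examples above (lowercase, hyphen-separated, department number at the end) and may be stricter or looser than what PagesJaunes actually accepts, for instance for Corsican or overseas departments. `build_input` is a hypothetical helper, not part of the Actor.

```python
import re

# Assumed slug shapes, inferred from the examples in this README.
CATEGORY_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
CITY_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*-(?:\d{2,3}|2[ab])$")


def build_input(city: str, category: str, max_items: int = 40,
                enrich_email: bool = True) -> dict:
    """Return an Actor input dict, raising ValueError on malformed slugs."""
    if not CITY_RE.match(city):
        raise ValueError(f"city slug looks wrong: {city!r} (expected e.g. 'paris-75')")
    if not CATEGORY_RE.match(category):
        raise ValueError(f"category slug looks wrong: {category!r} (expected e.g. 'plombiers')")
    if max_items < 0:
        raise ValueError("maxItems must be >= 0 (0 = all available)")
    return {
        "city": city,
        "category": category,
        "maxItems": max_items,
        "enrichEmail": enrich_email,
    }


if __name__ == "__main__":
    print(build_input("lyon-69", "agences-immobilieres", max_items=20))
```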
Output schema
| Field | Type | Description |
|---|---|---|
| `name` | string | Business display name |
| `phone` | string | Phone number (French format) |
| `email` | string | Public B2B email, website-enriched (null if none public) |
| `website` | string | Official business website URL |
| `street` | string | Street address |
| `postalCode` | string | French 5-digit postal code |
| `city` | string | City name |
| `latitude` / `longitude` | number | GPS coordinates |
| `siret` / `siren` | string | Official French business registration numbers |
| `detailUrl` | string | Canonical PagesJaunes URL (/pros/ID) |
| `image` | string | Business photo URL |
| `rating` / `reviewCount` | number / integer | Aggregate rating + review count |
| `categoryType` | string | Schema.org business type (e.g. "Plumber") |
| `openingHours` | string | Opening hours |
| `description` | string | Business description |
| `parse_confidence` | number | Data-quality score, 0.0–1.0 (unique to this Actor) |
| `warnings` | array | Machine-readable quality-warning codes |
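The `siret` and `siren` fields can be cross-checked downstream. A SIREN has 9 digits, a SIRET has 14 (the SIREN followed by a 5-digit establishment number), and both normally satisfy the Luhn checksum. The sketch below is a consumer-side sanity check, not something the Actor is documented to run. The function names are illustrative, and a few real SIRETs (notably La Poste establishments) follow a different checksum rule, so treat a failure as a flag for review.

```python
from typing import Optional


def luhn_ok(number: str) -> bool:
    """Standard Luhn check: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def registration_ok(siret: Optional[str], siren: Optional[str]) -> bool:
    """True when both numbers are well-formed and consistent with each other."""
    if not siret or not siren:
        return False
    if not (siret.isascii() and siret.isdigit() and siren.isascii() and siren.isdigit()):
        return False
    if len(siret) != 14 or len(siren) != 9:
        return False
    return siret.startswith(siren) and luhn_ok(siren) and luhn_ok(siret)


if __name__ == "__main__":
    # Values taken from the sample record shown in this README.
    print(registration_ok("89428783800029", "894287838"))
```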
Why this beats other PagesJaunes scrapers
| | This Actor | Typical PagesJaunes scrapers |
|---|---|---|
| SIRET / SIREN | ✅ | Sometimes |
| GPS coordinates | ✅ | Rarely |
| Email | ✅ verified, website-sourced, domain-matching | Often guessed (spam risk) or absent |
| Opening hours (structured) | ✅ | Rarely |
| Data-quality score | ✅ unique | ✗ |
| Reliability | Multi-tier transport with IP-rotation + browser fallback | Single-tier (breaks on Cloudflare) |
Every record ships a parse_confidence score — below 0.7 is a machine-readable signal your pipeline can filter automatically. No other PagesJaunes scraper offers this.
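Filtering on the score takes a few lines in a downstream pipeline. The sketch below splits dataset rows at the 0.7 threshold mentioned above. `split_by_confidence` is a hypothetical helper, and the warning code in the sample data is invented for illustration because this README does not list the Actor's real codes.

```python
def split_by_confidence(records, threshold=0.7):
    """Split records into (keep, review) by parse_confidence.

    Records with a missing or non-numeric score go to the review list.
    """
    keep, review = [], []
    for record in records:
        score = record.get("parse_confidence")
        if isinstance(score, (int, float)) and score >= threshold:
            keep.append(record)
        else:
            review.append(record)
    return keep, review


if __name__ == "__main__":
    rows = [
        {"name": "A", "parse_confidence": 1.0, "warnings": []},
        # "example_warning" is a made-up code, used only for this demo.
        {"name": "B", "parse_confidence": 0.55, "warnings": ["example_warning"]},
        {"name": "C"},
    ]
    keep, review = split_by_confidence(rows)
    print([r["name"] for r in keep], [r["name"] for r in review])
```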
How to scrape email from PagesJaunes
PagesJaunes deliberately hides email behind an internal contact form, so there is no email on the listing page. When a business publishes its own website, this Actor crawls that site (homepage + legal/contact pages) for a public, verified B2B email, filters out consumer webmails (gmail, orange…), and prefers an address on the business's own domain. Email coverage tracks how many businesses in your category have a website: high for agencies and commerce, lower for small trades. To skip this step, turn off the Enrich email from website option (the `enrichEmail` input).
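The selection rule described here (drop consumer webmails, prefer the business's own domain) can be illustrated in a few lines. This is a sketch of the general heuristic, not the Actor's actual implementation. Only gmail and orange are named in this README, so the rest of the webmail list and the helper names `site_domain` and `pick_email` are assumptions.

```python
from urllib.parse import urlparse

# gmail and orange come from the text above; the other entries are assumptions.
CONSUMER_WEBMAILS = {
    "gmail.com", "orange.fr", "hotmail.com", "hotmail.fr",
    "yahoo.com", "yahoo.fr", "outlook.com", "wanadoo.fr", "free.fr",
}


def site_domain(website: str) -> str:
    """Return the website host without a leading 'www.' ('' if unparseable)."""
    host = urlparse(website).hostname or ""
    return host[4:] if host.startswith("www.") else host


def pick_email(candidates, website):
    """Pick one address: own-domain first, then any non-webmail, else None."""
    own = site_domain(website)
    business = []
    for raw in candidates:
        local, sep, domain = raw.strip().lower().rpartition("@")
        if not sep or not local or not domain:
            continue  # not a usable address
        if domain in CONSUMER_WEBMAILS:
            continue  # consumer mailbox, skipped
        business.append((f"{local}@{domain}", domain))
    for address, domain in business:
        if own and (domain == own or domain.endswith("." + own)):
            return address
    return business[0][0] if business else None


if __name__ == "__main__":
    found = ["jean@gmail.com", "info@partner.fr", "contact@example.fr"]
    print(pick_email(found, "https://www.example.fr"))
```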
Compliance (France / EU): these are public professional emails. For cold outreach under French B2B (CNIL opt-out) rules, your first email must identify you, state its purpose and the source of the data, and offer a one-click opt-out; avoid tracking pixels. You remain the data controller.
How much does it cost?
Pay-per-result: scraping 1,000 businesses costs about $0.99 (see the Pricing tab for the exact current price). No monthly fee; failed runs aren't charged.
| Run size | Approx. cost |
|---|---|
| 100 businesses | $0.10 |
| 500 businesses | $0.50 |
| 1,000 businesses | $0.99 |
| 5,000 businesses | $4.95 |
The residential French proxy is billed to your own Apify account as platform usage.
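The table follows directly from the per-1,000 rate. The sketch below reproduces it using the approximate $0.99 figure quoted above. The live price on the Pricing tab may differ, and proxy usage is billed separately, so it is not included. `estimate_cost` is an illustrative helper. `Decimal` is used so that 500 businesses round to $0.50 as in the table instead of drifting with binary floats.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_1000 = Decimal("0.99")  # approximate rate quoted in this README


def estimate_cost(businesses: int, price_per_1000: Decimal = PRICE_PER_1000) -> Decimal:
    """Estimate the pay-per-result charge in USD, rounded to cents."""
    if businesses < 0:
        raise ValueError("businesses must be >= 0")
    cost = price_per_1000 * businesses / 1000
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for n in (100, 500, 1000, 5000):
        print(f"{n:>5} businesses: ${estimate_cost(n)}")
```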
Reliability
PagesJaunes uses Cloudflare protection. This Actor uses a resilient multi-tier transport — fresh-IP rotation on blocks with a browser fallback for persistent challenges — on both listing and detail pages, so runs stay green where single-tier scrapers fail. A residential FR proxy is prefilled (required — Cloudflare blocks datacenter IPs).
FAQ
Do I need a proxy or API key? No external API key. A residential French proxy is prefilled and uses Apify's own Residential pool (billed to your account).
What export formats? JSON, CSV, Excel, XML — from the dataset page or the Apify REST API.
Why is email empty for some businesses? Email exists only when the business has a website that publishes one. Many small trades have neither.
Can I schedule runs? Yes — Apify Scheduler, n8n, Make, or Zapier.
Is scraping PagesJaunes legal? This Actor extracts only publicly visible, non-protected business data for legitimate B2B research. You remain the data controller for any outreach — follow GDPR / CNIL B2B rules.
Use with AI agents (MCP)
This scraper is callable as a tool by AI agents (Claude Desktop, Cursor, VS Code, n8n, LangGraph, CrewAI, or any MCP-compatible client) via Apify's hosted Model Context Protocol server — e.g. "find plumbers in Paris with phone and SIRET", "list real-estate agencies in Lyon with emails for outreach".
{"mcpServers": {"apify": {"command": "npx","args": ["mcp-remote","https://mcp.apify.com/?tools=bovi/pagesjaunes-directory","--header","Authorization: Bearer <YOUR_APIFY_TOKEN>"]}}}
Minimal call (`maxItems` defaults to 40 when omitted; this example caps it at 20):
{ "city": "lyon-69", "category": "agences-immobilieres", "maxItems": 20 }
Returns clean, flat rows the agent can reason over directly:
{"name": "Régie Thiébaud","phone": "04 78 24 35 09","email": "rtcb@regiethiebaud.fr","website": "http://www.regie-thiebaud.fr","street": "12 Rue de la République","postalCode": "69002","city": "Lyon","latitude": 45.764065,"longitude": 4.858097,"siret": "89428783800029","siren": "894287838","categoryType": "RealEstateAgent","rating": 4.3,"reviewCount": 18,"parse_confidence": 1.0,"warnings": []}
Integrations
The JSON/dataset output drops into the tools you already run, no glue code:
- n8n / Make / Zapier: pipe every new dataset item into 500+ apps (Google Sheets, Airtable, Slack, HubSpot, your CRM).
- Webhooks: fire your endpoint the moment a run finishes.
- MCP server: expose this Actor as a tool to Claude, Cursor, or any MCP client.
- API & SDKs: fetch the dataset as JSON/CSV/Excel via the Apify REST API or Python/JS SDKs.
See all Apify integrations.
Disclaimer
This Actor is not affiliated with or endorsed by PagesJaunes / Solocal Group. It accesses publicly available data only.
More scrapers from our toolkit
Building a data pipeline? These actors pair well with this one — each runs on your own Apify account with the same pay-per-result pricing, no subscription:
- 2GIS Places Scraper
- Yellowpages Scraper
- Gelbeseiten Directory
- Google Maps Scraper
- Google Maps Leads
- Companies France
Chain any of them together from the Integrations tab (the Run succeeded trigger) to build a multi-step workflow — one actor's output feeds the next.
Use it from your existing tools
Use with Claude Desktop / Cursor / Cline (MCP)
Call this Actor as a tool directly from your AI assistant via the Apify MCP server, with no Store browsing needed. Paste this into your MCP client config (e.g. claude_desktop_config.json) and restart the client:
{"mcpServers": {"apify-pagesjaunes-directory": {"command": "npx","args": ["-y","@apify/actors-mcp-server","--tools","bovi/pagesjaunes-directory"],"env": {"APIFY_TOKEN": "YOUR_APIFY_TOKEN"}}}}
Replace YOUR_APIFY_TOKEN with your own Apify API token (free at apify.com → Settings → Integrations). The `--tools` argument keeps the server to a handful of tools, so the agent selects reliably.
Works with Clay
Run this actor as an HTTP enrichment step inside a Clay table:
- Method: `POST`
- URL: `https://api.apify.com/v2/acts/bovi~pagesjaunes-directory/run-sync-get-dataset-items?token={{apify_token}}`
- Body (JSON): map your Clay columns to the actor input (see the Input section above; both `city` and `category` are required), e.g. `{"city": "{{clay_city_column}}", "category": "{{clay_category_column}}"}`
The run finishes synchronously and returns the dataset rows straight into your Clay table. It runs on Apify's cloud under your own token and usage. Synchronous runs must complete within 300 seconds.
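The same synchronous endpoint can be called from any script, not only from Clay. The sketch below uses only the Python standard library. It sends the token in an `Authorization: Bearer` header, which the Apify API accepts as an alternative to the `?token=` query parameter. The helper names and the 310-second client timeout (slightly above the 300-second server limit) are illustrative choices.

```python
import json
import urllib.request

ACTOR_ID = "bovi~pagesjaunes-directory"
API_BASE = "https://api.apify.com/v2/acts"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare the POST request for a synchronous run that returns dataset items."""
    return urllib.request.Request(
        f"{API_BASE}/{ACTOR_ID}/run-sync-get-dataset-items",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token: str, run_input: dict, timeout: float = 310.0) -> list:
    """Run the Actor and return its dataset rows (performs a network call)."""
    with urllib.request.urlopen(build_request(token, run_input), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_request("YOUR_APIFY_TOKEN", {"city": "lyon-69", "category": "agences-immobilieres", "maxItems": 20})
    print(req.get_method(), req.full_url)
    # rows = run_actor("YOUR_APIFY_TOKEN", {...})  # uncomment with a real token
```

For runs that may exceed the synchronous limit, such as large `maxItems` values with email enrichment, starting the run asynchronously and reading the dataset afterwards is the safer pattern.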