Similarweb Scraper - Traffic, AI Traffic & WHOIS
Pricing
from $0.80 / 1,000 domains
🔍 Spy on any website in seconds: traffic, rankings, top keywords, AI traffic share (ChatGPT/Claude/Gemini), competitors, similar sites & WHOIS — all from Similarweb. No login or API key. Bulk parallel scrape, captcha-resilient. Export to JSON/CSV/Excel. SEO, lead gen, research.
Developer
VortexData
Maintained by Community
🔍 Similarweb Scraper
📊 Website intelligence for any domain in seconds. Start with one website, choose traffic/rankings, similar sites, or WHOIS + homepage keywords, then export the results as JSON, CSV, Excel or any other format Apify supports.
💎 What is Similarweb Scraper?
Similarweb Scraper is a fast, captcha-resilient web scraper that pulls the same data the Similarweb web app shows you, without requiring a Similarweb account, login, or API key. Behind the scenes it talks to Similarweb's own SPA data endpoint using a real Chrome TLS / JA3 fingerprint via `curl_cffi` and routes every request through a fresh Apify Residential proxy session, so a single blocked IP does not stall the run for any domain.
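As a rough illustration of the "fresh proxy session per request" idea, the sketch below builds an Apify Proxy URL with a new random session ID on every call. The helper name `fresh_proxy_url`, the `sw_` session prefix, and the placeholder password are assumptions made for this example; the Actor's actual implementation may differ.

```python
import secrets

# Standard Apify Proxy endpoint.
APIFY_PROXY_HOST = "proxy.apify.com"
APIFY_PROXY_PORT = 8000


def fresh_proxy_url(password: str, group: str = "RESIDENTIAL") -> str:
    """Return an Apify Proxy URL bound to a brand-new random session.

    Hypothetical helper: a new session ID per call means each HTTP request
    can exit through a different residential IP.
    """
    session_id = f"sw_{secrets.token_hex(8)}"  # random, letters and digits only
    username = f"groups-{group},session-{session_id}"
    return f"http://{username}:{password}@{APIFY_PROXY_HOST}:{APIFY_PROXY_PORT}"
```

Each URL could then be handed to an HTTP client that supports browser impersonation, for example through the `proxies` argument of `curl_cffi.requests.get(..., impersonate="chrome")`.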
You can start with a single domain, then scale to a whole list when you are ready. Pick one dataset mode per run and the Actor returns clean records ready to drop into a spreadsheet, BI tool, warehouse, or AI agent.
🚀 What can Similarweb Scraper do?
- 🗂️ Choose one of three dataset modes for each run:
  - 📊 Base data (`base_data`): global / country / category ranks, monthly visits, bounce rate, pages per visit, time on site, traffic-source split (direct, search, referral, social, paid, mail), top organic keywords with volume and CPC, AI traffic share per LLM (ChatGPT / Claude / Gemini / Perplexity / Copilot).
  - 🪞 Similar sites (`similar_sites`): competitors and alternatives with their traffic, category and ranking.
  - 🆔 AITDK (`aitdk`): WHOIS via RDAP (registrar, registration / expiration dates, name servers, EPP status, DNSSEC) plus on-page keyword density analysis of the domain's homepage.
- ⚡ Start small or run in bulk: one domain is enough for a test run; larger batches process up to 10 domains concurrently by default.
- 🛡️ Captcha-resilient: uses Similarweb's open SPA endpoint, which serves `200 OK` to Chrome TLS fingerprints, so no captcha solving is required for base data and similar sites.
- 🔄 Per-request IP rotation: every HTTP call gets a fresh Apify Residential proxy session, so a blocked address costs at most one attempt.
- 🌐 Three input formats: accepts `example.com`, `www.example.com`, or `https://example.com`. The domain is extracted automatically.
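The three accepted input formats can be reduced to a bare domain with a few lines of standard-library code. This is a minimal sketch of that kind of normalization, not the Actor's own code; the function name `extract_domain` and the choice to strip a leading `www.` are assumptions.

```python
from urllib.parse import urlsplit


def extract_domain(raw: str) -> str:
    """Reduce 'example.com', 'www.example.com' or 'https://example.com' to a bare domain."""
    text = raw.strip().lower()
    # urlsplit only recognises the host when a scheme or '//' prefix is present.
    if "://" not in text:
        text = "//" + text
    host = urlsplit(text).hostname or ""
    # Assumption: a leading 'www.' is treated as noise and removed.
    if host.startswith("www."):
        host = host[4:]
    return host


domains = [extract_domain(d) for d in ("example.com", "www.example.com", "https://example.com")]
```

All three inputs in the example collapse to the same value, so duplicates in a pasted list can be dropped before a run is started.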
☁️ Remember the Apify platform
Running this Actor on Apify gives you everything that comes with the platform out of the box: managed Residential proxies with global exit IPs, scheduling (run hourly / daily / weekly), free storage in Apify Datasets with export to JSON / CSV / Excel / JSONL / XML / RSS, webhooks and integrations (Make, Zapier, n8n, Google Sheets, Slack, Airtable, Pipedream), and a REST API + Python / JavaScript SDKs to plug results into your own pipelines.
🗝️ What data can this Actor extract?
| Field group | Examples |
|---|---|
| Rankings | Global rank · country rank · category rank |
| Engagement | Total visits · monthly visits (3 months) · bounce rate · pages / visit · time on site |
| Traffic sources | Direct · search · referral · social · paid · mail (as shares) |
| AI traffic share | ChatGPT · Claude · Gemini · Perplexity · Copilot — current + 3-month history |
| Top keywords | Keyword · estimated value · search volume · CPC |
| Country breakdown | Top countries with share + monthly visit estimates per country |
| Similar sites | Up to 20 related sites with traffic, ranks, category and thumbnails |
| WHOIS (via RDAP) | Registrar · IANA ID · registration / expiration / last-changed dates · name servers · EPP status |
| Keyword density | Top-20 non-stopword tokens from the homepage with count and density |
| Assets | Desktop / mobile screenshots · favicon |
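To make the "Keyword density" row concrete, here is a small sketch of how a top-N token count with density could be computed from homepage text. The stopword list, the tokenization rule, and the definition of density (count divided by the number of non-stopword tokens) are all assumptions for illustration; the Actor may define them differently.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def keyword_density(text: str, top_n: int = 20) -> list[dict]:
    """Return the top_n non-stopword tokens with their count and density."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    words = [t for t in tokens if t not in STOPWORDS and len(t) > 1]
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [
        {"keyword": word, "count": count, "density": round(count / total, 4)}
        for word, count in counts.most_common(top_n)
    ]
```

For the text "AI tools and AI chat for the web", the first entry is `ai` with a count of 2 out of 5 counted tokens, which gives a density of 0.4.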
🎯 How to use Similarweb Scraper
- Click Try for free on the Actor's Apify Store page.
- In the Domains field, enter one website to test, or paste a larger list later, one per line, in any format (`example.com`, `www.example.com`, or `https://example.com`).
- Pick exactly one Dataset to fetch. Start with `base_data` if you want the standard Similarweb dashboard data.
- Click Start. A one-domain test run is fine; there is no 10-domain minimum. `aitdk` takes longer than the other modes because it also fetches RDAP and the homepage.
- When the run finishes, open Storage → Dataset and export to JSON, CSV, Excel, JSONL, XML or RSS. Or pull the results through the API: `https://api.apify.com/v2/datasets/{dataset_id}/items`.
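The last step, pulling results through the API, might look like the following sketch, which uses only the Python standard library. The helper names `dataset_items_url` and `fetch_items` are hypothetical, and passing the token as a query parameter is an assumption that only matters when the dataset is not publicly readable.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json", token: Optional[str] = None) -> str:
    """Build the dataset items URL for a given export format."""
    params = {"format": fmt}
    if token:
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, token: Optional[str] = None) -> list:
    """Download all items of a dataset as parsed JSON (performs a network call)."""
    with urlopen(dataset_items_url(dataset_id, "json", token), timeout=30) as resp:
        return json.load(resp)
```

Calling `fetch_items("<DATASET_ID>")` with the ID shown in the run's Storage tab should return one dictionary per scraped domain.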
📥 Input
The form has two fields only — everything else has sensible defaults:
| Field | Type | Default |
|---|---|---|
| `domains` | array | required (minimum 1 domain) |
| `datasets` | enum | `base_data` (one of `base_data`, `similar_sites`, `aitdk`) |
Example input
```json
{
  "domains": ["openai.com"],
  "datasets": "base_data"
}
```
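When the input is generated from code rather than typed into the form, it can help to validate it first. The sketch below enforces the two rules stated in the table (at least one domain, one of three dataset modes); the function name `build_input` is an assumption, and the Actor performs its own validation regardless.

```python
ALLOWED_DATASETS = ("base_data", "similar_sites", "aitdk")


def build_input(domains: list, dataset: str = "base_data") -> dict:
    """Return an input object for the Actor, or raise ValueError if it is invalid."""
    # Drop empty strings and surrounding whitespace from pasted lists.
    cleaned = [d.strip() for d in domains if d and d.strip()]
    if not cleaned:
        raise ValueError("at least one domain is required")
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"datasets must be one of {ALLOWED_DATASETS}, got {dataset!r}")
    return {"domains": cleaned, "datasets": dataset}


example_input = build_input(["openai.com"])
```

Here `example_input` matches the example input shown above.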
📤 Output
Each domain produces one dataset item. Each item conforms to the dataset schema and is rendered in the Apify Console views that match the selected dataset mode: 📊 Overview · 🚦 Traffic sources · 💫 Engagement · 🤖 AI traffic share · 🆔 AITDK (WHOIS and keywords).
Example item (abridged)
```json
{
  "domain": "openai.com",
  "rankGlobal": 207,
  "country": "US",
  "countryRank": 306,
  "category": "ai_chatbots_and_tools",
  "categoryRank": 6,
  "title": "OpenAI",
  "totalVisits": 195737812,
  "bounceRate": 0.5937,
  "pagesPerVisit": 2.59,
  "timeOnSite": 138.72,
  "socialTraffic": 0.0287,
  "searchTraffic": 0.2154,
  "directTraffic": 0.3840,
  "referralTraffic": 0.1038,
  "aiTrafficShareChatgpt": 0.8825,
  "aiTrafficShareClaude": 0.0029,
  "aiTrafficShareGemini": 0.0106,
  "topKeywords": [
    {"keyword": "chatgpt", "estimatedValue": 20907500.0, "searchVolume": 173339160.0, "cpc": 0.14},
    {"keyword": "chat gpt", "estimatedValue": 5688810.0, "searchVolume": 95011780.0, "cpc": 0.14}
  ]
}
```
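Because `topKeywords` is a nested list, it usually needs flattening before it fits a spreadsheet. The following sketch turns one item into one row per keyword and serializes the rows as CSV. The field names come from the example item above; the helper names `keyword_rows` and `rows_to_csv` are assumptions.

```python
import csv
import io

FIELDS = ["domain", "keyword", "searchVolume", "cpc"]


def keyword_rows(item: dict) -> list[dict]:
    """Flatten an item's topKeywords into one row per keyword."""
    return [
        {
            "domain": item.get("domain"),
            "keyword": kw.get("keyword"),
            "searchVolume": kw.get("searchVolume"),
            "cpc": kw.get("cpc"),
        }
        for kw in item.get("topKeywords") or []
    ]


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize the flattened rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

An item without `topKeywords`, for example one from the `aitdk` mode, simply yields no rows.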
🔗 Integrate Similarweb Scraper anywhere
Apify Actors are exposed through a REST API, so every run, dataset and webhook is addressable from your code:
```bash
# Trigger a run from anywhere
curl -X POST "https://api.apify.com/v2/acts/<USER>~similarweb-scraper/runs?token=<API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"domains": ["openai.com"], "datasets": "base_data"}'

# Read results from the run's default dataset
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?format=json"
```
Or use the official Python and JavaScript clients.
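With the Python client, the same two steps (start a run, read its default dataset) could be wrapped as in the sketch below. It assumes the `apify-client` package with its dictionary-style run objects; the function name `run_and_collect` is hypothetical, and `<USER>` and `<API_TOKEN>` remain placeholders exactly as in the curl example.

```python
def run_and_collect(client, actor_id: str, run_input: dict) -> list:
    """Start the Actor, wait for it to finish, and return all dataset items.

    `client` is expected to be an `apify_client.ApifyClient` instance.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (requires `pip install apify-client` and a valid token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<API_TOKEN>")
#   items = run_and_collect(
#       client,
#       "<USER>/similarweb-scraper",
#       {"domains": ["openai.com"], "datasets": "base_data"},
#   )
```

Passing the client in as an argument keeps the function easy to test with a stub and leaves token handling to the caller.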
❓ FAQ
Can I test it with only one domain?
Yes. The input requires at least one domain, not ten. The default
example uses openai.com with base_data, so a new user can click
Start immediately.
🔑 Do I need a Similarweb account or API key?
No. This Actor talks to Similarweb's public SPA endpoint directly: no login, no API key, and no scraping of the captcha-gated in-depth pages.
🆕 Is the data fresh?
Yes — it's the same JSON Similarweb's UI loads. The snapshotDate
field on every record tells you exactly which month it represents.
Similarweb refreshes its traffic data monthly.
⏰ Can I run this on a schedule?
Yes. Open the Actor in Apify Console, go to Schedules and pick hourly, daily, weekly or a custom cron. Combine with webhooks to push fresh data into Google Sheets, Slack, Make, Zapier or your own backend automatically.
⚖️ Is web scraping legal?
Scraping public web pages is generally legal, but you must respect copyright, terms of service, and personal-data protection laws (GDPR in the EU and similar regulations elsewhere). This Actor only extracts publicly visible data and collects no personal data. See Apify's legal blog for details.
💬 Support
- Found a bug or have a feature request? Open an issue in the Actor's Issues tab on Apify Console.
- Questions about Apify itself? Visit docs.apify.com or the Apify Discord community.
📝 Changelog
See CHANGELOG.md for the full release history. The Actor follows Semantic Versioning.