Company Website Enricher — B2B Lead Intelligence

Pricing

from $3.00 / 1,000 enriched leads

Extract company info, emails, phone numbers, social media profiles, and technology stack from any website. Pure HTTP scraping, no browser needed. Perfect for B2B lead enrichment, competitive intelligence, and sales prospecting.

Rating

0.0

(0)

Developer

Roman Bednář

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 3 days ago

Company Website Enricher

Extract structured company intelligence from any website using plain HTTP requests. Give it a list of domains and get back emails, phone numbers, social media profiles, technology stack, and company metadata — ready for CRM import, lead scoring, or competitive analysis.

What data do you get?

For each domain, the actor crawls the homepage and key subpages (/about, /contact, /team) and returns:

| Field | Description | Example |
| --- | --- | --- |
| companyName | Company name from structured data, Open Graph, or title tag | Apify |
| description | Company description from meta tags | Thousands of tools to automate your business... |
| emails | Email addresses found on the website | ["hello@apify.com"] |
| phoneNumbers | Phone numbers from tel: links and visible text | ["+1 (555) 123-4567"] |
| socialProfiles | LinkedIn, X/Twitter, Facebook, Instagram, YouTube, GitHub | {"linkedin": "https://linkedin.com/company/apify", ...} |
| techStack | Technologies detected from script sources, meta tags, and HTTP headers | ["Next.js", "HubSpot", "Google Tag Manager"] |
| logoUrl | Company logo or Open Graph image URL | https://apify.com/img/og/landing.png |
| language | Page language from HTML lang attribute | en |
| pagesCrawled | Number of pages analyzed for this domain | 3 |

Use cases

  • B2B Sales Prospecting — Enrich your lead lists with emails, phone numbers, and social profiles before outreach
  • Competitive Intelligence — Discover what technologies your competitors use (CMS, analytics, marketing tools)
  • Market Research — Profile hundreds of companies in a target market to identify technology trends
  • CRM Enrichment — Bulk-enrich your CRM contacts with missing company data
  • Lead Scoring — Use tech stack and social presence as signals for lead qualification
  • Agency Pitching — Identify prospects using outdated technology or missing key tools

How it works

  1. You provide a list of company domains (e.g., apify.com, stripe.com)
  2. The actor fetches each website's homepage via HTTP (no browser — fast and cheap)
  3. It discovers and crawls relevant subpages (/about, /contact, /team, etc.)
  4. Data is extracted from HTML structure, meta tags, HTTP headers, and visible text
  5. Results are deduplicated, merged across pages, and pushed to the dataset

No browser rendering, no JavaScript execution — pure HTTP requests with Cheerio parsing. This makes it fast, lightweight, and cost-effective.
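The steps above can be sketched in a few lines. This is a minimal Python illustration, not the actor's code (which the text says uses Cheerio, a JavaScript library). The helper names `normalize_domain`, `plan_urls`, and `merge_pages` are hypothetical, and the subpage list contains only the paths named in this document.

```python
from urllib.parse import urljoin, urlparse

# Subpaths named in the text; the actor may probe others ("etc.").
SUBPAGE_PATHS = ["/about", "/contact", "/team"]


def normalize_domain(raw: str) -> tuple[str, str]:
    """Turn 'apify.com' or 'https://stripe.com/x' into (domain, homepage URL)."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw
    host = (urlparse(raw).hostname or "").lower()
    return host, f"https://{host}"


def plan_urls(raw: str, max_pages: int = 5) -> list[str]:
    """Homepage first, then candidate subpages, capped at max_pages."""
    _, home = normalize_domain(raw)
    candidates = [home] + [urljoin(home + "/", p.lstrip("/")) for p in SUBPAGE_PATHS]
    return candidates[:max_pages]


def merge_pages(page_results: list[dict]) -> dict:
    """Merge per-page findings, dropping duplicates while keeping order."""
    merged = {"emails": [], "phoneNumbers": []}
    for page in page_results:
        for key in ("emails", "phoneNumbers"):
            for value in page.get(key, []):
                if value not in merged[key]:
                    merged[key].append(value)
    merged["pagesCrawled"] = len(page_results)
    return merged
```

Capping the URL list at `max_pages` mirrors the `maxPagesPerDomain` input: the homepage always counts as the first page.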

Technologies detected

The actor identifies 40+ technologies across these categories:

| Category | Examples |
| --- | --- |
| CMS | WordPress, Shopify, Wix, Squarespace, Webflow, Drupal, Joomla, Ghost, Magento |
| Frameworks | Next.js, Nuxt.js, Angular, Vue.js, Gatsby, Docusaurus |
| Analytics | Google Analytics, Google Tag Manager, Segment, Mixpanel, Amplitude, Hotjar |
| Marketing | HubSpot, Mailchimp, Optimizely, LaunchDarkly |
| Support | Intercom, Zendesk, Drift, Crisp, LiveChat |
| Payments | Stripe, PayPal |
| Infrastructure | Cloudflare, Cloudinary, Imgix, Algolia, Sentry, Recaptcha |
| JS Libraries | jQuery, Bootstrap, Tailwind CSS, Font Awesome |
| Servers | Nginx, Apache, Microsoft IIS (from HTTP headers) |

Detection uses structural indicators (script/link URLs, specific DOM markers) rather than keyword matching, which reduces false positives from pages that merely mention a technology by name.
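To illustrate the difference between structural detection and keyword matching, here is a small Python sketch using only the standard library. The signature tables are illustrative assumptions, not the actor's actual rules, and `detect_tech` is a hypothetical name.

```python
from html.parser import HTMLParser

# Illustrative signatures only: substring of an asset URL -> technology name.
URL_SIGNATURES = {
    "googletagmanager.com/gtm.js": "Google Tag Manager",
    "js.stripe.com": "Stripe",
    "/_next/static/": "Next.js",
    "/wp-content/": "WordPress",
}
# Substring of the lowercase Server header -> technology name.
SERVER_SIGNATURES = {"nginx": "Nginx", "apache": "Apache", "microsoft-iis": "Microsoft IIS"}


class AssetCollector(HTMLParser):
    """Collects script src and link href URLs; visible text is ignored."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("src"):
            self.urls.append(attributes["src"])
        elif tag == "link" and attributes.get("href"):
            self.urls.append(attributes["href"])


def detect_tech(html: str, headers: dict) -> list[str]:
    """Match asset URLs and the Server header against the signature tables."""
    collector = AssetCollector()
    collector.feed(html)
    found = set()
    for url in collector.urls:
        for needle, name in URL_SIGNATURES.items():
            if needle in url:
                found.add(name)
    server = headers.get("server", "").lower()
    for needle, name in SERVER_SIGNATURES.items():
        if needle in server:
            found.add(name)
    return sorted(found)
```

Because only asset URLs and headers are inspected, a page whose body text says "We love Stripe" is not reported as using Stripe.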

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| domains | string[] | required | List of company domains or URLs to enrich. Examples: "apify.com", "https://stripe.com" |
| maxPagesPerDomain | integer | 5 | Maximum pages to crawl per domain (homepage + subpages). More pages = more data but slower |
| extractEmails | boolean | true | Extract email addresses |
| extractPhones | boolean | true | Extract phone numbers |
| extractSocialLinks | boolean | true | Extract social media profile links |
| detectTechStack | boolean | true | Detect website technologies |
| proxyConfiguration | object | | Proxy settings for the crawler |

Example input

{
  "domains": ["hubspot.com", "zendesk.com"],
  "maxPagesPerDomain": 5,
  "extractEmails": true,
  "extractSocialLinks": true,
  "detectTechStack": true
}
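If you build the input programmatically, a small helper can apply the documented defaults and catch misspelled parameter names before a run starts. This is a sketch: `build_run_input` is a hypothetical helper, and the defaults are copied from the parameter table above.

```python
# Defaults as listed in the Input table of this document.
DEFAULTS = {
    "maxPagesPerDomain": 5,
    "extractEmails": True,
    "extractPhones": True,
    "extractSocialLinks": True,
    "detectTechStack": True,
}


def build_run_input(domains, **overrides):
    """Return a full input object: required domains + defaults + overrides."""
    if not domains:
        raise ValueError("domains is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS) - {"proxyConfiguration"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"domains": list(domains), **DEFAULTS, **overrides}
```

For example, `build_run_input(["hubspot.com"], maxPagesPerDomain=2)` keeps every extraction flag at its default and only lowers the page limit.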

Output

Each domain produces one result object in the dataset. Here are real results from running the actor:

Example 1: HubSpot — social profiles + global phone numbers

{
  "domain": "hubspot.com",
  "url": "https://hubspot.com",
  "companyName": "HubSpot",
  "description": "HubSpot's AI-powered customer platform provides the tools your business needs to grow better.",
  "logoUrl": "https://www.hubspot.com/hubfs/HubSpot_Logos/HubSpot-Inversed-Favicon.png",
  "emails": [],
  "phoneNumbers": [
    "18884827768",
    "+35315187500",
    "+6569556000",
    "+61291648000",
    "+813-4520-9500",
    "+4930208486000",
    "+442073243700"
  ],
  "socialProfiles": {
    "linkedin": "https://www.linkedin.com/company/hubspot",
    "twitter": "https://x.com/HubSpot",
    "facebook": "https://www.facebook.com/hubspot",
    "instagram": "https://www.instagram.com/hubspot",
    "youtube": "https://youtube.com/user/HubSpot",
    "github": null
  },
  "techStack": [
    "Cloudflare",
    "HubSpot",
    "jQuery"
  ],
  "language": "en",
  "pagesCrawled": 5,
  "enrichedAt": "2026-06-26T08:54:23.158Z"
}

The actor discovered HubSpot's global contact page and extracted phone numbers for offices in the US, Ireland, Singapore, Australia, Japan, Germany, and the UK — all from a single domain input.

Example 2: Zendesk — emails + tech stack

{
  "domain": "zendesk.com",
  "url": "https://zendesk.com",
  "companyName": "Zendesk",
  "description": "Move beyond deflection with AI agents that resolve issues end-to-end.",
  "logoUrl": "https://d1eipm3vz40hy0.cloudfront.net/images/logos/favicons/zendesk-image.png",
  "emails": [
    "ask.philippines@zendesk.com",
    "ask.thailand@zendesk.com",
    "ask.indonesia@zendesk.com",
    "ask.malaysia@zendesk.com",
    "ask.gcr@zendesk.com"
  ],
  "phoneNumbers": [
    "18888519456"
  ],
  "socialProfiles": {
    "linkedin": "https://www.linkedin.com/company/zendesk",
    "twitter": "https://www.x.com/zendesk",
    "facebook": "https://www.facebook.com/zendesk",
    "instagram": "https://www.instagram.com/zendesk",
    "youtube": null,
    "github": null
  },
  "techStack": [
    "Cloudflare",
    "Next.js",
    "Optimizely",
    "Zendesk"
  ],
  "language": "en-US",
  "pagesCrawled": 5,
  "enrichedAt": "2026-06-26T08:54:44.387Z"
}

The actor crawled Zendesk's contact pages and found regional sales emails, detected that the site uses Optimizely for A/B testing, and identified that it runs on Zendesk's own platform.

Results can be exported as JSON, CSV, Excel, XML, or accessed via the Apify API.
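If your CRM expects a specific column layout, you can also flatten the JSON results yourself. The sketch below assumes result objects shaped like the two examples above. The column selection and the `to_csv` name are illustrative choices, not part of the actor.

```python
import csv
import io

COLUMNS = ["domain", "companyName", "emails", "phoneNumbers", "linkedin", "techStack"]


def to_csv(results: list[dict]) -> str:
    """Flatten result objects into CSV text; list fields are joined with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in results:
        social = item.get("socialProfiles") or {}
        writer.writerow({
            "domain": item.get("domain", ""),
            "companyName": item.get("companyName", ""),
            "emails": "; ".join(item.get("emails", [])),
            "phoneNumbers": "; ".join(item.get("phoneNumbers", [])),
            "linkedin": social.get("linkedin") or "",  # null profiles become empty cells
            "techStack": "; ".join(item.get("techStack", [])),
        })
    return buffer.getvalue()
```

Joining list values with a semicolon keeps one row per domain, which matches the one-result-per-domain shape of the dataset.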

How emails are extracted

  • Scans mailto: links (most reliable source)
  • Pattern-matches email addresses in visible page text (script/style/SVG content is stripped first)
  • Filters out noise: noreply addresses, system domains (sentry.io, schema.org, etc.), and image file extensions
  • Deduplicates across all crawled pages
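The four rules above could look roughly like this in Python. It is a sketch under assumptions: the regular expression, the noise lists, and the lowercasing step are illustrative, and the actor's actual patterns are not published in this document.

```python
import re
from html.parser import HTMLParser

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}")
SKIPPED_TAGS = {"script", "style", "svg"}      # content stripped before matching
NOISE_DOMAINS = ("sentry.io", "schema.org")    # system domains named in the text
NOISE_LOCAL_PREFIXES = ("noreply", "no-reply")
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")


class EmailCollector(HTMLParser):
    """Gathers candidates from mailto: links first, then from visible text."""

    def __init__(self):
        super().__init__()
        self.candidates = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.lower().startswith("mailto:"):
            self.candidates.append(href[7:].split("?")[0])  # drop ?subject=...

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.candidates.extend(EMAIL_RE.findall(data))


def is_noise(email: str) -> bool:
    local, _, domain = email.partition("@")
    return (
        local.startswith(NOISE_LOCAL_PREFIXES)
        or domain in NOISE_DOMAINS
        or domain.endswith(tuple("." + d for d in NOISE_DOMAINS))
        or email.endswith(IMAGE_SUFFIXES)  # e.g. "logo@2x.png"
    )


def extract_emails(html: str) -> list[str]:
    """Return deduplicated, lowercased addresses in order of discovery."""
    collector = EmailCollector()
    collector.feed(html)
    collector.close()
    seen, result = set(), []
    for raw in collector.candidates:
        email = raw.strip().lower()
        if EMAIL_RE.fullmatch(email) and not is_noise(email) and email not in seen:
            seen.add(email)
            result.append(email)
    return result
```

To deduplicate across pages, call `extract_emails` per page and merge the lists while skipping addresses already seen.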

How social profiles are detected

  • Scans all <a href> links for LinkedIn, X/Twitter, Facebook, Instagram, YouTube, and GitHub URLs
  • Excludes share/intent/login links (e.g., facebook.com/sharer is ignored)
  • Normalizes Twitter/X URLs to x.com
  • Returns the company's actual profile, not generic platform links
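As an illustration of these rules, the sketch below classifies a list of link targets. The excluded path segments and the function names are assumptions made for the example. The output keys follow the `socialProfiles` object shown in the example results.

```python
from urllib.parse import urlparse

PLATFORM_HOSTS = {
    "linkedin.com": "linkedin",
    "twitter.com": "twitter",
    "x.com": "twitter",
    "facebook.com": "facebook",
    "instagram.com": "instagram",
    "youtube.com": "youtube",
    "github.com": "github",
}
# Assumed list of first path segments that mark share/intent/login links.
EXCLUDED_SEGMENTS = {"share", "sharer", "sharer.php", "sharearticle", "intent", "login", "signup"}


def classify_link(href: str):
    """Return (platform, profile URL), or None for non-profile links."""
    parsed = urlparse(href)
    raw_host = (parsed.hostname or "").lower()
    bare_host = raw_host[4:] if raw_host.startswith("www.") else raw_host
    platform = PLATFORM_HOSTS.get(bare_host)
    segments = [s for s in parsed.path.split("/") if s]
    if platform is None or not segments:
        return None  # unknown host, or a generic link to the platform home page
    if segments[0].lower() in EXCLUDED_SEGMENTS:
        return None  # e.g. facebook.com/sharer, twitter.com/intent/tweet
    out_host = raw_host.replace("twitter.com", "x.com") if platform == "twitter" else raw_host
    return platform, f"https://{out_host}/{'/'.join(segments)}"


def extract_social_profiles(hrefs: list[str]) -> dict:
    """First matching profile per platform wins; missing platforms stay None."""
    names = ("linkedin", "twitter", "facebook", "instagram", "youtube", "github")
    profiles = {name: None for name in names}
    for href in hrefs:
        hit = classify_link(href)
        if hit and profiles[hit[0]] is None:
            profiles[hit[0]] = hit[1]
    return profiles
```

Query strings and trailing slashes are dropped so that the same profile linked from several pages produces one canonical URL.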

Performance

  • Speed: ~200-400ms per page (HTTP only, no browser overhead)
  • Memory: 256 MB minimum, works well at default settings
  • Throughput: Processes 10 domains concurrently by default
  • Cost: Lightweight — uses minimal compute and no browser instances
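For rough planning, the figures above can be combined into a back-of-the-envelope estimate. The model below is an assumption, not a measured benchmark: it treats domains as processed in batches of 10, with the pages of each domain fetched one after another, and it uses the listed starting price of $3.00 per 1,000 leads. Startup time, retries, and slow sites are ignored.

```python
import math


def estimate_run(domains: int, pages_per_domain: int = 5, seconds_per_page: float = 0.3,
                 concurrency: int = 10, usd_per_1000: float = 3.00) -> dict:
    """Rough wall-clock time and cost for a run under the stated assumptions."""
    batches = math.ceil(domains / concurrency)
    seconds = batches * pages_per_domain * seconds_per_page
    cost = domains / 1000 * usd_per_1000
    return {"seconds": round(seconds, 1), "usd": round(cost, 2)}
```

Under these assumptions, 1,000 domains at 5 pages each and 300 ms per page come to about 150 seconds of fetching and $3.00 at the starting price.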

Integrations

This actor works with the full Apify ecosystem:

  • API — Call via REST API or Apify client libraries (JavaScript, Python)
  • Scheduling — Run on a schedule to keep your company data fresh
  • Webhooks — Get notified when a run finishes
  • Zapier / Make / n8n — Connect to your automation workflows
  • Google Sheets — Export results directly to a spreadsheet
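As an example of the API route, the sketch below starts a run with the Apify Python client and reads the resulting dataset. It assumes the `apify-client` package is installed. The actor ID is not stated in this document, so it is passed in as a parameter, and `run_enricher` is a hypothetical wrapper name.

```python
def run_enricher(token: str, actor_id: str, domains: list[str]) -> list[dict]:
    """Start the actor, wait for it to finish, and return the dataset items.

    actor_id is a placeholder for this actor's ID from its Apify Store page.
    """
    # Imported lazily so the module loads even without the third-party package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(
        run_input={"domains": domains, "maxPagesPerDomain": 5}
    )
    if run is None:
        raise RuntimeError("the actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Each returned item has the shape shown in the Output section, one object per input domain.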

Limitations

  • JavaScript-rendered content: Since this actor uses HTTP requests (no browser), it cannot extract data from websites that require JavaScript to render their content. Most company websites serve key content in the initial HTML.
  • Phone numbers: Uses a conservative regex to avoid false positives. Some phone numbers in unusual formats may be missed.
  • Paywalled content: Cannot access content behind login walls.
  • Anti-bot protection: Some websites with aggressive bot protection (Cloudflare challenges, CAPTCHAs) may block requests. Use proxy configuration to improve success rates.
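To make the phone-number limitation concrete, here is a sketch of what a conservative matcher might look like. The pattern is an assumption, not the actor's regex: in visible text it only accepts numbers that start with "+" or a parenthesized area code, and it keeps candidates with 8 to 15 digits (15 is the E.164 maximum; the lower bound is an arbitrary choice).

```python
import re

# Must start with "+<digit>" or "(<digit>" and must not be glued to other word characters.
PHONE_RE = re.compile(r"(?<![\w+])(?:\+\d|\(\d)[\d\s().-]{6,18}\d(?!\w)")


def extract_phones(text: str, tel_hrefs=()) -> list[str]:
    """Collect numbers from tel: links first, then from visible text."""
    candidates = [h[4:].strip() for h in tel_hrefs if h.lower().startswith("tel:")]
    candidates += [m.group() for m in PHONE_RE.finditer(text)]
    seen, result = set(), []
    for candidate in candidates:
        digits = re.sub(r"\D", "", candidate)
        if 8 <= len(digits) <= 15 and digits not in seen:
            seen.add(digits)  # dedupe on digits so formatting variants collapse
            result.append(candidate)
    return result
```

A strict pattern like this skips dates and order references, at the price of missing plain local numbers such as 555 123 4567, which is the trade-off described in the limitation above.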