# Website Metadata Extractor(sitemap, socialLinks, robotsTxt) (`codescraper/website-metadata-extractor`) Actor

A very fast metadata extractor to get all meta tags, robots.txt, sitemaps, social links, H1s, word count, and JSON-LD data. Also provides technology detection for a full analysis. Get your data fast for just $3/month.

- **URL**: https://apify.com/codescraper/website-metadata-extractor.md
- **Developed by:** [CodeScraper](https://apify.com/codescraper) (community)
- **Categories:** SEO tools, Developer tools, Automation
- **Stats:** 100 total users, 3 monthly users, 100.0% runs succeeded, 2 bookmarks
- **User rating**: No ratings yet

## Pricing

$3.00/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🌐 Website Metadata Extractor – Analyze & Extract Site SEO, Tags, and Robots Data

This **Apify Actor** crawls and extracts **structured metadata** from any website, including SEO tags, robot rules, sitemap references, headings, and technology fingerprints.
It’s ideal for **SEO analysis**, **data enrichment**, **domain audits**, and **competitive research**, and is built using **Crawlee + Cheerio** for lightweight yet powerful web parsing.

---

### 🚀 What It Does

The Actor processes each input URL and extracts:

- 🧠 **Meta Information:** Title, description, keywords, charset, viewport, theme color, etc.
- 🤖 **Robots.txt Rules:** Indexed by user-agent with all allow/disallow paths.
- 🗺️ **Sitemaps:** All discovered sitemap file URLs.
- 🧩 **Social Metadata:** Open Graph (`og:`), Twitter Cards, and Facebook verification tags.
- 🧱 **Structured Data:** Extracted JSON-LD schemas.
- 💬 **Content Structure:** First H1, all H1s & H2s, and total visible word count.
- 🕵️ **Technologies:** Detected frameworks, analytics tools, or CMS hints.
- 🌍 **Hreflang Tags:** Alternate language/country versions of the site.

---

### 💡 It Handles:

- ✅ Website metadata extraction from any URL
- ✅ Robots.txt and sitemap discovery
- ✅ JSON-LD and SEO tag parsing
- ✅ Automatic detection of CMS and tech stacks
- ✅ Fast mode (skip domain analysis) for large-scale scans
- ✅ Works with static and dynamic pages

---

### ⚙️ Input Configuration

| Field                   | Type    | Description                                                                   | Default / Example                                 |
| ----------------------- | ------- | ----------------------------------------------------------------------------- | ------------------------------------------------- |
| `startUrls`             | Array   | A list of website URLs to analyze.                                            | `["https://apify.com", "https://www.google.com"]` |
| `disableDomainAnalysis` | Boolean | If `true`, skips robots.txt and sitemap fetching for faster runs (less data). | `false`                                           |

---

### 🧩 Example Input

```json
{
  "startUrls": ["https://www.shopify.com", "https://www.wix.com"],
  "disableDomainAnalysis": false
}
````

***

### 📊 Example Output

```json
{
  "url": "https://www.shopify.com",
  "robotsTxt": {
    "userAgents": {
      "GoogleDocs": {
        "allow": [],
        "disallow": ["/"]
      },
      "AdIdxBot": {
        "allow": ["*/ppc/*", "*utm_medium=cpc*"],
        "disallow": []
      },
      "Pinterestbot": {
        "allow": ["*/ppc/*", "*utm_medium=cpc*"],
        "disallow": []
      },
      "AdsBot-Google-Mobile": {
        "allow": ["*/ppc/*", "*utm_medium=cpc*"],
        "disallow": []
      },
      "AdsBot-Google": {
        "allow": ["*/ppc/*", "*utm_medium=cpc*"],
        "disallow": []
      },
      "Bingbot": {
        "allow": [],
        "disallow": ["/llms.txt"]
      },
      "Googlebot": {
        "allow": [],
        "disallow": ["/llms.txt"]
      },
      "*": {
        "allow": [],
        "disallow": [
          "*.data$",
          "*/account",
          "*/auth/callback",
          "*/authenticate",
          "*/authentication",
          "*/editor$",
          "*/finalize$",
          "*/loading$",
          "*/onboarding$",
          "*/ppc/*",
          "*/result$",
          "*/services/identity/login",
          "*/services/sa-appointments/login",
          "*/services/sa-appointments/redirect",
          "*/services/sa-appointments/stores.json",
          "*/stock-photos/*?*page=*?*page=*",
          "*/stock-photos/*?link_search=",
          "*/stock-photos/*?q=",
          "*/stock-photos/@*",
          "*/stock-photos/admin",
          "*/stock-photos/photos/search$",
          "*/stock-photos/photos/search?",
          "*/store/account",
          "*/store/admin",
          "*/store/cart",
          "*/store/carts",
          "*/store/checkout",
          "*/store/checkouts/",
          "*/store/orders",
          "*/tools/*/show*",
          "*country=*",
          "*itcat=*",
          "*itterm=*",
          "*lang=*",
          "*link_search=*",
          "*page=*",
          "*prev_msid=*",
          "*utm_medium=cpc*",
          "/*/*?*shpxid=*",
          "/*/*?services*",
          "/*/*digital_wallets/dialog",
          "/*/admin/",
          "/*/apple-app-site-association",
          "/*/blog-article-remove-faq-utms-*.js",
          "/*/blog.atom",
          "/*/blog/search$",
          "/*/blog/search?",
          "/*/blogs/blog.atom",
          "/*/blogs/technology.atom",
          "/*/blogsearch$",
          "/*/cannabis",
          "/*/cdn-cgi/challenge-platform*",
          "/*/email-validation",
          "/*/enterprise/blog/search$",
          "/*/enterprise/blog/search?",
          "/*/growth-tools-assets",
          "/*/landing/",
          "/*/retail/search$",
          "/*/retail/search?",
          "/*/step/",
          "/*/submit",
          "/*/submitted",
          "/*/technology.atom",
          "/*/technology/tagged/*page*",
          "/*/tools/business-name-generator/searchbutton*",
          "/*/tools/business-name-generator/searchpage*",
          "/*/tools/business-name-generator/searchutf8*",
          "/*/ventureone",
          "/*CampaignId*",
          "/*hashid=%subscriber_hash%",
          "/*hubs_content*",
          "/.s/assets/shopifycloud/collabs-community-widget/widget.js",
          "/.well-known/traffic-advice",
          "/500",
          "/__dux",
          "/__manifest",
          "/__pb/mm",
          "/__pb/rp",
          "/__pb/rr",
          "/authenticate",
          "/blog-article-remove-faq-utms-*.js",
          "/blog/search$",
          "/blog/search?",
          "/blogs/technology.atom",
          "/blogsearch",
          "/cannabis",
          "/careers/portal/*",
          "/careers/search?*",
          "/cdn-cgi/challenge-platform*",
          "/email-validation",
          "/enterprise/blog/search$",
          "/enterprise/blog/search?",
          "/meta.json",
          "/oauth",
          "/retail/search$",
          "/retail/search?"
        ]
      }
    }
  },
  "sitemapFileUrls": [
    "https://www.shopify.com/sitemaps_list.xml",
    "https://www.shopify.com/sitemap.xml"
  ],
  "metaTags": {
    "title": "Shopify: The All-in-One Commerce Platform for Businesses - Shopify",
    "viewport": "width=device-width,initial-scale=1",
    "description": "Try Shopify free and start a business or grow an existing one. Get more than ecommerce software with tools to manage every part of your business.",
    "fb:pages": "20409006880",
    "fb:app_id": "847460188612391",
    "og:type": "website",
    "og:site_name": "Shopify",
    "og:title": "Shopify: The All-in-One Commerce Platform for Businesses - Shopify",
    "og:description": "Try Shopify free and start a business or grow an existing one. Get more than ecommerce software with tools to manage every part of your business.",
    "og:image": "https://cdn.shopify.com/b/shopify-brochure2-assets/ad99387e77223a56caca1c7b18e31970.png",
    "twitter:image": "https://cdn.shopify.com/b/shopify-brochure2-assets/ad99387e77223a56caca1c7b18e31970.png",
    "og:url": "https://www.shopify.com/",
    "twitter:card": "summary_large_image",
    "twitter:site": "Shopify",
    "twitter:account_id": "17136315",
    "twitter:title": "Shopify: The All-in-One Commerce Platform for Businesses - Shopify",
    "twitter:description": "Try Shopify free and start a business or grow an existing one. Get more than ecommerce software with tools to manage every part of your business.",
    "ssr": "false",
    "canonical": "https://www.shopify.com/",
    "charset": "utf-8",
    "favicon": "https://cdn.shopify.com/shopifycloud/web/assets/v1/favicon-default-6cbad9de243dbae3.ico",
    "apple-touch-icon": "https://cdn.shopify.com/b/shopify-brochure2-assets/c97c60ca19c64a8b5378d9f9e971f7bd.png"
  },
  "jsonLd": [
    {
      "@context": "https://schema.org/",
      "@type": "Corporation",
      "name": "Shopify",
      "url": "https://www.shopify.com/",
      "logo": "https://cdn.shopify.com/shopifycloud/brochure/assets/brand-assets/shopify-logo-primary-logo-456baa801ee66a0a435671082365958316831c9960c480451dd0330bcdae304f.svg",
      "sameAs": [
        "https://github.com/Shopify",
        "https://en.wikipedia.org/wiki/Shopify",
        "https://www.youtube.com/shopify",
        "https://www.instagram.com/shopify/?hl=en",
        "https://x.com/Shopify",
        "https://www.linkedin.com/company/shopify/",
        "https://www.facebook.com/shopify/"
      ]
    }
  ],
  "hreflangLinks": {
    "es-AR": "https://www.shopify.com/ar",
    "en-AU": "https://www.shopify.com/au",
    "de-AT": "https://www.shopify.com/at",
    "ru": "https://www.shopify.com/by",
    "nl-BE": "https://www.shopify.com/be",
    "de-BE": "https://www.shopify.com/be-de",
    "fr-BE": "https://www.shopify.com/be-fr",
    "pt": "https://www.shopify.com/br",
    "bg": "https://www.shopify.com/bg",
    "en-CA": "https://www.shopify.com/ca",
    "fr-CA": "https://www.shopify.com/ca-fr",
    "es-CL": "https://www.shopify.com/cl",
    "es-CO": "https://www.shopify.com/co",
    "cs": "https://www.shopify.com/cz",
    "da": "https://www.shopify.com/dk",
    "fi": "https://www.shopify.com/fi",
    "fr": "https://www.shopify.com/fr",
    "de": "https://www.shopify.com/de",
    "el": "https://www.shopify.com/gr",
    "zh-HK": "https://www.shopify.com/hk",
    "en-HK": "https://www.shopify.com/hk-en",
    "hu": "https://www.shopify.com/hu",
    "hi": "https://www.shopify.com/in-hi",
    "en-IN": "https://www.shopify.com/in",
    "id": "https://www.shopify.com/id-id",
    "en-ID": "https://www.shopify.com/id",
    "en-IE": "https://www.shopify.com/ie",
    "it": "https://www.shopify.com/it",
    "ja-JP": "https://www.shopify.com/jp",
    "ko": "https://www.shopify.com/kr",
    "lt": "https://www.shopify.com/lt",
    "en-MY": "https://www.shopify.com/my",
    "es-MX": "https://www.shopify.com/mx",
    "nl": "https://www.shopify.com/nl",
    "en-NZ": "https://www.shopify.com/nz",
    "en-NG": "https://www.shopify.com/ng",
    "nb": "https://www.shopify.com/no",
    "en-NO": "https://www.shopify.com/no-en",
    "es-PE": "https://www.shopify.com/pe",
    "en-PH": "https://www.shopify.com/ph",
    "pl": "https://www.shopify.com/pl",
    "pt-PT": "https://www.shopify.com/pt",
    "ro": "https://www.shopify.com/ro",
    "en-SG": "https://www.shopify.com/sg",
    "en-ZA": "https://www.shopify.com/za",
    "es-ES": "https://www.shopify.com/es-es",
    "sv": "https://www.shopify.com/se",
    "de-CH": "https://www.shopify.com/ch",
    "zh-Hant-TW": "https://www.shopify.com/tw",
    "th": "https://www.shopify.com/th",
    "tr": "https://www.shopify.com/tr",
    "en-GB": "https://www.shopify.com/uk",
    "en": "https://www.shopify.com",
    "es": "https://www.shopify.com/es",
    "zh-Hans": "https://www.shopify.com/zh"
  },
  "socialLinks": {
    "facebook": "https://www.facebook.com/shopify",
    "twitter": "https://twitter.com/shopify",
    "youtube": "https://www.youtube.com/user/shopify",
    "instagram": "https://www.instagram.com/shopify/",
    "tiktok": "https://www.tiktok.com/@shopify",
    "linkedin": "https://www.linkedin.com/company/shopify",
    "pinterest": "https://www.pinterest.com/shopify/",
    "github": "https://github.com/Shopify",
    "x": "https://x.com/Shopify"
  },
  "h1": "Be the nextbig thing",
  "allH1s": ["Be the nextbig thing"],
  "allH2s": [
    "store they line up for",
    "The one commerce platform behind it all",
    "There’s no better place for you to build",
    "Start selling in no time"
  ],
  "wordCount": 969
}
```

```json
{
	"url": "https://www.wix.com",
	"robotsTxt": {
		"userAgents": {
			"*": {
				"allow": [],
				"disallow": [
					"/api/",
					"/blogtemp",
					"/wixblog",
					"/bo/",
					"/editor.jsp",
					"/noflashhtml",
					"/siteBackHtml",
					"/wix/",
					"/wixpress/",
					"/wixdemo/",
					"/wix-editor/",
					"/editor2.jsp",
					"/flash/",
					"/flash-templates/",
					"/website-template/view/flash/",
					"/facebook-template/",
					"/facebook/templates/",
					"/website/templates/flash/",
					"/html5/",
					"/my-account",
					"/blog/*sitemap.xml$",
					"/velo/profile",
					"/velo/forum/search",
					"/velo/forum/main/comment",
					"/*velo-pt/profile",
					"/*velo-pt/forum/search",
					"/*velo-pt/forum/main/comment",
					"/app-market/search-result?query=",
					"/despiration/testseo",
					"/site-react-dropdown*",
					"/soundcloud-tpa*",
					"/*cacheKiller=",
					"/*hubs_content",
					"/*hubs_post",
					"/*hubs_signup-cta",
					"/lp-lang/mobilewebsite",
					"/mystunningwebsites/",
					"/dashboard",
					"/_api/albums-node-server",
					"/_partials/wix-vod",
					"/designers/events/fullscreen-page",
					"/photo-test-editor-x",
					"/thunderbolt/",
					"/meta-site/",
					"/learn/search-results",
					"/?criteria=",
					"/blog/*/search-results?q",
					"/blog/search-results?q",
					"/experts-arena/",
					"/createawebsite/",
					"/website/templates*?screen=",
					"/*?sort=",
					"/marketplace/hire*?serviceIds=",
					"/marketplace/hire*&serviceIds=",
					"/marketplace/brief/",
					"/wix-platform",
					"/blog-temp-ru",
					"/blog/search?*",
					"*/fullscreen-page",
					"/wearepages",
					"*/laboratory/conductAllInScope",
					"/studio/blog/*?filter=",
					"/studio/academy/search-results",
					"/studio/academy/search",
					"/studio/blog/search-results?q",
					"/studio/blog/*&filter=",
					"/seo/learn/assets*?topic=",
					"/seo/learn/assets*&topic=",
					"/seo/learn/assets*?resource-type=",
					"/seo/learn/assets*&resource-type=",
					"/studio-tech-desgin",
					"/_api/dealer-offer-events-service/proxy/v1/dealer-offer-events",
					"/website/builder?storyId=",
					"/studio-tech-design",
					"/domain/names/results?",
					"/business-name-generator/list",
					"/studio/templates?criteria",
					"/domains/results",
					"/_api/header-footer-service/content",
					"/wixel/templates/search?",
					"/wixel/templates/*?color",
					"/wixel/templates/*?style",
					"/wixel/templates/*?price",
					"/marketplace/templates?sort="
				]
			}
		}
	},
	"sitemapFileUrls": [
		"https://www.wix.com/sitemap.xml"
	],
	"metaTags": {
		"title": "Website Builder - Create a Free Website In Minutes | Wix.com",
		"viewport": "width=device-width, initial-scale=1",
		"generator": "Wix.com Website Builder",
		"format-detection": "telephone=no",
		"skype_toolbar": "skype_toolbar_parser_compatible",
		"description": "Get everything you need to create your website, your way. With a free easy-to-use website builder, integrated hosting, and essential business solutions.",
		"og:title": "Your vision. Your goals. Your website. | Wix.com",
		"og:description": "Get everything you need to create your website, your way. From an intuitive website builder to built-in business solutions and AI tools—try Wix for free.",
		"og:image": "https://static.wixstatic.com/media/0784b1_31d34d128f8b424cb8cb53607fe77c95~mv2.jpg/v1/fill/w_1200,h_630,al_c/0784b1_31d34d128f8b424cb8cb53607fe77c95~mv2.jpg",
		"og:image:width": "1200",
		"og:image:height": "630",
		"og:url": "https://www.wix.com",
		"og:site_name": "wix.com",
		"og:type": "website",
		"twitter:card": "summary_large_image",
		"twitter:title": "Your vision. Your goals. Your website. | Wix.com",
		"twitter:description": "Get everything you need to create your website, your way. From an intuitive website builder to built-in business solutions and AI tools—try Wix for free.",
		"twitter:image": "https://static.wixstatic.com/media/0784b1_31d34d128f8b424cb8cb53607fe77c95~mv2.jpg/v1/fill/w_1200,h_630,al_c/0784b1_31d34d128f8b424cb8cb53607fe77c95~mv2.jpg",
		"canonical": "https://www.wix.com",
		"charset": "utf-8",
		"favicon": "https://www.wix.com/favicon.ico",
		"apple-touch-icon": "https://www.wix.com/favicon.ico"
	},
	"jsonLd": [
		{
			"@context": "https://schema.org/",
			"@type": "HowTo",
			"name": "How to create a website for free",
			"description": "Follow these 7 simple steps to create a website today",
			"step": [
				{
					"@type": "HowToStep",
					"position": 1,
					"name": "Pick a platform",
					"text": "Sign up for a secure and reliable free website builder like Wix."
				},
				{
					"@type": "HowToStep",
					"position": 2,
					"name": "Plan out your website",
					"text": "Map out your goals, site structure and who your audience is."
				},
				{
					"@type": "HowToStep",
					"position": 3,
					"name": "Start creating",
					"text": "Choose from 900+ free templates or use the AI website builder."
				},
				{
					"@type": "HowToStep",
					"position": 4,
					"name": "Customize your website",
					"text": "Use the drag and drop editor and tailor your site to fit your brand."
				},
				{
					"@type": "HowToStep",
					"position": 5,
					"name": "Optimize for search engines",
					"text": "Increase your site’s visibility with a suite of built-in SEO tools."
				},
				{
					"@type": "HowToStep",
					"position": 6,
					"name": "Publish your website",
					"text": "Register and connect a custom domain name and go live."
				},
				{
					"@type": "HowToStep",
					"position": 7,
					"name": "Promote and drive traffic",
					"text": "Use built-in marketing tools to grow and expand your reach."
				}
			]
		},
		{
			"@context": "https://schema.org",
			"@type": "Organization",
			"name": "Wix.com",
			"legalName": "Wix.com Ltd",
			"url": "https://www.wix.com/",
			"logo": "https://static.wixstatic.com/media/9ab0d1_2ff5ca18550e405ea1844e52afaff120~mv2.jpg/v1/fill/w_333,h_154,al_c,lg_1,q_80,enc_auto/Wix%20logo%20white%20BG.jpg",
			"sameAs": [
				"https://www.facebook.com/wix",
				"https://twitter.com/wix",
				"https://www.instagram.com/wix",
				"https://www.youtube.com/user/Wix",
				"https://www.linkedin.com/company/wix-com/",
				"https://www.pinterest.com/wixcom/",
				"https://en.wikipedia.org/wiki/Wix.com"
			],
			"contactPoint": {
				"@type": "ContactPoint",
				"contactType": "Customer Support",
				"url": "https://www.wix.com/contact"
			}
		},
		{
			"@context": "https://schema.org",
			"@type": "FAQPage",
			"mainEntity": [
				{
					"@type": "Question",
					"name": "Is it easy to build a website?",
					"acceptedAnswer": {
						"@type": "Answer",
						"text": "Yes. Wix offers a few different ways to create your own free website, so you can choose the creation process that works best for you. Pick from 900+ designer-made templates, or use our AI website builder to create a business-ready site in no time using a conversational interface. You can also start from scratch using Wix’s drag and drop website builder. Whichever way you choose, you can always continue customizing in the Editor for total website design freedom."
					}
				},
				{
					"@type": "Question",
					"name": "How do I create a website?",
					"acceptedAnswer": {
						"@type": "Answer",
						"text": "1. Plan your website. First, think about the type of site you’re creating and your target audience. With that in mind, you can start mapping out the pages you want to incorporate like the “About” and “Contact” pages, perhaps a blog or a photo gallery, and a page for products or services. 2. Build with AI or choose a template. Chat with our AI website builder about your business and preferences to instantly get a fully functional and customizable website built for you. You can also start by choosing from a variety of templates, all professionally designed with the best site practices in mind. 3. Customize your website. Whether you start with AI or a template, you can use our intuitive drag and drop editor to customize your site to match your brand. Your site will also be optimized for mobile, but in the editor you have the option to make changes and customize your site’s mobile view. 4. Get a domain name. When trying to come up with the perfect domain name, you can use Wix to search and register available names, or connect an existing one to your new site. 5. Optimize for search engines. Use a suite of advanced SEO tools to help you optimize your site and increase organic traffic. 6. Publish and promote your website. Once you’re happy with your site, you’re ready to hit ‘publish’ and start gaining traffic. Now’s the time to promote your site with built-in marketing tools and streamline your customer management processes with a smart CRM system. By following these steps, you'll be able to build a powerful online presence that drives business growth."
					}
				},
				{
					"@type": "Question",
					"name": "How do I choose the best website builder?",
					"acceptedAnswer": {
						"@type": "Answer",
						"text": "The best website builder is one that will support your specific needs and business goals. As you research which builder is right for you, here are some important factors to consider: Ease of use: Look for a website maker with an intuitive interface. Using a drag and drop editor that includes AI tools can simplify the web design process and make it more efficient. Website templates: Look for a platform that offers a large selection of customizable templates tailored to a variety of industries and business types. Business solutions: Some website builders include built-in solutions to help you manage and grow your business seamlessly like eCommerce, scheduling, marketing tools and more. Security: It’s important that the website builder you choose offers reliable web hosting with data encryption to protect your and your users’ data. Customer support: Reliable support is crucial. Check what support options your website builder provides, and make sure you’ll be able to get the help you need 24/7. Wix fulfills all these requirements, giving you the tools you need to create a website and confidently run and grow your business all in one place."
					}
				},
				{
					"@type": "Question",
					"name": "How long does it take to build a website?",
					"acceptedAnswer": {
						"@type": "Answer",
						"text": "The amount of time it takes to build a website is now quicker than ever thanks to technological advancements in the field. In fact, you can build a functional and professional looking website in a matter of hours on a platform like Wix. To make the process even faster, you can chat with Wix’s AI website creator about the kind of site you want and you’ll have a draft of your website ready in minutes. Then you can customize it to make it your own with the drag and drop editor."
					}
				},
				{
					"@type": "Question",
					"name": "How much does it cost to build a website?",
					"acceptedAnswer": {
						"@type": "Answer",
						"text": "The cost of building a website varies depending on the features you need and whether you opt to build it yourself with a website builder or pay a developer to build it for you. On a website builder such as Wix, you can build as many websites as you want for free, however, you’ll have to upgrade to a Premium plan to connect a custom domain and get advanced business features. The cost of creating a website may be significantly higher if you don’t opt for an all-inclusive platform like Wix."
					}
				}
			]
		},
		{
			"@context": "https://schema.org",
			"@type": "VideoObject",
			"name": "Wix website builder",
			"description": "Have a conversation with our AI website builder to create your own personalized, business-ready site.",
			"thumbnailUrl": "https://static.wixstatic.com/media/0784b1_1c525438f02149c3b0489e0d789890f0f002.png",
			"uploadDate": "2024-09-24",
			"duration": "PT0M20S",
			"contentUrl": "https://video.wixstatic.com/video/0784b1_1c525438f02149c3b0489e0d789890f0/360p/mp4/file.mp4"
		}
	],
	"hreflangLinks": {
		"fr": "https://fr.wix.com",
		"uk": "https://uk.wix.com",
		"pt": "https://pt.wix.com",
		"cs": "https://cs.wix.com",
		"it": "https://it.wix.com",
		"vi": "https://vi.wix.com",
		"nl": "https://nl.wix.com",
		"ko": "https://ko.wix.com",
		"de": "https://de.wix.com",
		"ru": "https://ru.wix.com",
		"th": "https://th.wix.com",
		"id": "https://id.wix.com",
		"sv": "https://sv.wix.com",
		"tr": "https://tr.wix.com",
		"da": "https://da.wix.com",
		"en": "https://www.wix.com",
		"es": "https://es.wix.com",
		"ja": "https://ja.wix.com",
		"x-default": "https://www.wix.com",
		"no": "https://no.wix.com",
		"zh": "https://zh.wix.com",
		"pl": "https://pl.wix.com"
	},
	"socialLinks": {
		"facebook": "https://www.facebook.com/wix",
		"youtube": "https://www.youtube.com/user/Wix",
		"instagram": "https://www.instagram.com/wix",
		"tiktok": "https://www.tiktok.com/@wix",
		"pinterest": "https://www.pinterest.com/wixcom",
		"twitter": "https://twitter.com/wix",
		"linkedin": "https://www.linkedin.com/company/wix-com?trk=biz-companies-cym"
	},
	"detectedTechnologies": [
		"Wix.com Website Builder"
	],
	"h1": "Create a website without limits",
	"allH1s": [
		"Create a website without limits"
	],
	"allH2s": [
		"Create your site in minutes with our AI website builder",
		"Or choose a professionally designed template",
		"Customize to make it your own",
		"Add anything you need for your business as you go",
		"Make your website official with your own domain name",
		"Market your site from launch to scale",
		"Run your business from one dashboard",
		"Grow your website on a rock-solid foundation",
		"How to create a website for free",
		"Thriving with Wix",
		"Get inspired, gain new skills and see what’s trending",
		"Made on Wix",
		"We’re here for you 24/7",
		"Website builder FAQ",
		"Your vision. Your goals. Your website."
	],
	"wordCount": 9996
}
```
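
Each dataset item is a plain JSON object, so it can be post-processed with a few dictionary lookups. The following is a minimal Python sketch, not part of the Actor itself: the helper names `rules_for_agent` and `summarize` and the trimmed `SAMPLE_ITEM` are illustrative assumptions, and the code assumes that fields such as `robotsTxt` may be absent (for instance when `disableDomainAnalysis` is `true`).

```python
# Hypothetical sample: a trimmed copy of the Shopify item shown above.
SAMPLE_ITEM = {
    "url": "https://www.shopify.com",
    "robotsTxt": {
        "userAgents": {
            "Googlebot": {"allow": [], "disallow": ["/llms.txt"]},
            "*": {"allow": [], "disallow": ["/oauth", "/meta.json"]},
        }
    },
    "sitemapFileUrls": [
        "https://www.shopify.com/sitemaps_list.xml",
        "https://www.shopify.com/sitemap.xml",
    ],
    "metaTags": {"title": "Shopify: The All-in-One Commerce Platform for Businesses - Shopify"},
    "jsonLd": [{"@type": "Corporation", "name": "Shopify"}],
    "hreflangLinks": {"en": "https://www.shopify.com", "de": "https://www.shopify.com/de"},
    "wordCount": 969,
}


def rules_for_agent(item, user_agent):
    """Return the allow/disallow lists for a user agent, falling back to "*"."""
    agents = (item.get("robotsTxt") or {}).get("userAgents") or {}
    # Compare user-agent names case-insensitively.
    by_lower = {name.lower(): rules for name, rules in agents.items()}
    rules = by_lower.get(user_agent.lower())
    if rules is None:
        rules = by_lower.get("*", {})
    return {
        "allow": list(rules.get("allow") or []),
        "disallow": list(rules.get("disallow") or []),
    }


def summarize(item):
    """Collapse one dataset item into a few flat fields for reporting."""
    meta = item.get("metaTags") or {}
    json_ld = item.get("jsonLd") or []
    return {
        "url": item.get("url"),
        "title": meta.get("title"),
        "sitemaps": len(item.get("sitemapFileUrls") or []),
        "json_ld_types": [b.get("@type") for b in json_ld if isinstance(b, dict)],
        "languages": sorted((item.get("hreflangLinks") or {}).keys()),
        "word_count": item.get("wordCount"),
    }


print(rules_for_agent(SAMPLE_ITEM, "googlebot"))
print(summarize(SAMPLE_ITEM))
```

The lookup only returns the stored patterns; it does not evaluate wildcard rules such as `*/ppc/*` against a concrete path.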

***

### 🧠 Features

- 🔍 Extracts full SEO metadata (title, description, keywords, social tags)
- 🤖 Parses **robots.txt** with detailed user-agent rules
- 🗺️ Finds all sitemap files for deeper crawling potential
- 📊 Counts visible words and extracts H1/H2 content
- 🧱 Detects frameworks and CMS technologies automatically
- ⚡ Supports **Fast Mode** to skip domain-level scans for large datasets
- 🛡️ Built with **Crawlee + Apify SDK** for reliability and scalability

***

### 💡 Use Cases

- **SEO Audits:** Evaluate site structure, metadata, and content depth
- **Competitive Research:** Identify frameworks and SEO strategies used by other sites
- **Data Enrichment:** Add structured metadata and robot rules to domain databases
- **Web Intelligence:** Analyze robots/sitemap behavior across industries
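
As an illustration of the SEO audit use case, the sketch below runs a few checks over one output item. It is a hedged example rather than anything the Actor does itself: the function name `audit_item` and the thresholds `max_title_len=60` and `min_words=300` are assumptions chosen for demonstration, and the field names follow the example output above.

```python
def audit_item(item, max_title_len=60, min_words=300):
    """Return a list of human-readable issues found in one dataset item.

    The thresholds are illustrative defaults, not values defined by the Actor.
    """
    issues = []
    meta = item.get("metaTags") or {}

    title = meta.get("title") or ""
    if not title:
        issues.append("missing title")
    elif len(title) > max_title_len:
        issues.append(f"title longer than {max_title_len} characters")

    if not meta.get("description"):
        issues.append("missing meta description")
    if not meta.get("canonical"):
        issues.append("missing canonical URL")

    h1_count = len(item.get("allH1s") or [])
    if h1_count != 1:
        issues.append(f"expected exactly one H1, found {h1_count}")

    if (item.get("wordCount") or 0) < min_words:
        issues.append(f"fewer than {min_words} visible words")

    return issues


# Hypothetical item with only the fields the audit reads.
example = {
    "metaTags": {
        "title": "Create a website",
        "description": "Build a site in minutes.",
        "canonical": "https://www.example.com",
    },
    "allH1s": ["Create a website without limits"],
    "wordCount": 9996,
}
print(audit_item(example))
```

Running the same function over every item in a dataset gives a quick per-URL issue list that can be exported alongside the raw metadata.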

***

### 🧑‍💻 Developer Info

**Author:** [CodeScraper](mailto:codescraper011@gmail.com)

***

### 🏷️ Tags

`website-metadata-extractor` · `seo-analyzer` · `metadata-scraper` · `robots-txt-parser` · `sitemap-analyzer` · `jsonld-extractor` · `seo-research` · `web-intelligence` · `apify` · `crawlee`

# Actor input Schema

## `startUrls` (type: `array`):

A list of website URLs to analyze. Each URL must start with `http://` or `https://`, for example `https://example.com` or `http://example.org`.

## `disableDomainAnalysis` (type: `boolean`):

If `true`, the Actor skips fetching and analyzing robots.txt and sitemap.xml. This is faster but provides less data.
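
Because both fields have simple constraints, the input can be checked before a run is started. The following Python sketch is one possible way to do that; the helper name `build_actor_input` is an assumption, and the input shape (a list of URL strings plus a boolean) follows the examples in this document.

```python
from urllib.parse import urlparse


def build_actor_input(urls, disable_domain_analysis=False):
    """Build the Actor input dict, rejecting URLs without an http(s) scheme."""
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(
                f"Unsupported URL (must start with http:// or https://): {raw!r}"
            )
        if url not in cleaned:  # drop exact duplicates while keeping order
            cleaned.append(url)
    return {
        "startUrls": cleaned,
        "disableDomainAnalysis": bool(disable_domain_analysis),
    }


actor_input = build_actor_input(
    ["https://apify.com", "https://www.shopify.com"],
    disable_domain_analysis=True,  # skip robots.txt and sitemap fetching
)
print(actor_input)
```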

## Actor input object example

```json
{
  "startUrls": [
    "https://apify.com",
    "https://www.google.com",
    "https://www.youtube.com",
    "https://www.shopify.com"
  ],
  "disableDomainAnalysis": false
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://apify.com",
        "https://www.google.com",
        "https://www.youtube.com",
        "https://www.shopify.com"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("codescraper/website-metadata-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        "https://apify.com",
        "https://www.google.com",
        "https://www.youtube.com",
        "https://www.shopify.com",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("codescraper/website-metadata-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://apify.com",
    "https://www.google.com",
    "https://www.youtube.com",
    "https://www.shopify.com"
  ]
}' |
apify call codescraper/website-metadata-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=codescraper/website-metadata-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website Metadata Extractor(sitemap, socialLinks, robotsTxt)",
        "description": "A very fast metadata extractor to get all meta tags, robots.txt, sitemaps, social links, H1s, word count, and JSON-LD data. Also provides technology detection for a full analysis. Get your data fast for just $3/month.",
        "version": "1.0",
        "x-build-id": "MZWXbk4DOJr8ljhDQ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/codescraper~website-metadata-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-codescraper-website-metadata-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/codescraper~website-metadata-extractor/runs": {
            "post": {
                "operationId": "runs-sync-codescraper-website-metadata-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/codescraper~website-metadata-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-codescraper-website-metadata-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "A list of website URLs to analyze. Supported urls must start with http:// or https:// for example: https://example.com, http://example.org",
                        "items": {
                            "type": "string"
                        }
                    },
                    "disableDomainAnalysis": {
                        "title": "Disable Domain Analysis (Fast Mode)",
                        "type": "boolean",
                        "description": "If true, the actor will skip fetching and analyzing robots.txt and sitemap.xml. This is faster but provides less data.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
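
The `run-sync-get-dataset-items` path in the specification above can also be called without a client library. The sketch below uses only the Python standard library. The function names and the 300-second default timeout are hypothetical choices, and it assumes the endpoint returns the dataset items as a JSON array, which the specification does not spell out (its `200` response has no schema). No request is sent until `run_sync_get_items` is called.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # taken from the "servers" entry in the spec
ACTOR_PATH = "/acts/codescraper~website-metadata-extractor"


def build_sync_request(token, actor_input):
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    # The spec declares "token" as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    # The request body is the Actor input, sent as application/json.
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync_get_items(token, actor_input, timeout=300):
    """Send the request and decode the response body as JSON.

    Assumption: the response body is a JSON array of dataset items.
    """
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (starts a real run on your account, so it is left commented out):
# items = run_sync_get_items("<YOUR_API_TOKEN>", {"startUrls": ["https://apify.com"]})
# for item in items:
#     print(item)
```

Separating request construction from sending keeps the URL and body easy to inspect or log before any run is started. For long-running inputs, the `/runs` path in the same specification returns run information immediately instead of waiting for completion.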
