# Noindex Directive Validator (`zerobreak/noindex-directive-validator`) Actor

Noindex checker that scans URLs for meta robots and X-Robots-Tag headers, so SEO teams can find pages accidentally blocked from indexing before they drop out of search results.

- **URL**: https://apify.com/zerobreak/noindex-directive-validator.md
- **Developed by:** [ZeroBreak](https://apify.com/zerobreak) (community)
- **Categories:** SEO tools
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$2.99/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan is.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Noindex Directive Validator: Check Any URL for Noindex Tags and Headers

The noindex directive validator checks URLs for indexing blocks that might be hiding pages from Google. Feed it a list of URLs and it reads the meta robots tag and X-Robots-Tag HTTP header on each page, then tells you which ones have noindex set, along with the HTTP status code and the raw directive content.

It is most useful after a site migration, CMS update, or new deployment. A stray noindex on a production page can silently drop it from search results, and checking for stray directives by hand across dozens of pages is slow. This Actor does it in bulk.

### Use cases

- **SEO audits**: verify that no important pages are accidentally blocked from search indexing across an entire site section
- **Post-migration checks**: confirm that noindex tags used on staging have been removed before or after launch
- **CMS validation**: catch cases where a CMS update or plugin added noindex to pages it should not have
- **Developer QA**: run a quick crawlability check on a list of pages before publishing
- **Ongoing monitoring**: schedule regular runs to catch noindex regressions before they affect rankings

### What data does this Actor extract?

The dataset contains one record per checked URL. Each record includes:

```json
{
    "url": "https://apify.com/about",
    "finalUrl": "https://apify.com/about",
    "httpStatus": 200,
    "noindex": false,
    "noindexInMetaRobots": false,
    "noindexInXRobotsTag": false,
    "metaRobotsContent": "index, follow",
    "xRobotsTagContent": "",
    "pageTitle": "About Apify",
    "checkedAt": "2025-06-15T10:23:45.123456+00:00",
    "error": ""
}
````

| Field | Type | Description |
|-------|------|-------------|
| `url` | string | The original URL submitted for checking |
| `finalUrl` | string | The URL after following any redirects |
| `httpStatus` | integer | HTTP status code (200, 301, 404, etc.) |
| `noindex` | boolean | True if noindex was found anywhere on the page |
| `noindexInMetaRobots` | boolean | True if noindex is in a `<meta name="robots">` or `<meta name="googlebot">` tag |
| `noindexInXRobotsTag` | boolean | True if noindex is in the X-Robots-Tag HTTP response header |
| `metaRobotsContent` | string | Full content of the meta robots tag, if present |
| `xRobotsTagContent` | string | Full value of the X-Robots-Tag header, if present |
| `pageTitle` | string | Page title from the `<title>` element |
| `checkedAt` | string | ISO 8601 timestamp of when the check ran |
| `error` | string | Error message if the request failed; empty on success |
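
The fields above can be combined into a short verdict per record. The following is a minimal sketch, not part of the Actor: the `explain` helper is a hypothetical name, and it assumes each record is a plain dictionary shaped like the example output above.

```python
def explain(rec: dict) -> str:
    """Return a one-line verdict for a single result record."""
    url = rec.get("url", "")
    # A non-empty `error` means the request failed, so the flags are not meaningful.
    if rec.get("error"):
        return f"{url}: request failed ({rec['error']})"

    notes = []
    final_url = rec.get("finalUrl") or url
    if final_url != url:
        notes.append(f"redirected to {final_url}")
    if rec.get("noindexInMetaRobots"):
        notes.append(f"noindex in meta tag: {rec.get('metaRobotsContent', '')}")
    if rec.get("noindexInXRobotsTag"):
        notes.append(f"noindex in X-Robots-Tag: {rec.get('xRobotsTagContent', '')}")
    if not rec.get("noindex"):
        notes.append("no noindex found")
    return f"{url} [HTTP {rec.get('httpStatus')}]: " + "; ".join(notes)


sample = {
    "url": "https://apify.com/about",
    "finalUrl": "https://apify.com/about",
    "httpStatus": 200,
    "noindex": False,
    "noindexInMetaRobots": False,
    "noindexInXRobotsTag": False,
    "metaRobotsContent": "index, follow",
    "xRobotsTagContent": "",
    "error": "",
}
print(explain(sample))
# prints: https://apify.com/about [HTTP 200]: no noindex found
```

A record without noindex is not necessarily indexable: a 404 or a redirect to an unrelated page also keeps a URL out of search results, so `httpStatus` and `finalUrl` are worth reading alongside the flags.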

### Input

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `url` | string | | Single URL to check |
| `urls` | array | | List of URLs to check, one per line |
| `maxUrls` | integer | 100 | Maximum number of URLs to process (up to 1000) |
| `requestTimeoutSecs` | integer | 30 | Timeout per request in seconds |
| `proxyConfiguration` | object | Datacenter (Anywhere) | Proxy type and location to use for requests. Optional. |

#### Example input

```json
{
    "urls": [
        "https://apify.com",
        "https://apify.com/about",
        "https://apify.com/pricing"
    ],
    "maxUrls": 100,
    "requestTimeoutSecs": 30,
    "proxyConfiguration": { "useApifyProxy": true }
}
```
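
For lists longer than the per-run limit, the input can be generated in batches. The sketch below is an illustration only: `build_inputs` is a hypothetical helper, and the bounds it clamps to (1 to 1000 URLs, 5 to 120 seconds) are taken from the OpenAPI schema on this page.

```python
def build_inputs(urls, timeout_secs=30, batch_size=1000):
    """Return one Actor input object per batch of URLs."""
    # Drop empty entries and surrounding whitespace.
    cleaned = [u.strip() for u in urls if u and u.strip()]
    # Clamp values to the ranges the input schema accepts.
    batch_size = min(max(batch_size, 1), 1000)
    timeout_secs = min(max(timeout_secs, 5), 120)

    inputs = []
    for start in range(0, len(cleaned), batch_size):
        batch = cleaned[start:start + batch_size]
        inputs.append({
            "urls": batch,
            "maxUrls": len(batch),
            "requestTimeoutSecs": timeout_secs,
            "proxyConfiguration": {"useApifyProxy": True},
        })
    return inputs
```

Each returned dictionary can be passed as the input of one run, for example as `run_input` in the Python client example in the API section.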

### How it works

1. Takes the submitted URLs, deduplicates them, and normalizes missing schemes to `https://`
2. Fetches each URL using an HTTP client that follows redirects
3. Reads the `X-Robots-Tag` response header and checks it for `noindex` or `none`
4. Parses the HTML and looks for `<meta name="robots">` and `<meta name="googlebot">` tags with `noindex` or `none` in their content
5. Pushes a result record per URL with the noindex status, raw directive values, HTTP status code, and page title
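
The steps above can be sketched with the Python standard library. This is not the Actor's source code; it is a simplified illustration in which every function and class name is hypothetical, the HTTP fetch is left out, and the page HTML and response headers are passed in directly. Treating a user-agent prefix such as `googlebot: noindex` in the header as a match is an assumption of this sketch.

```python
import re
from html.parser import HTMLParser

BLOCKING = {"noindex", "none"}
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def normalize_urls(raw_urls):
    """Step 1: add https:// where the scheme is missing, then deduplicate in order."""
    seen, result = set(), []
    for raw in raw_urls:
        url = raw.strip()
        if not url:
            continue
        if not SCHEME_RE.match(url):
            url = "https://" + url
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


def has_noindex(value):
    """Steps 3 and 4: true if a directive string contains noindex or none."""
    tokens = set()
    for part in value.lower().split(","):
        part = part.strip()
        # Assumption: strip a "useragent:" prefix, as in "googlebot: noindex".
        if ":" in part:
            part = part.split(":", 1)[1].strip()
        tokens.add(part)
    return bool(tokens & BLOCKING)


class RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            self.contents.append(attr.get("content", ""))


def check_page(html, headers):
    """Step 5: build the noindex-related fields of a result record."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Header names are case-insensitive, so compare in lowercase.
    x_robots = next(
        (v for k, v in headers.items() if k.lower() == "x-robots-tag"), ""
    )
    in_meta = any(has_noindex(c) for c in parser.contents)
    in_header = has_noindex(x_robots)
    return {
        "noindex": in_meta or in_header,
        "noindexInMetaRobots": in_meta,
        "noindexInXRobotsTag": in_header,
        "metaRobotsContent": "; ".join(parser.contents),
        "xRobotsTagContent": x_robots,
    }
```

Splitting directives on commas and comparing whole tokens avoids false positives from substring matches, for example a value such as `noindexifembedded`.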

### FAQ

**Does this check the robots.txt file?**
No. This Actor checks page-level noindex directives only: the meta robots tag and X-Robots-Tag header. Robots.txt controls crawling access, not indexing, and is a separate concern.

**What counts as noindex?**
Both `noindex` and `none` directives (as defined by Google) trigger the noindex flag. `none` means noindex and nofollow combined.

**Does it check Google-specific noindex tags?**
Yes. The Actor checks both `<meta name="robots">` and `<meta name="googlebot">` tags.

**What happens if a URL redirects?**
The Actor follows redirects and checks the final destination page. Both the original URL and the final URL are recorded in the output.

**How many URLs can I check per run?**
Up to 1,000 URLs per run. Set the `maxUrls` input to control the limit.

**Can I run this on a schedule?**
Yes. Use Apify's scheduling feature to run the Actor automatically at regular intervals to catch noindex regressions over time.

### Integrations

Connect Noindex Directive Validator with other apps and services using [Apify integrations](https://apify.com/integrations). You can integrate with Make, Zapier, Slack, Airbyte, GitHub, Google Sheets, Google Drive, and many more. You can also use [webhooks](https://docs.apify.com/integrations/webhooks) to trigger actions whenever results are available.

Run the noindex checker before a site launch to confirm every page you want indexed is actually indexable.
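
A pre-launch check like this can be turned into a simple pass/fail gate. The sketch below is one possible approach, not a feature of the Actor: `launch_gate` is a hypothetical helper that compares the result records against a list of URLs that must be indexable. It assumes the URLs were submitted with their scheme, so that the `url` field matches the list exactly.

```python
def launch_gate(records, must_be_indexable):
    """Return a list of problems; an empty list means the gate passes."""
    by_url = {rec["url"]: rec for rec in records}
    problems = []
    for url in must_be_indexable:
        rec = by_url.get(url)
        if rec is None:
            problems.append(f"{url}: not checked")
        elif rec.get("error"):
            problems.append(f"{url}: request failed ({rec['error']})")
        elif rec.get("noindex"):
            problems.append(f"{url}: noindex set")
        elif rec.get("httpStatus") != 200:
            # Assumption: anything other than 200 on the final page blocks the launch.
            problems.append(f"{url}: HTTP {rec.get('httpStatus')}")
    return problems
```

In a CI job, a non-empty result can be printed and followed by a non-zero exit code so the deployment stops before the affected pages go live.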

# Actor input Schema

## `url` (type: `string`):

Single URL to check for noindex directives.

## `urls` (type: `array`):

List of URLs to check. Provide one URL per line. Combined with the single URL field above.

## `maxUrls` (type: `integer`):

Maximum number of URLs to process in a single run. Capped at 1000.

## `requestTimeoutSecs` (type: `integer`):

Timeout in seconds for each individual HTTP request.

## `proxyConfiguration` (type: `object`):

Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect.

## Actor input object example

```json
{
  "url": "https://apify.com/about",
  "urls": [
    "https://apify.com",
    "https://apify.com/about"
  ],
  "maxUrls": 100,
  "requestTimeoutSecs": 30,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://apify.com/about",
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("zerobreak/noindex-directive-validator").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://apify.com/about",
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("zerobreak/noindex-directive-validator").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "url": "https://apify.com/about",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call zerobreak/noindex-directive-validator --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=zerobreak/noindex-directive-validator",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Noindex Directive Validator",
        "description": "Noindex checker that scans URLs for meta robots and X-Robots-Tag headers, so SEO teams can find pages accidentally blocked from indexing before they drop out of search results.",
        "version": "0.0",
        "x-build-id": "keSKGzIVqrsnQxLK9"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/zerobreak~noindex-directive-validator/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-zerobreak-noindex-directive-validator",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/zerobreak~noindex-directive-validator/runs": {
            "post": {
                "operationId": "runs-sync-zerobreak-noindex-directive-validator",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/zerobreak~noindex-directive-validator/run-sync": {
            "post": {
                "operationId": "run-sync-zerobreak-noindex-directive-validator",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "url": {
                        "title": "URL",
                        "type": "string",
                        "description": "Single URL to check for noindex directives."
                    },
                    "urls": {
                        "title": "URLs",
                        "type": "array",
                        "description": "List of URLs to check. Provide one URL per line. Combined with the single URL field above.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxUrls": {
                        "title": "Max URLs",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of URLs to process in a single run. Capped at 1000.",
                        "default": 100
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Timeout in seconds for each individual HTTP request.",
                        "default": 30
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
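
The `run-sync-get-dataset-items` path from the specification above can also be called without a client library. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, the token is passed as a query parameter as the specification describes, and decoding the response as a JSON array of dataset items is an assumption, since the specification does not define a response schema for this path.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/zerobreak~noindex-directive-validator/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build the POST request; nothing is sent until it is passed to urlopen."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token, actor_input, timeout=300):
    """Send the request and decode the body (needs network access and a valid token)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the body is a JSON array of dataset items.
        return json.load(response)
```

Because the call waits for the run to finish, the `timeout` value should comfortably exceed the expected run time for the number of URLs submitted.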
