# Canonical Tag Validator (`zerobreak/canonical-tag-validator`) Actor

Canonical tag validator that checks any webpage for missing, duplicate, or misconfigured canonical URLs — so SEO teams and developers can fix canonical issues before they damage search rankings.

- **URL**: https://apify.com/zerobreak/canonical-tag-validator.md
- **Developed by:** [ZeroBreak](https://apify.com/zerobreak) (community)
- **Categories:** Developer tools, SEO tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$4.99/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan is.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Canonical Tag Validator — Check & Fix Canonical URL Issues on Any Website

This canonical tag validator checks any webpage for missing, duplicate, or misconfigured canonical tags. Audit hundreds of URLs in a single run and get a detailed report of canonical tag issues that could be hurting your search rankings. Built-in cost safeguards help prevent accidental overuse: set URL limits and timeouts to stay in control of your Apify spending.

### Use Cases

- **SEO auditing** — Automatically detect missing or broken canonical tags across your entire site before they damage your search rankings
- **Site migration validation** — Verify that canonical tags are correctly pointing to the new domain after a website migration
- **Duplicate content prevention** — Find pages with mismatched or cross-domain canonical URLs that could cause duplicate content penalties
- **Technical SEO monitoring** — Regularly validate canonical tag health across key landing pages and product URLs
- **QA testing** — Check canonical tags on staging environments before publishing new pages to production
- **Agency reporting** — Batch-validate canonical tags for multiple client sites and export results for review

### Input

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `url` | string | — | A single URL to validate the canonical tag for |
| `urls` | string[] | — | List of URLs to validate (enter one per line in the UI) |
| `maxUrls` | integer | `100` | Maximum URLs to process per run (hard cap: 1000). Prevents accidental overuse |
| `timeoutSecs` | integer | `300` | Overall run timeout in seconds. Actor stops and saves results when reached |
| `requestTimeoutSecs` | integer | `30` | Per-request timeout. Slow pages are skipped after this limit |
| `followRedirects` | boolean | `true` | Follow HTTP redirects to detect canonical mismatches caused by redirect chains |

#### Example Input

```json
{
    "urls": [
        "https://example.com",
        "https://example.com/about",
        "https://example.com/products"
    ],
    "maxUrls": 100,
    "timeoutSecs": 300
}
````
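
The numeric parameters have bounds that are listed in the input schema further down (for example, `maxUrls` between 1 and 1000). A minimal sketch of a client-side pre-check is shown here, assuming you want to catch out-of-range values before starting a run. The `check_input` helper and the `LIMITS` table are hypothetical names, and the rule that at least one of `url` or `urls` must be present is an assumption that the schema does not state explicitly.

```python
# Bounds and defaults copied from the input schema in this document.
# The helper itself is hypothetical and not part of the Actor or any Apify library.
LIMITS = {
    "maxUrls": (1, 1000, 100),
    "timeoutSecs": (30, 3600, 300),
    "requestTimeoutSecs": (5, 120, 30),
}


def check_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks fine."""
    problems = []
    # Assumption: a run without any URL is not useful, so flag it.
    if not run_input.get("url") and not run_input.get("urls"):
        problems.append("provide 'url' or 'urls'")
    for name, (low, high, default) in LIMITS.items():
        value = run_input.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

Calling `check_input(...)` on the example input above returns an empty list, since every value is within its documented range.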

### Output

The Actor stores results in a dataset. Each entry contains the following 13 fields:

```json
{
    "url": "https://apify.com",
    "finalUrl": "https://apify.com/",
    "statusCode": 200,
    "hasCanonicalTag": true,
    "canonicalUrl": "https://apify.com",
    "hasCanonicalHeader": false,
    "canonicalHeaderUrl": null,
    "isValid": true,
    "isSelfReferencing": true,
    "issues": [],
    "pageTitle": "Apify: Full-stack web scraping and data extraction platform",
    "checkedAt": "2026-02-26T09:40:12.345678+00:00",
    "error": null
}
```

| Field | Type | Description |
|-------|------|-------------|
| `url` | string | The original URL that was checked |
| `finalUrl` | string | Final URL after following redirects |
| `statusCode` | integer | HTTP status code of the response |
| `hasCanonicalTag` | boolean | Whether an HTML `<link rel="canonical">` tag was found |
| `canonicalUrl` | string | The canonical URL from the HTML tag |
| `hasCanonicalHeader` | boolean | Whether a `Link` HTTP header with `rel="canonical"` was found |
| `canonicalHeaderUrl` | string or null | The canonical URL from the HTTP header (`null` when no such header was found) |
| `isValid` | boolean | `true` if no issues were found with the canonical setup |
| `isSelfReferencing` | boolean | Whether the canonical tag points back to the same page |
| `issues` | array | List of specific problems found (empty if valid) |
| `pageTitle` | string | The page's `<title>` tag content |
| `checkedAt` | string | ISO 8601 timestamp of when the check ran |
| `error` | string or null | Error message if the URL could not be fetched (`null` otherwise) |
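
Once the dataset items are downloaded, the `error`, `isValid`, and `issues` fields are usually enough to triage a run. The following is a minimal sketch, assuming each entry of `issues` is a plain string (the exact wording of the issue messages is not documented here). The `summarize` function is a hypothetical helper.

```python
from collections import Counter


def summarize(items: list) -> dict:
    """Split dataset items into fetch failures and invalid pages, and count issues."""
    # Pages that could not be fetched at all carry a non-empty `error`.
    failed = [item["url"] for item in items if item.get("error")]
    # Pages that were fetched but have at least one canonical problem.
    invalid = [
        item["url"]
        for item in items
        if not item.get("error") and not item.get("isValid")
    ]
    # Assumption: every issue is a string, so it can be used as a Counter key.
    issue_counts = Counter(
        issue for item in items for issue in (item.get("issues") or [])
    )
    return {"failed": failed, "invalid": invalid, "issue_counts": dict(issue_counts)}
```

Keeping fetch failures separate from invalid pages avoids reporting a timeout as a canonical problem.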

### What Issues Does This Actor Detect?

The canonical tag validator checks for these common SEO problems:

1. **Missing canonical tag** — No `<link rel="canonical">` or `Link` header found
2. **Multiple canonical tags** — More than one canonical tag on a single page
3. **Empty href** — Canonical tag exists but the `href` attribute is blank
4. **Relative canonical URL** — Canonical should always be a fully qualified absolute URL
5. **Protocol mismatch** — Page served over HTTPS but canonical points to HTTP (or vice versa)
6. **Cross-domain canonical** — Canonical points to a different domain (flagged for verification)
7. **Tag vs header conflict** — HTML canonical tag and HTTP `Link` header specify different URLs
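
The checks above can be approximated with the standard library. This is an illustrative sketch only, not the Actor's actual implementation: the `find_issues` function, its arguments, and the issue labels are all assumptions made for the example. It takes canonical values that have already been extracted from the page.

```python
from typing import Optional
from urllib.parse import urlsplit


def find_issues(page_url: str, tag_hrefs: list, header_url: Optional[str] = None) -> list:
    """Apply the seven structural checks to already-extracted canonical values."""
    # 1. Neither an HTML tag nor a Link header was found.
    if not tag_hrefs and header_url is None:
        return ["missing canonical tag"]
    issues = []
    # 2. More than one HTML canonical tag.
    if len(tag_hrefs) > 1:
        issues.append("multiple canonical tags")
    # Prefer the first HTML tag; fall back to the header value.
    canonical = tag_hrefs[0].strip() if tag_hrefs else (header_url or "").strip()
    # 3. The canonical value is blank.
    if not canonical:
        issues.append("empty href")
        return issues
    page, target = urlsplit(page_url), urlsplit(canonical)
    # 4. No scheme or host means the URL is not absolute.
    if not target.scheme or not target.netloc:
        issues.append("relative canonical URL")
    else:
        # 5. HTTPS page pointing to HTTP, or the reverse.
        if target.scheme != page.scheme:
            issues.append("protocol mismatch")
        # 6. Canonical points to another host.
        if target.hostname != page.hostname:
            issues.append("cross-domain canonical")
    # 7. HTML tag and HTTP header disagree.
    if tag_hrefs and header_url is not None and header_url.strip() != canonical:
        issues.append("tag vs header conflict")
    return issues
```

The comparison in check 7 is a plain string match, so two URLs that differ only by a trailing slash would be reported as a conflict. Whether the Actor normalizes URLs before comparing is not stated in this document.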

### How It Works

1. Collects URLs from the `url` and `urls` input fields, deduplicates them, and caps at `maxUrls`
2. Fetches each page with a realistic browser User-Agent, optionally following redirects
3. Parses the HTML for `<link rel="canonical">` tags and checks HTTP `Link` headers
4. Validates the canonical URL for correctness (absolute URL, matching protocol, same domain)
5. Reports issues found and pushes structured results to the Apify dataset
6. Stops automatically when `timeoutSecs` is reached, saving all results collected so far
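
Step 3 can be sketched with Python's standard library. The code below is a simplified illustration under stated assumptions, not the Actor's source: `CanonicalCollector`, `canonical_from_html`, and `canonical_from_link_header` are hypothetical names, and the `Link` header pattern is a simplification that does not handle URLs containing commas.

```python
import re
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # `rel` may hold several space-separated tokens and any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            # Keep empty hrefs so that a blank canonical can be reported later.
            self.hrefs.append(attributes.get("href") or "")


def canonical_from_html(html: str) -> list:
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs


# Matches `<URL>; rel="canonical"` inside one comma-separated Link entry.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;[^,]*\brel\s*=\s*"?canonical"?', re.IGNORECASE)


def canonical_from_link_header(value):
    match = LINK_PATTERN.search(value or "")
    return match.group(1) if match else None
```

Returning a list from `canonical_from_html` rather than a single value makes it possible to detect both the missing and the multiple-tag cases from the same result.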

### Cost Safeguards

This Actor includes built-in protections against accidental overuse:

- **URL cap** — `maxUrls` limits how many pages are processed per run (default: 100, max: 1000)
- **Overall timeout** — `timeoutSecs` stops the actor after a set duration (default: 5 minutes)
- **Per-request timeout** — Slow or unresponsive pages are skipped instead of blocking the run
- **Progress reporting** — Real-time status messages show exactly which URL is being checked
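
The URL cap and the overall timeout can be pictured as follows. This is a sketch of the general technique under assumptions, not the Actor's code: `collect_urls` and `run_with_deadline` are hypothetical helpers, and `check` stands for whatever function validates a single URL.

```python
import time


def collect_urls(run_input: dict) -> list:
    """Merge `url` and `urls`, drop blanks and duplicates, and apply the cap."""
    # The documented hard cap is 1000, with a default of 100.
    max_urls = min(run_input.get("maxUrls", 100), 1000)
    candidates = [run_input.get("url"), *(run_input.get("urls") or [])]
    cleaned = ((candidate or "").strip() for candidate in candidates)
    # dict.fromkeys keeps the first occurrence and preserves input order.
    unique = list(dict.fromkeys(url for url in cleaned if url))
    return unique[:max_urls]


def run_with_deadline(urls, check, timeout_secs=300, clock=time.monotonic):
    """Check URLs one by one and stop once the overall time budget is spent."""
    deadline = clock() + timeout_secs
    results = []
    for url in urls:
        if clock() >= deadline:
            # Results collected so far are kept, mirroring the documented behavior.
            break
        results.append(check(url))
    return results
```

The `clock` parameter exists only so that the deadline logic can be exercised without waiting; in normal use the default `time.monotonic` is sufficient.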

### FAQ

**Can this Actor check canonical tags on JavaScript-rendered pages?**
This Actor fetches raw HTML, which works for the vast majority of websites because most canonical tags are in the initial HTML `<head>` and do not require JavaScript rendering. If your site injects canonical tags via client-side JavaScript, the Actor may not detect them.

**How many URLs can I check in one run?**
Up to 1000 URLs per run (controlled by the `maxUrls` setting). The default is 100. For larger audits, run the Actor multiple times with different URL batches.
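
For audits above the cap, the URL list can be split into one input object per run. A minimal sketch, assuming a hypothetical `build_batch_inputs` helper on the caller's side:

```python
def build_batch_inputs(urls: list, size: int = 1000) -> list:
    """Split a long URL list into Actor inputs of at most `size` URLs each."""
    # 1000 is the documented hard cap for `maxUrls`.
    if not 1 <= size <= 1000:
        raise ValueError("size must be between 1 and 1000")
    return [
        {"urls": urls[start:start + size], "maxUrls": size}
        for start in range(0, len(urls), size)
    ]
```

Each returned dictionary can then be passed as the input of a separate run.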

**Will this Actor follow redirects?**
Yes, by default. This helps detect canonical mismatches caused by redirect chains. You can disable this with the `followRedirects` option.

**What happens if the Actor times out?**
All results collected before the timeout are saved to the dataset. You'll see a warning in the logs indicating how many URLs were processed before the timeout was reached.

**Does this validate that the canonical URL actually exists?**
The Actor validates the canonical tag's structure and correctness (absolute URL, protocol, domain). It does not make a second request to verify that the canonical URL resolves, because that would double the number of HTTP requests and increase costs.

### Integrations

Connect Canonical Tag Validator with other apps and services using [Apify integrations](https://apify.com/integrations). You can integrate with Make, Zapier, Slack, Airbyte, GitHub, Google Sheets, Google Drive, and many more. You can also use [webhooks](https://docs.apify.com/integrations/webhooks) to trigger actions whenever results are available.

# Actor input Schema

## `url` (type: `string`):

A single URL to validate the canonical tag for.

## `urls` (type: `array`):

List of URLs to validate canonical tags for. Enter one URL per line.

## `maxUrls` (type: `integer`):

Maximum number of URLs to process in a single run. Prevents accidental overuse and keeps costs low.

## `timeoutSecs` (type: `integer`):

Maximum total run time in seconds. The Actor stops and saves the results collected so far when this limit is reached. Default is 300 seconds (5 minutes).

## `requestTimeoutSecs` (type: `integer`):

Maximum time to wait for a single page to respond. Slow pages will be skipped after this timeout.

## `followRedirects` (type: `boolean`):

Whether to follow HTTP redirects when fetching pages. Useful for detecting canonical mismatches caused by redirects.

## Actor input object example

```json
{
  "url": "https://example.com",
  "maxUrls": 100,
  "timeoutSecs": 300,
  "requestTimeoutSecs": 30,
  "followRedirects": true
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://example.com"
};

// Run the Actor and wait for it to finish
const run = await client.actor("zerobreak/canonical-tag-validator").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "url": "https://example.com" }

# Run the Actor and wait for it to finish
run = client.actor("zerobreak/canonical-tag-validator").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "url": "https://example.com"
}' |
apify call zerobreak/canonical-tag-validator --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=zerobreak/canonical-tag-validator",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Canonical Tag Validator",
        "description": "Canonical tag validator that checks any webpage for missing, duplicate, or misconfigured canonical URLs — so SEO teams and developers can fix canonical issues before they damage search rankings.",
        "version": "0.0",
        "x-build-id": "W9K02gtFCLd7gURYx"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/zerobreak~canonical-tag-validator/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-zerobreak-canonical-tag-validator",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/zerobreak~canonical-tag-validator/runs": {
            "post": {
                "operationId": "runs-sync-zerobreak-canonical-tag-validator",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/zerobreak~canonical-tag-validator/run-sync": {
            "post": {
                "operationId": "run-sync-zerobreak-canonical-tag-validator",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "url": {
                        "title": "URL",
                        "type": "string",
                        "description": "A single URL to validate the canonical tag for."
                    },
                    "urls": {
                        "title": "URLs",
                        "type": "array",
                        "description": "List of URLs to validate canonical tags for. Enter one URL per line.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxUrls": {
                        "title": "Max URLs",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of URLs to process in a single run. Prevents accidental overuse and keeps costs low.",
                        "default": 100
                    },
                    "timeoutSecs": {
                        "title": "Overall Timeout (seconds)",
                        "minimum": 30,
                        "maximum": 3600,
                        "type": "integer",
                        "description": "Maximum total run time in seconds. The actor will stop and save results collected so far when this limit is reached. Default is 300 seconds (5 minutes).",
                        "default": 300
                    },
                    "requestTimeoutSecs": {
                        "title": "Per-Request Timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Maximum time to wait for a single page to respond. Slow pages will be skipped after this timeout.",
                        "default": 30
                    },
                    "followRedirects": {
                        "title": "Follow Redirects",
                        "type": "boolean",
                        "description": "Whether to follow HTTP redirects when fetching pages. Useful for detecting canonical mismatches caused by redirects.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
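
For projects that cannot use a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library alone. This is a minimal sketch: the function names `build_request` and `run_and_fetch_items` are hypothetical, and the assumption that the response body is a JSON array of dataset items is based only on the endpoint summary, since the specification does not define a response schema.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path taken from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/zerobreak~canonical-tag-validator/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string as specified."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token: str, run_input: dict):
    """Run the Actor synchronously and return the parsed response body."""
    # Assumption: the body is JSON (a list of dataset items).
    with urllib.request.urlopen(build_request(token, run_input)) as response:
        return json.load(response)
```

Because the call blocks until the run finishes, it is best suited to short runs. For longer audits, the `/runs` endpoint returns immediately with run information instead.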
