# Simple SEO Data Extractor (`onescales/simple-seo-data-extractor`) Actor

Grab SEO data from any webpage or URL and easily export the URL, Title Tag, Meta Description, Meta Keywords, Status Code, H1, H2, Canonical Tag and Meta Robots. Run the scraper on 1 to 100,000 pages, once, on a schedule, or via the API, and get an SEO report for any site.

- **URL**: https://apify.com/onescales/simple-seo-data-extractor.md
- **Developed by:** [One Scales](https://apify.com/onescales) (community)
- **Categories:** SEO tools, E-commerce, Developer tools
- **Stats:** 465 total users, 2 monthly users, 100.0% runs succeeded, 6 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $10.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage, only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

Grab SEO data from any webpage or URL and easily export the URL, Title Tag, Meta Description, Meta Keywords, Status Code, Canonical Tag and Meta Robots. Run the scraper on 1 to 100,000 pages, once, on a schedule, or via the API.

### Use Cases
- **SEO Monitoring**: Track SEO data for your websites or competitors over time.
- **Content Analysis**: Analyze meta tags to optimize webpage content for search engines.
- **SEO Audits**: Collect data for comprehensive SEO audits across multiple pages.
- **Competitor Analysis**: Track SEO data for your competitors over time.

### How to Use
1. **Input URLs**: In the Apify frontend UI, go to the "Input" tab and enter the list of URLs you want to scrape. By default, it’s prefilled with "https://onescales.com/" for testing. Replace it with your own URLs.
2. **Adjust Settings**: Optionally, modify the `timeout` (default: 1200 seconds). The `memory` input (default: 512 MB) is a reference value only; set the actual memory in the run settings if needed.
3. **Run the Scraper**: Click "Start" to begin scraping.
4. **View Results**: Once complete, check the "Dataset" tab for the extracted SEO data.

**Note**: For large batches (e.g., 10,000 pages), consider increasing memory in the run settings beyond 512MB for optimal performance as well as increasing the timeout.

### Output Example
The scraper outputs data to a dataset with these fields:
- `url`: The scraped webpage URL
- `titleTag`: The page’s title tag
- `metaDescription`: The meta description tag content
- `metaKeywords`: The meta keywords tag content
- `metaRobots`: The meta robots tag content
- `canonicalTag`: The canonical tag content
- `numLinks`: Number of `<a href>` links on the page
- `statusCode`: The HTTP status code of the page (e.g. 200, 404, 301)
- `h1Tags`: Lists all h1 tags
- `h2Tags`: Lists all h2 tags
- `h3Tags`: Lists all h3 tags

**Sample Output:**
```json
[
  {
    "url": "https://www.rossstores.com/",
    "statusCode": 200,
    "titleTag": "Ross Dress For Less",
    "metaDescription": "Ross Dress for Less offers the best bargains on the latest trends in clothing, shoes, home decor and more! Find your store today!",
    "metaKeywords": "",
    "metaRobots": "index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1",
    "numLinks": 38,
    "h1Tags": "Ross Stores Dress For Less",
    "h2Tags": "Summer Fun Starts at Ross!\nSummer Fun Starts at Ross!\nWarm Weather Style!\nWarm Weather Style!\nSave 10% today!\nSave 10% today!\nWe’re Hiring!\nWe’re Hiring!\nThe perfect gift for any occasion!\nThe perfect gift for any occasion!\nFound In-Store: Pop-tastic Fun!\nSign Up for Emails\nSign Up for Emails\nOur Company\nSupport\nCredit Card\nPrivacy & Terms",
    "h3Tags": "View More",
    "canonicalTag": "https://www.rossstores.com/"
  } 
]
````
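
In the sample above, the heading fields (`h1Tags`, `h2Tags`, `h3Tags`) appear as single strings with one heading per line, and some headings repeat. The following is a minimal Python sketch for turning such a field into a de-duplicated list. It assumes the newline-joined format seen in the sample; the helper name `split_headings` is hypothetical and not part of the Actor.

```python
def split_headings(value):
    """Split a newline-joined heading string into a de-duplicated list,
    keeping the order in which headings first appear."""
    seen = set()
    headings = []
    for line in (value or "").split("\n"):
        text = line.strip()
        if text and text not in seen:
            seen.add(text)
            headings.append(text)
    return headings


# A shortened item modeled on the sample output above.
item = {
    "url": "https://www.rossstores.com/",
    "h1Tags": "Ross Stores Dress For Less",
    "h2Tags": "Save 10% today!\nSave 10% today!\nOur Company\nSupport",
    "h3Tags": "View More",
}

h2_list = split_headings(item["h2Tags"])
print(h2_list)
```

De-duplicating is a choice made here for readability; keep the raw list if repeated headings matter for your audit.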

### How to Export

Exporting your data is simple and seamless:

- Access Results: After running the scraper, view the collected data in Apify’s interface.
- Select Export Option: Choose the CSV export option to download your data.
- Open in Tools: Import the CSV file into Excel, Google Sheets, or your preferred analysis tool to visualize and explore the data.
- Share or Store: Save the file for future reference or share it with your team for collaborative analysis.
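
If you fetch the dataset items through the API instead of the UI export, a CSV can also be produced locally. The sketch below uses only Python's standard `csv` module and assumes items shaped like the sample output; the `items_to_csv` helper and the column order in `FIELDS` are illustrative choices, not something the Actor defines.

```python
import csv
import io

# Column order is an assumption based on the sample output fields.
FIELDS = [
    "url", "statusCode", "titleTag", "metaDescription", "metaKeywords",
    "metaRobots", "numLinks", "h1Tags", "h2Tags", "h3Tags", "canonicalTag",
]


def items_to_csv(items):
    """Serialize dataset items to CSV text; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


items = [
    {
        "url": "https://www.rossstores.com/",
        "statusCode": 200,
        "titleTag": "Ross Dress For Less",
        "numLinks": 38,
    }
]
csv_text = items_to_csv(items)
print(csv_text)
```

Write `csv_text` to a file opened with `newline=""` to open it in Excel or Google Sheets.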

### Need Help

Have questions or need additional features? We’re here to support you! Just fill out the form at https://docs.google.com/forms/d/e/1FAIpQLSfsKyzZ3nRED7mML47I4LAfNh\_mBwkuFMp1FgYYJ4AkDRgaRw/viewform?usp=dialog and we’ll do our best to help as quickly as possible.

### Related Keywords

SEO, SEO Report, search engine optimization, h1, h2, h3, h4, h5, title tag, meta tag, meta description, SEO tool, actor, AI, API, apify, at scale, auditor, automated, automation, batch, bulk, checker, converter, crawler, cron, CSV, dataset, detector, downloader, Excel, export, exporter, extractor, fetcher, finder, free tool, generator, Google Sheets, HTML, ifttt, instant, JSON, lookup, make, make.com, maker, mass, MCP, monitor, n8n, no-code, no API key required, parser, PDF, pipeline, report, scanner, schedule, scheduled, scraper, spreadsheet, tool, validator, verifier, webhook, workflow, XML, zapier

# Actor input Schema

## `urls` (type: `array`):

An array of URLs to scrape. Example: `['https://example.com/']`

## `timeout` (type: `integer`):

Maximum time in seconds to wait for each page to load.

## `memory` (type: `integer`):

Maximum memory in MB the Actor can use.

## Actor input object example

```json
{
  "urls": [
    "https://onescales.com/"
  ],
  "timeout": 1200,
  "memory": 512
}
```
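
Before starting a run, it can help to check the input locally. The sketch below applies the constraints stated in this document's OpenAPI schema (`urls` is a required array of strings, `timeout` is an integer of at least 60, `memory` is an integer of at least 256). The `build_input` helper is hypothetical, and the Actor may apply further checks of its own.

```python
def build_input(urls, timeout=1200, memory=512):
    """Return an input object, raising ValueError if it violates the schema."""
    if not isinstance(urls, list) or not all(isinstance(u, str) for u in urls):
        raise ValueError("urls must be a list of strings")
    if not isinstance(timeout, int) or timeout < 60:
        raise ValueError("timeout must be an integer >= 60")
    if not isinstance(memory, int) or memory < 256:
        raise ValueError("memory must be an integer >= 256")
    return {"urls": urls, "timeout": timeout, "memory": memory}


run_input = build_input(["https://onescales.com/"])
print(run_input)
```

The resulting dictionary can be passed as `run_input` to the Python client call shown in the API section.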

# Actor output Schema

## `extractedData` (type: `string`):

All extracted SEO data from the scraped URLs

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "urls": [
        "https://onescales.com/"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("onescales/simple-seo-data-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "urls": ["https://onescales.com/"] }

# Run the Actor and wait for it to finish
run = client.actor("onescales/simple-seo-data-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "urls": [
    "https://onescales.com/"
  ]
}' |
apify call onescales/simple-seo-data-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=onescales/simple-seo-data-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Simple SEO Data Extractor",
        "description": "Grab SEO data from any webpage / URL and export the URL, Title Tag, Meta Description, Meta Keywords, Status Code, H1, H2, Canonical Tag and Meta Robots easily. Run the scraper for 1-100,000 pages. Run one time or on schedule or via API and Get an SEO Report for Any Site.",
        "version": "0.3",
        "x-build-id": "ruhtsMVaXqgTbrIxa"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/onescales~simple-seo-data-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-onescales-simple-seo-data-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/onescales~simple-seo-data-extractor/runs": {
            "post": {
                "operationId": "runs-sync-onescales-simple-seo-data-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/onescales~simple-seo-data-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-onescales-simple-seo-data-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "urls"
                ],
                "properties": {
                    "urls": {
                        "title": "URLs to scrape",
                        "type": "array",
                        "description": "An array of URLs to scrape. Example: ['https://example.com/']",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "timeout": {
                        "title": "Timeout for each page (seconds)",
                        "minimum": 60,
                        "type": "integer",
                        "description": "Maximum time in seconds to wait for each page to load.",
                        "default": 1200
                    },
                    "memory": {
                        "title": "Memory limit (MB)",
                        "minimum": 256,
                        "type": "integer",
                        "description": "Maximum memory in MB the actor can use.",
                        "default": 512
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
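
For projects without an Apify client, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following sketch only builds the request with Python's standard library, so it runs without a token or network access. The `build_request` helper is hypothetical, and the commented-out send step assumes the endpoint returns the dataset items as JSON.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/onescales~simple-seo-data-extractor/run-sync-get-dataset-items"


def build_request(token, urls):
    """Build a POST request carrying the Actor input as a JSON body."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"urls": urls}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("<YOUR_API_TOKEN>", ["https://onescales.com/"])
print(request.get_method(), request.full_url)

# Sending the request needs a valid token and network access:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Because this endpoint waits for the run to finish, long batches may be better served by the `/runs` path, which returns run information immediately.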
