# Website to RSS Feed Generator (`junipr/website-to-rss`) Actor

Convert any website to RSS 2.0. Smart content detection finds articles automatically. CSS selectors for custom targeting. Configurable field mapping. Schedule for auto updates. Output as valid RSS XML.

- **URL**: https://apify.com/junipr/website-to-rss.md
- **Developed by:** [junipr](https://apify.com/junipr) (community)
- **Categories:** Automation, Developer tools
- **Stats:** 8 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $2.60 / 1,000 feed items extracted

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Website to RSS Feed Generator

Convert any website into a fully compliant RSS 2.0, Atom 1.0, or JSON Feed. This Actor crawls a target URL, automatically detects article patterns on the page, and generates a structured feed you can subscribe to in any feed reader. It is well suited to monitoring websites that don't offer their own RSS feeds, integrating blog updates into automation workflows, or aggregating content from multiple sources.

### Features

- **Auto-detection** of article patterns — works out of the box on most blogs, news sites, and content pages without any configuration
- **Custom CSS selectors** for precise extraction when auto-detection isn't enough
- **JavaScript rendering** via Playwright for single-page applications and dynamically loaded content
- **Multiple output formats** — RSS 2.0, Atom 1.0, and JSON Feed 1.1
- **Markdown conversion** — article content is available as both HTML and clean Markdown
- **Relative date parsing** — handles "2 hours ago", "yesterday", ISO 8601, and dozens of other date formats
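
To illustrate the relative date parsing listed above, here is a minimal Python sketch. It is not the Actor's implementation: the `parse_date` function and the small set of patterns it recognizes are assumptions chosen for the example, and the real parser covers many more formats.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit for phrases such as "2 hours ago" (illustrative subset).
_UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400, "week": 604800}
_RELATIVE = re.compile(r"^(\d+)\s+(minute|hour|day|week)s?\s+ago$")


def parse_date(text, now=None):
    """Return an aware UTC datetime for `text`, or None if it is not recognized."""
    now = now or datetime.now(timezone.utc)
    cleaned = text.strip().lower()

    if cleaned == "yesterday":
        return now - timedelta(days=1)

    match = _RELATIVE.match(cleaned)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(seconds=amount * _UNIT_SECONDS[unit])

    try:
        # Older Python versions reject a trailing "Z", so normalize it first.
        parsed = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    except ValueError:
        return None
    # Treat naive timestamps as UTC (an assumption of this sketch).
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Passing an explicit `now` keeps the function deterministic, which makes relative phrases easy to test.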

### How It Works

1. The Actor loads the target URL (using Cheerio for static sites, or Playwright for JS-rendered sites)
2. If no custom selectors are provided, it scans the page for common article patterns (`<article>`, `.post`, `.entry`, `[class*="article"]`, etc.)
3. Within each detected article container, it extracts the title, link, date, content, author, and thumbnail image
4. The extracted items are pushed to the dataset and assembled into a feed file stored in the Key-Value Store

### Input Configuration

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `url` | string | *required* | Website URL to convert into an RSS feed |
| `feedTitle` | string | auto | Custom feed title (uses site title if empty) |
| `feedDescription` | string | auto | Custom feed description |
| `maxItems` | integer | 20 | Maximum items to include (1-100) |
| `articleSelector` | string | auto | CSS selector for article containers |
| `titleSelector` | string | auto | CSS selector for titles within articles |
| `linkSelector` | string | auto | CSS selector for links within articles |
| `dateSelector` | string | auto | CSS selector for dates within articles |
| `contentSelector` | string | auto | CSS selector for content/summary |
| `authorSelector` | string | auto | CSS selector for author names |
| `imageSelector` | string | auto | CSS selector for thumbnail images |
| `renderJs` | boolean | false | Enable Playwright for JS-rendered sites |
| `outputFormat` | enum | rss2 | Output format: rss2, atom, or json |
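
The fields in the table above map directly onto the JSON input object. The following Python sketch builds such an object and rejects values outside the documented ranges before a run is started. The `build_input` helper, the example site, and its selectors are hypothetical; only the field names, the 1-100 range, and the three output formats come from the table.

```python
ALLOWED_FORMATS = {"rss2", "atom", "json"}
SELECTOR_FIELDS = {
    "articleSelector", "titleSelector", "linkSelector", "dateSelector",
    "contentSelector", "authorSelector", "imageSelector",
}


def build_input(url, max_items=20, output_format="rss2", render_js=False, **selectors):
    """Build an input object for the Actor, validating the documented constraints."""
    if not 1 <= max_items <= 100:
        raise ValueError("maxItems must be between 1 and 100")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")
    unknown = set(selectors) - SELECTOR_FIELDS
    if unknown:
        raise ValueError(f"unknown selector fields: {sorted(unknown)}")

    actor_input = {
        "url": url,
        "maxItems": max_items,
        "renderJs": render_js,
        "outputFormat": output_format,
    }
    # Only include selectors that were set, so the rest stay auto-detected.
    actor_input.update({k: v for k, v in selectors.items() if v})
    return actor_input


# Hypothetical selectors for a site whose posts live in <div class="card">.
example = build_input(
    "https://blog.example.com",
    max_items=10,
    articleSelector="div.card",
    titleSelector="h2 a",
)
```

The resulting dictionary can be passed as `run_input` to the Python client or serialized to JSON for the CLI and REST API.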

### Output

#### Key-Value Store

The generated feed is saved under the key `feed` in the default Key-Value Store. Access it via the Apify API or download it directly from the run's storage tab.
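
As a sketch of the API route, the record can be addressed through the Apify API's key-value store record endpoint. The helper name `feed_record_url` and the store ID `abc123` are placeholders for illustration; the real store ID is the run's `defaultKeyValueStoreId`.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"


def feed_record_url(store_id, key="feed", token=None):
    """Return the Apify API URL of a record in a key-value store."""
    url = (
        f"{API_BASE}/key-value-stores/{quote(store_id, safe='~')}"
        f"/records/{quote(key, safe='')}"
    )
    if token:
        # Private stores need authentication; a query-string token is one option.
        url += "?" + urlencode({"token": token})
    return url


# "abc123" is a placeholder store ID, not a real one.
print(feed_record_url("abc123"))
```

Avoid embedding a token in a URL that you share with a third-party feed reader unless you are comfortable exposing it.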

#### Dataset

Each extracted article is stored as a structured record:

```json
{
  "title": "How to Build a Web Scraper",
  "link": "https://blog.example.com/web-scraper-guide",
  "pubDate": "2026-03-10T14:30:00.000Z",
  "author": "Jane Smith",
  "content": "<p>In this tutorial, we'll build a web scraper...</p>",
  "contentMarkdown": "In this tutorial, we'll build a web scraper...",
  "thumbnail": "https://blog.example.com/images/scraper-thumb.jpg",
  "guid": "https://blog.example.com/web-scraper-guide"
}
````
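
To show how a record like this maps onto RSS 2.0 elements, here is a minimal Python sketch that serializes dataset items with the standard library. It is an illustration under assumptions, not the Actor's own generator: `items_to_rss` is a hypothetical helper, and it leaves out `author` and `thumbnail` for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def items_to_rss(items, feed_title, feed_link, feed_description):
    """Serialize dataset records into a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description

    for record in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "link").text = record["link"]
        ET.SubElement(item, "guid").text = record.get("guid", record["link"])
        if record.get("pubDate"):
            # RSS 2.0 expects RFC 822 dates, so convert from ISO 8601.
            parsed = datetime.fromisoformat(record["pubDate"].replace("Z", "+00:00"))
            ET.SubElement(item, "pubDate").text = format_datetime(parsed)
        if record.get("content"):
            # ElementTree escapes the HTML so it travels safely inside XML.
            ET.SubElement(item, "description").text = record["content"]
    return ET.tostring(rss, encoding="unicode")
```

The same loop could feed an Atom or JSON Feed serializer, since all three formats carry a title, link, identifier, date, and body per entry.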

### Use Cases

- **Content monitoring** — Track updates on competitor blogs, industry news sites, or government portals that lack RSS feeds
- **Automation** — Pipe new articles into Slack, email digests, or databases via integrations with Make, Zapier, or n8n
- **Research** — Aggregate content from multiple sources into a single feed reader for efficient review
- **Archiving** — Store structured snapshots of published content with dates, authors, and full text

### Tips for Best Results

- Start with the defaults — auto-detection works on most sites without any selectors
- If articles aren't detected, inspect the page HTML and provide a custom `articleSelector`
- Enable `renderJs` only when needed — it uses more compute but is required for SPAs
- Use the `maxItems` parameter to control costs on pages with many articles
- The feed file in the Key-Value Store can be served directly as an RSS endpoint when combined with a webhook or scheduled run

### FAQ

#### What if auto-detection fails?

Open the target page in your browser, right-click an article, select "Inspect," and note the CSS selector for the article container. Set that as `articleSelector` in the input. You can similarly override any sub-selector (title, link, date, etc.).

#### Can I schedule this to run periodically?

Yes. Set up a scheduled run on Apify and the feed in the Key-Value Store will be updated on each run. You can point your feed reader at the KV Store URL for a self-updating RSS feed.

#### How does pricing work?

This Actor uses Pay-Per-Event pricing at $3.00 per 1,000 feed items extracted. You only pay for successfully extracted articles, never for failed requests or empty pages.
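
As a rough worked example of that rate, the Python sketch below estimates the charge for a given number of extracted items. The `estimate_cost` helper is hypothetical, and the price is a parameter because the listing's pricing summary quotes a "from $2.60" rate, so the figure that applies to your account may differ.

```python
from decimal import Decimal


def estimate_cost(item_count, price_per_1000="3.00"):
    """Estimate the pay-per-event charge in USD for a number of extracted items."""
    return Decimal(price_per_1000) * item_count / 1000


# One run capped at the default maxItems of 20.
single_run = estimate_cost(20)

# A daily schedule over 30 days, assuming every run extracts the full 20 items.
monthly = estimate_cost(20 * 30)

print(f"single run: ${single_run}, 30 daily runs: ${monthly}")
```

At $3.00 per 1,000 items, 20 items come to 6 cents and 600 items to $1.80.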

# Actor input Schema

## `url` (type: `string`):

The website URL to convert into an RSS feed. This should be a page that lists articles, blog posts, or news items.

## `feedTitle` (type: `string`):

Custom title for the generated RSS feed. If not set, the website's title is used automatically.

## `feedDescription` (type: `string`):

Custom description for the generated RSS feed. If not set, the website's meta description is used.

## `maxItems` (type: `integer`):

Maximum number of articles/items to include in the feed.

## `articleSelector` (type: `string`):

CSS selector for article containers. Leave empty to auto-detect common patterns like `<article>`, `.post`, `.entry`, etc.

## `titleSelector` (type: `string`):

CSS selector for article titles within each article container. Auto-detected if empty (looks for `h1`-`h3`, `.title`, `.headline`).

## `linkSelector` (type: `string`):

CSS selector for article links within each article container. Auto-detected if empty (first `<a>` tag).

## `dateSelector` (type: `string`):

CSS selector for article dates within each article container. Auto-detected if empty (looks for `<time>`, `.date`, `[datetime]`).

## `contentSelector` (type: `string`):

CSS selector for article content or summary within each article container. Auto-detected if empty.

## `authorSelector` (type: `string`):

CSS selector for article author within each article container. Auto-detected if empty (looks for `.author`, `[rel=author]`).

## `imageSelector` (type: `string`):

CSS selector for article thumbnail images within each article container. Auto-detected if empty (first `<img>`).

## `renderJs` (type: `boolean`):

Use a headless browser to render JavaScript before extraction. Enable for SPAs and dynamic websites. Disable for static sites to save compute.

## `outputFormat` (type: `string`):

Format of the generated feed file. RSS 2.0 is the most widely supported.

## `proxyConfiguration` (type: `object`):

Proxy settings for requests. Defaults to Apify datacenter proxies.

## Actor input object example

```json
{
  "url": "https://blog.apify.com",
  "maxItems": 20,
  "renderJs": false,
  "outputFormat": "rss2",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `results` (type: `string`):

Extracted feed items with titles, URLs, dates, summaries, and content from the target website.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://blog.apify.com",
    "maxItems": 20,
    "outputFormat": "rss2",
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("junipr/website-to-rss").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://blog.apify.com",
    "maxItems": 20,
    "outputFormat": "rss2",
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("junipr/website-to-rss").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "url": "https://blog.apify.com",
  "maxItems": 20,
  "outputFormat": "rss2",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call junipr/website-to-rss --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=junipr/website-to-rss",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website to RSS Feed Generator",
        "description": "Convert any website to RSS 2.0. Smart content detection finds articles automatically. CSS selectors for custom targeting. Configurable field mapping. Schedule for auto updates. Output as valid RSS XML.",
        "version": "1.0",
        "x-build-id": "QNbVtq72m0ju4j48H"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/junipr~website-to-rss/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-junipr-website-to-rss",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/junipr~website-to-rss/runs": {
            "post": {
                "operationId": "runs-sync-junipr-website-to-rss",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/junipr~website-to-rss/run-sync": {
            "post": {
                "operationId": "run-sync-junipr-website-to-rss",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "url": {
                        "title": "Website URL",
                        "type": "string",
                        "description": "The website URL to convert into an RSS feed. This should be a page that lists articles, blog posts, or news items.",
                        "default": "https://blog.apify.com"
                    },
                    "feedTitle": {
                        "title": "Feed Title",
                        "type": "string",
                        "description": "Custom title for the generated RSS feed. If not set, the website's title is used automatically."
                    },
                    "feedDescription": {
                        "title": "Feed Description",
                        "type": "string",
                        "description": "Custom description for the generated RSS feed. If not set, the website's meta description is used."
                    },
                    "maxItems": {
                        "title": "Max Feed Items",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of articles/items to include in the feed.",
                        "default": 20
                    },
                    "articleSelector": {
                        "title": "Article Container Selector",
                        "type": "string",
                        "description": "CSS selector for article containers. Leave empty to auto-detect common patterns like <article>, .post, .entry, etc."
                    },
                    "titleSelector": {
                        "title": "Title Selector",
                        "type": "string",
                        "description": "CSS selector for article titles within each article container. Auto-detected if empty (looks for h1-h3, .title, .headline)."
                    },
                    "linkSelector": {
                        "title": "Link Selector",
                        "type": "string",
                        "description": "CSS selector for article links within each article container. Auto-detected if empty (first <a> tag)."
                    },
                    "dateSelector": {
                        "title": "Date Selector",
                        "type": "string",
                        "description": "CSS selector for article dates within each article container. Auto-detected if empty (looks for <time>, .date, [datetime])."
                    },
                    "contentSelector": {
                        "title": "Content/Summary Selector",
                        "type": "string",
                        "description": "CSS selector for article content or summary within each article container. Auto-detected if empty."
                    },
                    "authorSelector": {
                        "title": "Author Selector",
                        "type": "string",
                        "description": "CSS selector for article author within each article container. Auto-detected if empty (looks for .author, [rel=author])."
                    },
                    "imageSelector": {
                        "title": "Image Selector",
                        "type": "string",
                        "description": "CSS selector for article thumbnail images within each article container. Auto-detected if empty (first <img>)."
                    },
                    "renderJs": {
                        "title": "Render JavaScript",
                        "type": "boolean",
                        "description": "Use a headless browser to render JavaScript before extraction. Enable for SPAs and dynamic websites. Disable for static sites to save compute.",
                        "default": false
                    },
                    "outputFormat": {
                        "title": "Output Format",
                        "enum": [
                            "rss2",
                            "atom",
                            "json"
                        ],
                        "type": "string",
                        "description": "Format of the generated feed file. RSS 2.0 is the most widely supported.",
                        "default": "rss2"
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings for requests. Defaults to Apify datacenter proxies.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
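
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The Python sketch below uses only the standard library; the helper names and the 300-second timeout are assumptions made for the example, and error handling is left out.

```python
import json
import urllib.request
from urllib.parse import urlencode

ACTOR_PATH = "junipr~website-to-rss"


def build_run_sync_request(token, actor_input):
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    url = (
        f"https://api.apify.com/v2/acts/{ACTOR_PATH}/run-sync-get-dataset-items?"
        + urlencode({"token": token})
    )
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token, actor_input, timeout=300):
    """Send the request and return the parsed JSON response (the dataset items)."""
    request = build_run_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_and_fetch_items("<YOUR_API_TOKEN>", {"url": "https://blog.apify.com", "maxItems": 20})` starts a run and blocks until it finishes, so choose a timeout that fits the target site.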
