# VA Rotherham Jobs Scraper (`memo23/varotherham-scraper`) Actor

Scrape the varotherham.org.uk South Yorkshire voluntary-sector job board (Wix CMS). One HTTP request, every job inline: title, employer, location, closing date. Rotherham / Barnsley / Doncaster / Sheffield charities. JSON or CSV out, billed per result.

- **URL**: https://apify.com/memo23/varotherham-scraper.md
- **Developed by:** [Muhamed Didovic](https://apify.com/memo23) (community)
- **Categories:** Jobs, Agents, AI
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $1.99 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
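
As a rough illustration of the per-result arithmetic, the sketch below estimates a run's cost from the listed "from" price. It assumes results are the only charged event and that the $1.99 / 1,000 rate applies to your plan; both are assumptions, and `estimated_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
# "From" price listed above; the rate on your plan may differ (assumption).
PRICE_PER_1000_RESULTS_USD = 1.99


def estimated_cost_usd(result_count, price_per_1000=PRICE_PER_1000_RESULTS_USD):
    """Estimate the cost of a run that pushes `result_count` dataset rows."""
    return result_count * price_per_1000 / 1000


# A full-listing run of about 18 vacancies costs a few cents at this rate.
full_listing_cost = estimated_cost_usd(18)
print(f"${full_listing_cost:.4f}")
```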

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

Actors can be integrated into projects in any stack.
Integrations should be safe, well-documented, and production-ready.
The recommended approach for each stack is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## VA Rotherham Jobs Scraper

**Scrape the varotherham.org.uk South Yorkshire voluntary-sector job board.** One HTTP request to a Wix-hosted CMS Collection returns every live vacancy inline: title, employer, location, closing date, and URL. JSON or CSV out, no compute charge per run, just per result.

#### How it works

![How VA Rotherham Scraper works](https://raw.githubusercontent.com/muhamed-didovic/muhamed-didovic.github.io/main/assets/how-it-works-varotherham.png)

#### ✨ Why use this scraper?

Voluntary Action Rotherham (VA Rotherham) hosts a regional jobs board covering Rotherham, Barnsley, Doncaster, Sheffield. Tracking South Yorkshire voluntary-sector hiring? Cross-CVS comparisons across the region? Looking for partner orgs by employer?

- 🎯 **Two starting points.** The `/job` listing URL (default — gives everything) or any direct `/job/<slug>` URL (filters the listing to that one).
- ⚡ **Single HTTP call.** The Wix-rendered listing inlines every job as CMS cards — one ~1.3 MB fetch returns the whole board.
- 🏷️ **Card extraction by anchor pattern.** Wix's DOM order ≠ visual order, so we scan rich-text strings for the `Location → Employer` anchor pattern that uniquely identifies each card.
- 📅 **Closing date inline.** Each card's "Closing : ..." line is captured in `closingDate` as rendered text (for example "20th May 2026"), not converted to ISO.
- 🏙️ **South Yorkshire focus.** Rotherham, Barnsley, Doncaster, Sheffield voluntary orgs — the four-borough city region.
- 📤 **Clean exports.** One row per vacancy. JSON + CSV exported automatically.

#### 🎯 Use cases

| Team | What they build |
|------|-----------------|
| **Regional CVS network** | Cross-borough voluntary-sector hiring intelligence |
| **South Yorkshire researchers** | Voluntary-sector pay benchmarks across the city region |
| **Sector recruiters** | Daily new-vacancy feeds for Rotherham / Barnsley / Doncaster / Sheffield charities |
| **Partner-mapping projects** | "Which charities are hiring this month" snapshots |
| **Funders** | Sector activity tracking by employer |

#### 📥 Supported inputs

| URL pattern | Behaviour |
|---|---|
| `https://www.varotherham.org.uk/job` | **Full listing** (default) |
| `https://www.varotherham.org.uk/job/<slug>` | **Single job** — fetches the listing and filters to that slug |

Leave `startUrls` empty for the full listing.

**Not supported:** detail-page enrichment (Wix is a SPA — the `/job/<slug>` URL serves the same shell as `/job`); hosts outside `varotherham.org.uk`.
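
The two input shapes above can be told apart from the URL path alone. The following is a minimal sketch, not the Actor's own code: `slug_from_url` and `filter_rows` are hypothetical helpers, and treating a listing URL anywhere in `startUrls` as "return everything" is an assumption.

```python
from urllib.parse import urlparse


def slug_from_url(url):
    """Return the slug for a /job/<slug> URL, or None for the /job listing."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) == 2 and parts[0] == "job":
        return parts[1]
    return None


def filter_rows(rows, start_urls):
    """Keep only rows whose slug was requested; empty input means full listing."""
    slugs = {slug_from_url(u) for u in start_urls}
    if not start_urls or None in slugs:
        return rows  # empty input or the listing URL itself: no filtering
    return [row for row in rows if row["slug"] in slugs]


rows = [{"slug": "administrative-assistant"}, {"slug": "support-worker"}]
single = filter_rows(
    rows, ["https://www.varotherham.org.uk/job/administrative-assistant"]
)
print(single)
```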

#### 🔄 How it works

1. **Fetch the `/job` listing once** (~1.3 MB SSR'd HTML).
2. **Extract all rich-text strings in document order** — every Wix CMS field value.
3. **Find the `Location → <value> → Employer → <value>` anchor pattern** in the stream — that's the per-card signature.
4. **Pull title + closing date** from the strings immediately before the anchor (title is always 1-2 positions before `Location`).
5. **Pair cards with `/job/<slug>` hrefs** in document order — one row per slug.
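
The steps above can be sketched in Python as follows. This is an illustration of the anchor-pattern idea, not the Actor's actual source: `extract_cards`, the sample string stream, and the second sample vacancy are hypothetical, and the sketch assumes the strings and slugs have already been pulled from the HTML in document order.

```python
import re

# Matches the card's closing line, e.g. "Closing : 20th May 2026".
CLOSING_RE = re.compile(r"^Closing\s*:\s*(.+)$", re.IGNORECASE)


def extract_cards(strings, slugs):
    """Find Location -> value -> Employer -> value anchors and build one row each."""
    cards = []
    for i in range(len(strings) - 3):
        if strings[i] != "Location" or strings[i + 2] != "Employer":
            continue
        title, closing = None, None
        # Title and closing date sit in the (up to) two strings before the anchor.
        for text in reversed(strings[max(0, i - 2):i]):
            match = CLOSING_RE.match(text)
            if match:
                closing = match.group(1).strip()
            elif title is None:
                title = text
        cards.append({
            "title": title,
            "location": strings[i + 1],
            "companyName": strings[i + 3],
            "closingDate": closing,
        })
    # Pair cards with /job/<slug> hrefs in document order.
    return [dict(card, slug=slug) for card, slug in zip(cards, slugs)]


sample_strings = [
    "Jobs",
    "Administrative Assistant", "Closing : 20th May 2026",
    "Location", "Sheffield", "Employer", "Roundabout Ltd",
    # Made-up second card with the title directly before the anchor.
    "Closing : 1st June 2026", "Support Worker",
    "Location", "Rotherham", "Employer", "Example Charity",
]
sample_slugs = ["administrative-assistant", "support-worker"]
cards = extract_cards(sample_strings, sample_slugs)
```

Scanning for the anchor rather than relying on element positions is what makes this approach tolerant of DOM order differing from visual order.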

#### ⚙️ Input parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `startUrls` | array | `["https://www.varotherham.org.uk/job"]` | Listing URL or single-job URLs (slug filter). |
| `maxItems` | integer | `1000` | Hard cap on rows pushed (the board typically lists ~18 live vacancies). |
| `maxConcurrency` | integer | `1` | Reserved — single page is fetched once. |
| `maxRequestRetries` | integer | `5` | Number of retries before giving up on the listing fetch. |
| `proxy` | object | No proxy | The Wix CDN applies no anti-bot measures, so a proxy is optional. |

#### 📊 Output overview

Each scraped vacancy is one **single dataset row** of `type: "job"`. Title + employer + location + closing date come from the inline Wix card. `description` and `salary` are null — Wix's SPA architecture means the detail body lives behind JS routing and isn't reachable with HTTP-only tools.

#### 📦 Output sample

```json
{
  "type": "job",
  "source": "varotherham.org.uk",
  "jobId": "administrative-assistant",
  "slug": "administrative-assistant",
  "jobUrl": "https://www.varotherham.org.uk/job/administrative-assistant",
  "title": "Administrative Assistant",
  "description": null,
  "descriptionText": null,
  "companyName": "Roundabout Ltd",
  "companyWebsite": null,
  "companyDomain": null,
  "location": "Sheffield",
  "remote": false,
  "salary": null,
  "categories": [],
  "employmentTypes": [],
  "contractType": null,
  "status": "publish",
  "postedDate": null,
  "closingDate": "20th May 2026",
  "modifiedDate": null,
  "applyType": "internal",
  "applyUrl": "https://www.varotherham.org.uk/job/administrative-assistant",
  "applyEmail": null,
  "externalApplyUrl": null,
  "scrapedAt": "2026-05-20T00:13:00.000Z"
}
````

#### 🗂 Key output fields

| Group | Fields |
|---|---|
| **Identifiers** | `type`, `source`, `jobId`, `slug`, `jobUrl`, `scrapedAt` |
| **Content** | `title` (from Wix card) |
| **Dates** | `closingDate` (e.g. "20th May 2026") |
| **Employer** | `companyName` (from card "Employer" field) |
| **Location** | `location` (from card "Location" field) |
| **Apply flow** | `applyType`, `applyUrl` (the VA Rotherham page) |
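
For employer or location tracking, the `companyName` and `location` fields are usually all that is needed. A minimal sketch of counting vacancies per field value is shown below; `count_by` is a hypothetical helper, and all sample rows except the first are made up for illustration.

```python
from collections import Counter

rows = [
    {"companyName": "Roundabout Ltd", "location": "Sheffield"},
    {"companyName": "Roundabout Ltd", "location": "Rotherham"},   # made-up row
    {"companyName": "Example Charity", "location": "Rotherham"},  # made-up row
]


def count_by(rows, field):
    """Count dataset rows per value of `field`, bucketing missing values."""
    return Counter(row.get(field) or "unknown" for row in rows)


vacancies_per_employer = count_by(rows, "companyName")
vacancies_per_location = count_by(rows, "location")
print(vacancies_per_employer.most_common())
```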

#### ❓ FAQ

**Why is `description` always null?**
VA Rotherham runs on Wix, which serves the site as a single-page application. The `/job/<slug>` URL returns the same HTML shell as `/job`, and the actual job body is loaded by JavaScript only. To get the full description you would need a browser automation tool such as Playwright. For most use cases (inventory, employer tracking, closing-date monitoring), the listing data is enough.

**Why is `salary` always null?**
Same reason as above — salary lives in the detail body which we don't render.

**How is closing date formatted?**
As Wix renders it: "20th May 2026". We don't parse to ISO because the format is inconsistent across cards.
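
If you need ISO dates downstream, a best-effort conversion can be applied after export. The sketch below is not part of the Actor: `closing_date_to_iso` is a hypothetical helper that handles the "20th May 2026" shape and returns `None` for anything it cannot parse. It assumes an English locale for month names.

```python
import re
from datetime import datetime


def closing_date_to_iso(text):
    """Convert strings like '20th May 2026' to '2026-05-20', else return None."""
    # Drop ordinal suffixes that directly follow a digit (1st, 2nd, 3rd, 20th).
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", text.strip(), flags=re.IGNORECASE)
    for fmt in ("%d %B %Y", "%d %b %Y"):  # full and abbreviated month names
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unrecognised format: keep the original string alongside


print(closing_date_to_iso("20th May 2026"))
```

Returning `None` rather than raising keeps rows with unusual date text in the dataset, so the raw `closingDate` string remains the source of truth.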

**Can I scrape private pages or applicant data?**
No. Only the public `/job` listing.

**How do I limit results?**
Set `maxItems`. Because the whole board is fetched in a single request, `maxItems` only caps how many rows are pushed to the dataset; it does not reduce the fetch.

#### 💬 Support

- For issues or feature requests, please use the **Issues** tab on the Actor's Apify Console page.
- Author's website: <https://muhamed-didovic.github.io/>
- Email: <muhamed.didovic@gmail.com>

#### 🛠 Additional services

- Custom output shape, additional fields, or one-off datasets: <muhamed.didovic@gmail.com>
- Similar scrapers for other CVS / volunteer hubs (Barnsley CVS, Doing Good Leeds, VAS Sheffield, York CVS): drop an email.
- For API access (no Apify fee, just usage): <muhamed.didovic@gmail.com>

#### 🔎 Explore more scrapers

See other scrapers at [memo23's Apify profile](https://apify.com/memo23) — covering job boards, real estate, social media, and more.

***

### ⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Voluntary Action Rotherham (VA Rotherham), varotherham.org.uk, Wix.com, or any of their subsidiaries or affiliates. All trademarks mentioned are the property of their respective owners.

The scraper accesses only the publicly available `/job` listing page on varotherham.org.uk — no authenticated endpoints, recruiter-only features, or content behind a login. Users are responsible for ensuring their use complies with varotherham.org.uk's Terms of Service, applicable data-protection law (GDPR, CCPA, etc.), and any contractual obligations of their own organisation.

***

### SEO Keywords

va rotherham scraper, scrape varotherham.org.uk, va rotherham jobs api, voluntary action rotherham scraper, rotherham voluntary sector jobs api, south yorkshire charity jobs scraper, Apify va rotherham, rotherham nonprofit jobs scraper, barnsley jobs scraper, doncaster charity jobs api, sheffield voluntary sector jobs, south yorkshire third sector hiring data, wix cms scraper, wix collection scraper, charityjob alternative scraper, doing good leeds alternative scraper, vassheffield alternative scraper, barnsleycvs alternative scraper, uk cvs jobs scraper, regional voluntary sector recruitment data

# Actor input Schema

## `startUrls` (type: `array`):

Supported shapes: `https://www.varotherham.org.uk/job`, `https://www.varotherham.org.uk/job/<slug>`. Leave empty for the full listing.

## `maxItems` (type: `integer`):

Hard cap on rows pushed. VA Rotherham typically lists ~18 live vacancies.

## `maxConcurrency` (type: `integer`):

Reserved — VA Rotherham fetches a single page. Default 1.

## `minConcurrency` (type: `integer`):

Reserved.

## `maxRequestRetries` (type: `integer`):

Number of retries before giving up on the listing fetch.

## `proxy` (type: `object`):

The Wix CDN applies no anti-bot measures, so a proxy is optional.

## Actor input object example

```json
{
  "startUrls": [
    "https://www.varotherham.org.uk/job"
  ],
  "maxItems": 1000,
  "maxConcurrency": 1,
  "minConcurrency": 1,
  "maxRequestRetries": 5,
  "proxy": {
    "useApifyProxy": false
  }
}
```
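
Before calling the Actor, it can help to check an input object against the constraints listed in this section. The sketch below is a client-side convenience only, under the assumption that the defaults and minimums match the input schema shown in this document; `build_input` is a hypothetical helper, not an Apify API.

```python
from urllib.parse import urlparse

# Defaults and minimums copied from the input schema in this document.
DEFAULTS = {
    "startUrls": ["https://www.varotherham.org.uk/job"],
    "maxItems": 1000,
    "maxConcurrency": 1,
    "minConcurrency": 1,
    "maxRequestRetries": 5,
    "proxy": {"useApifyProxy": False},
}
MINIMUMS = {"maxItems": 1, "maxConcurrency": 1, "minConcurrency": 1, "maxRequestRetries": 0}


def build_input(**overrides):
    """Merge overrides into the defaults and reject out-of-range values."""
    actor_input = {**DEFAULTS, **overrides}
    for key, minimum in MINIMUMS.items():
        value = actor_input[key]
        if isinstance(value, bool) or not isinstance(value, int) or value < minimum:
            raise ValueError(f"{key} must be an integer >= {minimum}, got {value!r}")
    for url in actor_input["startUrls"]:
        host = urlparse(url).hostname or ""
        if host != "varotherham.org.uk" and not host.endswith(".varotherham.org.uk"):
            raise ValueError(f"unsupported host in startUrls: {url}")
    return actor_input


single_job_input = build_input(
    startUrls=["https://www.varotherham.org.uk/job/administrative-assistant"],
    maxItems=1,
)
```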

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://www.varotherham.org.uk/job"
    ],
    "proxy": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("memo23/varotherham-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": ["https://www.varotherham.org.uk/job"],
    "proxy": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("memo23/varotherham-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://www.varotherham.org.uk/job"
  ],
  "proxy": {
    "useApifyProxy": false
  }
}' |
apify call memo23/varotherham-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=memo23/varotherham-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "VA Rotherham Jobs Scraper",
        "description": "Scrape the varotherham.org.uk South Yorkshire voluntary-sector job board (Wix CMS). One HTTP request, every job inline: title, employer, location, closing date. Rotherham / Barnsley / Doncaster / Sheffield charities. JSON or CSV out, billed per result.",
        "version": "0.0",
        "x-build-id": "ctDWhEIEaMHKfdvb7"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/memo23~varotherham-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-memo23-varotherham-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/memo23~varotherham-scraper/runs": {
            "post": {
                "operationId": "runs-sync-memo23-varotherham-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/memo23~varotherham-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-memo23-varotherham-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "varotherham.org.uk URLs",
                        "type": "array",
                        "description": "Supported shapes: `https://www.varotherham.org.uk/job`, `https://www.varotherham.org.uk/job/<slug>`. Leave empty for the full listing.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Maximum jobs to scrape",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Hard cap on rows pushed. VA Rotherham typically lists ~18 live vacancies.",
                        "default": 1000
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Reserved — VA Rotherham fetches a single page. Default 1.",
                        "default": 1
                    },
                    "minConcurrency": {
                        "title": "Min concurrency",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Reserved.",
                        "default": 1
                    },
                    "maxRequestRetries": {
                        "title": "Max request retries",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Retries before the listing fetch is given up.",
                        "default": 5
                    },
                    "proxy": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Wix CDN does not anti-bot — proxy is optional.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
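
For projects that prefer the REST API over a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a minimal sketch: `build_request` and `run_actor` are hypothetical helper names, the 300-second timeout is an arbitrary choice, and the assumption that the response body is a JSON array of dataset items should be checked against the Apify API reference.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/memo23~varotherham-scraper/run-sync-get-dataset-items"


def build_request(token, actor_input):
    """Build the POST request: token as a query parameter, input as a JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, actor_input, timeout=300):
    """Run the Actor synchronously and return the decoded response body."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_actor("<YOUR_API_TOKEN>", {"startUrls": ["https://www.varotherham.org.uk/job"]})` would start a run and block until it finishes, so keep the timeout generous.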
