Bassetlaw CVS Jobs Scraper

Pricing

from $1.99 / 1,000 results

Scrape bcvs.org.uk (Bassetlaw + Bolsover Drupal 10 site). HTML scrape captures title, body, closing date, contact email, AND the job-spec PDF attachment URL for downstream OCR. ~3-10 live vacancies. No anti-bot. JSON or CSV out, billed per result.

Developer

Muhamed Didovic

Maintained by Community

Scrape the Bassetlaw Community & Voluntary Service jobs board at bcvs.org.uk/latest-vacancies. Drupal 10 site with REST API disabled — the actor HTML-scrapes the listing for /job/<slug> URLs and follows each detail page for body text, closing date, contact email/website, and the job-spec PDF attachment URL. JSON or CSV out, no compute charge per run, just per result.

How Bassetlaw CVS Scraper works

✨ Why use this scraper?

BCVS hosts the voluntary-sector jobs board for Bassetlaw and Bolsover (north Nottinghamshire / Derbyshire). Tracking the local third-sector? Building a CVS network dashboard? Sourcing for paid roles at local charities?

  • 🎯 Two starting points. The /latest-vacancies listing URL (default) or any direct /job/<slug> URL.
  • Single HTTP call for the listing. Drupal renders all 3-10 visible jobs in one SSR'd page.
  • 📋 Detail-page enrichment. One-fetch-per-job extracts title (h1), body HTML, closing date (field-closing-date), contact email + website.
  • 📄 PDF attachment URL captured. Real job specs live in PDFs at field-person-specification. We extract the URL + filename so downstream consumers can fetch / index the PDF separately.
  • 🇬🇧 Bassetlaw + Bolsover focus. Member charities — Citizens Advice, parish councils, support services, faith-based orgs.
  • 📤 Clean exports. One row per vacancy. JSON + CSV exported automatically.

🎯 Use cases

| Team | What they build |
| --- | --- |
| Local CVS network | Cross-region nonprofit hiring intelligence in north Notts / north Derbyshire |
| Sector recruiters | Daily new-vacancy feeds for Bassetlaw / Bolsover charities |
| Researchers | Local third-sector labour-market datasets |
| PDF indexing services | Auto-collect job-spec PDFs into searchable archives |
| Workforce strategy | Salary intelligence (post-PDF-OCR) across local charities |

📥 Supported inputs

| URL pattern | Behaviour |
| --- | --- |
| https://www.bcvs.org.uk/latest-vacancies | Full listing (default) |
| https://www.bcvs.org.uk/job/<slug> | Single job: fetches the detail page directly |

Leave startUrls empty for the full listing.

Not supported: REST API access (Drupal /jsonapi is disabled); hosts outside bcvs.org.uk.
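The routing rules above can be sketched as a small URL classifier. This is a minimal illustration, not the Actor's own code: `classify_start_url` and `plan_requests` are hypothetical names, and accepting the bare `bcvs.org.uk` host alongside `www.bcvs.org.uk` is an assumption.

```python
from urllib.parse import urlparse

LISTING_URL = "https://www.bcvs.org.uk/latest-vacancies"
# Assumption: the bare domain is treated the same as the www host.
ALLOWED_HOSTS = {"www.bcvs.org.uk", "bcvs.org.uk"}


def classify_start_url(url: str) -> str:
    """Return "listing" or "job"; raise ValueError for anything unsupported."""
    parts = urlparse(url)
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"unsupported host: {url}")
    path = parts.path.rstrip("/")
    if path == "/latest-vacancies":
        return "listing"
    if path.startswith("/job/"):
        slug = path[len("/job/"):]
        # A single-job URL has exactly one non-empty segment after /job/.
        if slug and "/" not in slug:
            return "job"
    raise ValueError(f"unsupported path: {url}")


def plan_requests(start_urls):
    """Fall back to the full listing when startUrls is empty."""
    urls = start_urls or [LISTING_URL]
    return [(classify_start_url(u), u) for u in urls]
```

Calling `plan_requests([])` yields a single listing request, which mirrors the "leave startUrls empty" behaviour described above.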

🔄 How it works

  1. Fetch /latest-vacancies — Drupal HTML page (~33 KB).
  2. Harvest /job/<slug> anchors — typically 3-10 jobs visible.
  3. For each (when enrichDetail: true), fetch the detail page.
  4. Parse Drupal field classes:
    • <h1> → title
    • .field--name-body → body HTML + plain text
    • .field--name-field-closing-date → closing date
    • .field--name-field-person-specification a[href$=".pdf"] → job-spec PDF URL
  5. Extract contact email + website from body (first mailto: and first non-bcvs http href).
  6. Push merged row with original listing + detail enrichment.
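Steps 2 and 5 above can be approximated with the Python standard library alone. The sketch below is an illustration under stated assumptions, not the Actor's implementation: `LinkCollector`, `harvest_job_urls` and `extract_contacts` are hypothetical names, and the contact extraction scans whatever HTML fragment it is given rather than scoping to `.field--name-body`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

BASE_URL = "https://www.bcvs.org.uk"


class LinkCollector(HTMLParser):
    """Collect every <a href> value in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_hrefs(html: str) -> list:
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


def is_bcvs_host(host: str) -> bool:
    return host == "bcvs.org.uk" or host.endswith(".bcvs.org.uk")


def harvest_job_urls(listing_html: str) -> list:
    """Step 2: absolute, de-duplicated /job/<slug> URLs, in page order."""
    seen = {}
    for href in collect_hrefs(listing_html):
        absolute = urljoin(BASE_URL, href)
        parts = urlparse(absolute)
        if is_bcvs_host(parts.hostname or "") and parts.path.startswith("/job/"):
            seen.setdefault(absolute, None)
    return list(seen)


def extract_contacts(body_html: str) -> dict:
    """Step 5: first mailto: address and first http(s) link off bcvs.org.uk."""
    email = website = None
    for href in collect_hrefs(body_html):
        lowered = href.lower()
        if lowered.startswith("mailto:"):
            if email is None:
                # Drop any ?subject=... query that follows the address.
                email = href[len("mailto:"):].split("?", 1)[0]
        elif lowered.startswith(("http://", "https://")) and website is None:
            if not is_bcvs_host(urlparse(href).hostname or ""):
                website = href
    return {"email": email, "website": website}
```

The field-class selectors in step 4 need a parser with CSS selector support; that part is deliberately left out here to keep the sketch dependency-free.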

⚙️ Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | ["https://www.bcvs.org.uk/latest-vacancies"] | Listing URL or single-job URLs. Empty = listing. |
| enrichDetail | boolean | true | When true, fetches each detail page. Disable for listing-only output (title + URL only). |
| maxItems | integer | 1000 | Hard cap on rows pushed (typically 3-10 live). |
| maxConcurrency | integer | 3 | Parallel detail-page fetch limit. |
| maxRequestRetries | integer | 5 | Number of retries before a failed request is abandoned. |
| proxy | object | No proxy | The site has no anti-bot protection, so no proxy is needed. |
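As a rough illustration of how these parameters combine, the sketch below builds a run input from the documented defaults and rejects unknown keys. `build_run_input` is a hypothetical helper, and passing `startUrls` as plain strings follows the default shown in the table; the Actor's actual input schema may differ.

```python
import json

# Defaults copied from the parameter table; proxy is omitted (no proxy by default).
DEFAULT_INPUT = {
    "startUrls": ["https://www.bcvs.org.uk/latest-vacancies"],
    "enrichDetail": True,
    "maxItems": 1000,
    "maxConcurrency": 3,
    "maxRequestRetries": 5,
}
KNOWN_KEYS = set(DEFAULT_INPUT) | {"proxy"}


def build_run_input(**overrides) -> dict:
    """Merge overrides onto the defaults, failing fast on misspelled keys."""
    unknown = set(overrides) - KNOWN_KEYS
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**DEFAULT_INPUT, **overrides}


# Listing-only run capped at 100 rows.
run_input = build_run_input(enrichDetail=False, maxItems=100)
print(json.dumps(run_input, indent=2))
```

The resulting dictionary is the kind of payload you would hand to whichever client you use to start the run.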

📊 Output overview

Each scraped vacancy is a single dataset row with type: "job". Listing fields (slug, URL, anchor title) are merged with optional detail-page enrichment (body, closing date, PDF URL, contact info).
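A minimal sketch of that merge, assuming detail-only fields default to null when enrichment is skipped. `merge_row` and `DETAIL_FIELDS` are hypothetical, and letting the detail-page title override the listing anchor text is an assumption rather than documented behaviour.

```python
from typing import Optional

# Fields that only the detail page can supply (names taken from the output sample).
DETAIL_FIELDS = (
    "description",
    "descriptionText",
    "closingDate",
    "applyEmail",
    "jobSpecPdfUrl",
    "jobSpecPdfFilename",
)


def merge_row(listing: dict, detail: Optional[dict] = None) -> dict:
    """Combine listing fields with optional detail-page enrichment."""
    row = {"type": "job", "source": "bcvs.org.uk", **listing}
    for field in DETAIL_FIELDS:
        row[field] = (detail or {}).get(field)
    # Assumption: the <h1> title from the detail page wins over the anchor text.
    if detail and detail.get("title"):
        row["title"] = detail["title"]
    return row
```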

📦 Output sample

{
  "type": "job",
  "source": "bcvs.org.uk",
  "jobId": "clowne-parish-council-assistant-clerk",
  "slug": "clowne-parish-council-assistant-clerk",
  "jobUrl": "https://www.bcvs.org.uk/job/clowne-parish-council-assistant-clerk",
  "title": "Clowne Parish Council - Assistant Clerk",
  "description": "<p>Contact: bcvs@bcvs.org.uk…</p>",
  "descriptionText": "Contact: bcvs@bcvs.org.uk…",
  "companyName": null,
  "companyWebsite": null,
  "companyDomain": null,
  "location": "Bassetlaw / Bolsover, England",
  "remote": false,
  "salary": null,
  "salaryRaw": null,
  "categories": [],
  "employmentTypes": [],
  "contractType": null,
  "status": "publish",
  "postedDate": null,
  "closingDate": null,
  "modifiedDate": null,
  "applyType": "email",
  "applyUrl": "https://www.bcvs.org.uk/job/clowne-parish-council-assistant-clerk",
  "applyEmail": "bcvs@bcvs.org.uk",
  "externalApplyUrl": null,
  "jobSpecPdfUrl": "https://www.bcvs.org.uk/sites/default/files/2026-05/job%20advert%20final.pdf",
  "jobSpecPdfFilename": "job advert final.pdf",
  "scrapedAt": "2026-05-20T00:13:00.000Z"
}
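For downstream consumers, the relationship between jobSpecPdfUrl and jobSpecPdfFilename in the sample looks like a URL-decoded last path segment. The sketch below reproduces that under this assumption and picks out the rows that carry a PDF; `pdf_filename` and `rows_with_pdf` are hypothetical helper names.

```python
from posixpath import basename
from urllib.parse import unquote, urlparse


def pdf_filename(pdf_url: str) -> str:
    """Decode the last path segment of a PDF URL into a readable filename."""
    return unquote(basename(urlparse(pdf_url).path))


def rows_with_pdf(rows: list) -> list:
    """Return (jobUrl, jobSpecPdfUrl) pairs for rows that have an attachment."""
    return [
        (row["jobUrl"], row["jobSpecPdfUrl"])
        for row in rows
        if row.get("jobSpecPdfUrl")
    ]
```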

🗂 Key output fields

| Group | Fields |
| --- | --- |
| Identifiers | type, source, jobId (= slug), slug, jobUrl, scrapedAt |
| Content | title (from h1), description (HTML, from detail page), descriptionText (plain) |
| Dates | closingDate (from field-closing-date) |
| Employer | companyName (null; the H1 typically reads "Employer - Role"), companyWebsite, companyDomain |
| Location | location (always "Bassetlaw / Bolsover, England") |
| Apply flow | applyType (email/external/internal), applyUrl, applyEmail, externalApplyUrl |
| BCVS-specific | jobSpecPdfUrl, jobSpecPdfFilename |

❓ FAQ

Why is companyName always null? BCVS's Drupal field structure doesn't separate the employer name from the job title: the H1 typically reads "Employer - Role" as one string, so companyName is left null.
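If you need an employer name anyway, the "Employer - Role" pattern noted in the key output fields can be split on your side. This is a heuristic sketch, not something the Actor does: `split_employer_and_role` is a hypothetical helper, and it only handles a spaced hyphen, so titles using other separators fall through unchanged.

```python
from typing import Optional, Tuple


def split_employer_and_role(title: str) -> Tuple[Optional[str], str]:
    """Split "Employer - Role" on the first " - "; return (None, title) otherwise."""
    employer, sep, role = title.partition(" - ")
    if not sep or not employer.strip() or not role.strip():
        return None, title.strip()
    return employer.strip(), role.strip()
```

Because the split happens at the first separator, a role that itself contains " - " stays intact on the right-hand side.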

Why is salary always null? Salary information lives in the PDF attachment, which we don't parse. Use jobSpecPdfUrl to fetch + parse the PDF yourself (libraries like pdf-parse / pdfjs-dist work well).
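Once you have extracted plain text from the PDF with your preferred library, a regular expression is often enough for a first pass at salary figures. The pattern below is an illustrative guess at common UK phrasing (pound amounts, optional range, optional pay period); it is an assumption about what the PDFs contain, not a tested parser for BCVS job specs.

```python
import re

SALARY_RE = re.compile(
    r"£\s?\d[\d,]*(?:\.\d{1,2})?"  # first amount, e.g. £12.60 or £24,000
    r"(?:\s*(?:-|\u2013|to)\s*£?\s?\d[\d,]*(?:\.\d{1,2})?)?"  # optional range end
    r"(?:\s*(?:per|an|a|/)\s*(?:hour|annum|year|week|month))?",  # optional period
    re.IGNORECASE,
)


def find_salaries(text: str) -> list:
    """Return every salary-like phrase found in already-extracted PDF text."""
    return [m.group(0).strip().rstrip(",") for m in SALARY_RE.finditer(text)]
```

Treat the matches as candidates to review rather than final values, since figures such as expenses or budgets will match too.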

Why is closingDate sometimes null? Not all charities fill in the field-closing-date Drupal field — it's optional. Closing dates often appear in the PDF instead.

Can I scrape private pages or applicant data? No. Only the public /latest-vacancies listing and public /job/<slug> pages.

How do I limit results? Set maxItems. With only 3-10 live vacancies, maxItems: 100 covers everything safely.

💬 Support

🛠 Additional services

  • Custom output shape, additional fields, or one-off datasets: muhamed.didovic@gmail.com
  • Bundled PDF-spec OCR for a richer dataset (extracts salary, hours, role description from the PDF): drop an email.
  • Similar scrapers for other CVS / volunteer hubs (Doing Good Leeds, VA Rotherham, VAS Sheffield, Barnsley CVS, Community First Yorkshire): drop an email.

🔎 Explore more scrapers

See other scrapers at memo23's Apify profile — covering job boards, real estate, social media, and more.


⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Bassetlaw Community & Voluntary Service (BCVS), bcvs.org.uk, or any of their subsidiaries or affiliates. All trademarks mentioned are the property of their respective owners.

The scraper accesses only the publicly available /latest-vacancies listing page and public /job/<slug> detail pages on bcvs.org.uk — no authenticated endpoints, recruiter-only features, or content behind a login. Users are responsible for ensuring their use complies with bcvs.org.uk's Terms of Service, applicable data-protection law (GDPR, CCPA, etc.), and any contractual obligations of their own organisation.
