Pricing

from $12.00 per 1,000 movie scripts

Movie Script Finder & Extractor

Find publicly accessible movie scripts and screenplays, extract clean metadata, and output script text in separate chunk rows for research, indexing, and analysis.

Developer

Inus Grobler

What You Get

Public screenplay discovery from supported script sources
Movie title, writers, genres, source URLs, format, and draft details when available
Plain-text screenplay chunks for sources that expose readable HTML or TXT script text
Compact metadata rows for PDF, external, or metadata-only matches
Error rows for unsupported inputs, extraction failures, or no-result searches
Low-cost defaults: no browser, no proxy, and 128 MB of memory for single-title runs

Best For

Screenplay research datasets
Movie script search and cataloging
LLM or vector-index preparation
Writer, genre, and structure analysis
Building internal screenplay reference tools
Finding public source links for scripts at scale

Supported Sources

The Actor automatically checks supported public sources. You do not need to choose a source.

Source	Support
IMSDb	Metadata and HTML script text
The Daily Script	Metadata, HTML text, and TXT text
SimplyScripts	Metadata, TXT links, PDF links, and conservative external-link handling
Script Slug	Metadata and public PDF links when available

PDF text extraction is not enabled by default. PDF-only matches are returned as metadata/link rows.

Input

Use one of the two public input fields.

One Movie

Use movieName when you want one best-match screenplay.

{
  "movieName": "The Matrix"
}

Multiple Searches

Use searches when you want results for multiple movie titles or search terms.

{
  "searches": ["The Matrix", "Alien", "Terminator"]
}

Input Notes

If movieName and searches are both filled, movieName takes priority.
Keep movie titles specific for best matching.
Results are pushed to the dataset as they are scraped, not only after the run finishes.
Single-title runs use the cheapest defaults. Multi-search runs use more memory because they can return many scripts and chunks.
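
The priority rule above can be mirrored on the client side so that a run never sends both fields. The sketch below is a minimal, hypothetical helper (`build_run_input` is not part of the Actor or the Apify client); it assumes only the two input fields documented above.

```python
from typing import Optional, Sequence


def build_run_input(movie_name: Optional[str] = None,
                    searches: Optional[Sequence[str]] = None) -> dict:
    """Return an input dict containing exactly one of the two public fields.

    Hypothetical helper: it mirrors the documented rule that movieName
    wins when both fields are filled.
    """
    if movie_name and movie_name.strip():
        return {"movieName": movie_name.strip()}
    # Drop empty search terms and trim surrounding whitespace
    cleaned = [s.strip() for s in (searches or []) if s and s.strip()]
    if not cleaned:
        raise ValueError("Provide movieName or at least one search term")
    return {"searches": cleaned}


print(build_run_input("The Matrix", ["Alien"]))
print(build_run_input(searches=["Alien", " Terminator "]))
```

Sending a single field keeps the intent of each run explicit instead of relying on the Actor to discard the unused one.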

Output

Results are available in the default dataset. The Actor emits these row types:

Type	Meaning
`script_metadata`	One summary row for each matched script
`script_chunk`	Plain-text screenplay content split into ordered chunks
`script_analysis`	Optional analysis row in advanced runs
`error`	Invalid input, no results, unsupported source, or extraction failure

In success rows, unknown or unavailable fields are omitted rather than set to null.
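
Because all row types share one dataset and optional fields may be missing, a consumer would typically split rows by `type` and read optional fields defensively. A minimal sketch, using hand-written sample rows in place of a real dataset; `group_by_type` is a hypothetical helper name, and the sample values are illustrative only.

```python
from collections import defaultdict


def group_by_type(items):
    """Group dataset rows by their "type" field."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item.get("type", "unknown")].append(item)
    return dict(grouped)


# Sample rows shaped like the documented row types (values are illustrative)
sample_items = [
    {"type": "script_metadata", "scriptId": "imsdb-the-matrix", "title": "The Matrix"},
    {"type": "script_chunk", "scriptId": "imsdb-the-matrix", "chunkIndex": 1, "chunkText": "..."},
    {"type": "error", "errorType": "NO_RESULTS", "retryable": False},
]

rows = group_by_type(sample_items)
for meta in rows.get("script_metadata", []):
    # sceneCount is absent in this row, so .get() supplies a fallback
    # instead of raising KeyError
    print(meta["title"], meta.get("sceneCount", "n/a"))
```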

Metadata Row Example

{
  "type": "script_metadata",
  "source": "imsdb",
  "scrapedAt": "2026-06-08T07:00:00.000Z",
  "scriptId": "imsdb-the-matrix",
  "scriptUrl": "https://imsdb.com/scripts/Matrix,-The.html",
  "title": "The Matrix",
  "writers": ["Larry Wachowski", "Andy Wachowski"],
  "genres": ["Action", "Sci-Fi", "Thriller"],
  "scriptFormat": "html",
  "hasScriptText": true,
  "chunkCount": 8,
  "wordCount": 23137,
  "characterCount": 143493,
  "sceneCount": 119
}

The metadata row does not contain the full script text.

Chunk Row Example

{
  "type": "script_chunk",
  "source": "imsdb",
  "scrapedAt": "2026-06-08T07:00:00.000Z",
  "scriptId": "imsdb-the-matrix",
  "scriptUrl": "https://imsdb.com/scripts/Matrix,-The.html",
  "title": "The Matrix",
  "chunkIndex": 1,
  "chunkMode": "fixed_size",
  "chunkTitle": "Chunk 1",
  "chunkText": "THE MATRIX\n\nWritten by Larry and Andy Wachowski...",
  "chunkCharacterCount": 19995,
  "chunkWordCount": 3300,
  "nextChunkIndex": 2
}

The default chunking is optimized for cost by using larger chunks, so fewer dataset rows are created while preserving the full extracted script text.
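
Since chunk rows are ordered and together hold the full extracted text, they can be joined back into one string per script. The sketch below assumes that chunks do not overlap and can be concatenated in `chunkIndex` order with no separator; that is an assumption, not something the Actor documents, so the separator is left configurable. `reassemble_scripts` is a hypothetical helper name.

```python
from collections import defaultdict


def reassemble_scripts(items, separator=""):
    """Join script_chunk rows into one text per scriptId, ordered by chunkIndex."""
    chunks_by_script = defaultdict(list)
    for item in items:
        if item.get("type") == "script_chunk":
            chunks_by_script[item["scriptId"]].append(item)
    return {
        script_id: separator.join(
            chunk["chunkText"]
            for chunk in sorted(chunks, key=lambda c: c["chunkIndex"])
        )
        for script_id, chunks in chunks_by_script.items()
    }


# Illustrative rows, deliberately out of order
sample_items = [
    {"type": "script_chunk", "scriptId": "imsdb-the-matrix", "chunkIndex": 2, "chunkText": "INT. ROOM\n"},
    {"type": "script_metadata", "scriptId": "imsdb-the-matrix", "title": "The Matrix"},
    {"type": "script_chunk", "scriptId": "imsdb-the-matrix", "chunkIndex": 1, "chunkText": "FADE IN:\n"},
]

full_text = reassemble_scripts(sample_items)
print(full_text["imsdb-the-matrix"])
```

Sorting by `chunkIndex` rather than relying on dataset order keeps the result stable even if rows from several scripts are interleaved.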

Error Row Example

{
  "type": "error",
  "source": "unknown",
  "scrapedAt": "2026-06-08T07:00:00.000Z",
  "url": "https://apify.com/actors/thescrapelab/screenplay-script-scraper",
  "status": "failed",
  "errorType": "NO_RESULTS",
  "errorMessage": "No matching screenplay results found for: Example Missing Movie",
  "retryable": false
}
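
Error rows arrive in the same dataset as results, so it can help to separate them and decide which inputs are worth rerunning. A minimal sketch based on the `retryable` flag shown above; `split_errors` is a hypothetical helper, and the second sample row uses a made-up `errorType` purely for illustration.

```python
def split_errors(items):
    """Return (retryable, permanent) lists of error rows."""
    errors = [item for item in items if item.get("type") == "error"]
    retryable = [e for e in errors if e.get("retryable")]
    permanent = [e for e in errors if not e.get("retryable")]
    return retryable, permanent


sample_items = [
    {"type": "error", "errorType": "NO_RESULTS", "retryable": False,
     "errorMessage": "No matching screenplay results found for: Example Missing Movie"},
    # EXAMPLE_TRANSIENT_ERROR is an invented value, not a documented error type
    {"type": "error", "errorType": "EXAMPLE_TRANSIENT_ERROR", "retryable": True,
     "errorMessage": "illustrative message"},
    {"type": "script_metadata", "title": "The Matrix"},
]

retryable, permanent = split_errors(sample_items)
print(f"Retry candidates: {len(retryable)}, permanent failures: {len(permanent)}")
for err in permanent:
    print(err.get("errorType"), "-", err.get("errorMessage"))
```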

How To Use The Results

Start the Actor from Apify Console.
Enter either a single movieName or a searches list.
Open the dataset while the run is active to see rows appear during scraping.
Use script_metadata rows for cataloging and filtering.
Use script_chunk rows for text indexing, search, LLM workflows, or downstream analysis.

Python API Example

from apify_client import ApifyClient

# Authenticate with your personal Apify API token
client = ApifyClient("YOUR_APIFY_TOKEN")

run_input = {
    "movieName": "The Matrix",
}

# Start the Actor and wait for the run to finish
run = client.actor("thescrapelab/screenplay-script-scraper").call(run_input=run_input)

# Read all rows from the run's default dataset
dataset_id = run["defaultDatasetId"]
items = client.dataset(dataset_id).list_items(clean=True).items

# Separate summary rows from text chunk rows
metadata_rows = [item for item in items if item.get("type") == "script_metadata"]
chunk_rows = [item for item in items if item.get("type") == "script_chunk"]

print(f"Scripts found: {len(metadata_rows)}")
print(f"Text chunks: {len(chunk_rows)}")

# Optional fields may be omitted, so read them with .get()
for row in metadata_rows:
    print(row.get("title"), row.get("scriptUrl"), row.get("wordCount"))

For multiple searches:

run_input = {
    "searches": ["The Matrix", "Alien", "Terminator"],
}

Cost And Performance

The Actor is tuned to keep run costs low:

Uses lightweight HTTP crawling, not a browser
Uses direct public requests by default, not a proxy
Uses 128 MB memory for single-title runs
Uses larger text chunks by default to reduce dataset item count
Streams rows as they are found

For a typical single-title screenplay such as The Matrix, the Actor returns one metadata row plus a handful of chunk rows (8 in the metadata example above) while preserving the full extracted script text.
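
For rough budgeting, the listed starting price of $12.00 per 1,000 movie scripts can be turned into a simple estimate. The sketch below assumes billing scales linearly at that flat rate and that each matched script counts as one billed unit; both are assumptions, since actual pricing tiers and any platform usage charges are not described here. `estimate_cost_usd` is a hypothetical helper.

```python
def estimate_cost_usd(script_count, price_per_thousand=12.00):
    """Linear cost estimate at a flat per-1,000-scripts rate (assumed)."""
    if script_count < 0:
        raise ValueError("script_count must be non-negative")
    return round(script_count * price_per_thousand / 1000, 2)


# 250 scripts at $12.00 per 1,000
print(estimate_cost_usd(250))  # 3.0
```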

Practical Tips

Use movieName for the cheapest, most focused run.
Use searches when you want broader discovery across multiple titles.
Prefer exact titles over broad words.
Expect metadata-only rows for PDF-only or external sources.
Check hasScriptText and chunkCount to identify rows with extracted screenplay text.
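
The last two tips can be combined into a small filter that separates scripts with extracted text from link-only matches. A minimal sketch with illustrative sample rows; `split_by_text_availability` is a hypothetical helper, and treating a missing `hasScriptText` or `chunkCount` as "no text" is an assumption that follows from optional fields being omitted.

```python
def split_by_text_availability(metadata_rows):
    """Return (with_text, link_only) lists of script_metadata rows."""
    with_text, link_only = [], []
    for row in metadata_rows:
        # Missing fields are treated as "no extracted text"
        has_text = bool(row.get("hasScriptText")) and row.get("chunkCount", 0) > 0
        (with_text if has_text else link_only).append(row)
    return with_text, link_only


sample_rows = [
    {"type": "script_metadata", "title": "The Matrix", "hasScriptText": True, "chunkCount": 8},
    {"type": "script_metadata", "title": "Alien", "hasScriptText": False},
    {"type": "script_metadata", "title": "Terminator"},
]

with_text, link_only = split_by_text_availability(sample_rows)
print("With text:", [r["title"] for r in with_text])
print("Link only:", [r["title"] for r in link_only])
```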

Limitations

The Actor only uses publicly accessible pages.
It does not bypass paywalls, logins, CAPTCHAs, or access controls.
Source websites can change their layout, availability, or robots rules.
Some public sources expose only PDF or external links; those may return metadata rows rather than script text.
Search matching is title-oriented and may return related sequels, remakes, or same-franchise scripts.
Word counts, scene counts, and draft detection are approximate.
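
Because matching is title-oriented and may include sequels or remakes, a post-filter on the returned `title` can narrow results to the exact film. The sketch below is one possible approach, not the Actor's own matching logic: it compares titles after lowercasing and collapsing punctuation. It handles ASCII titles only and would not match reordered forms such as "Matrix, The". Both function names are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.casefold()).strip()


def exact_title_matches(metadata_rows, wanted_title):
    """Keep only rows whose normalized title equals the normalized wanted title."""
    target = normalize_title(wanted_title)
    return [
        row for row in metadata_rows
        if normalize_title(row.get("title", "")) == target
    ]


sample_rows = [
    {"type": "script_metadata", "title": "The Matrix"},
    {"type": "script_metadata", "title": "The Matrix Reloaded"},
    {"type": "script_metadata", "title": "the  matrix "},
]

matches = exact_title_matches(sample_rows, "The Matrix")
print([row["title"] for row in matches])
```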

Legal And Ethical Notice

Movie scripts and screenplays may be copyrighted. This Actor is intended for indexing, metadata extraction, research, discovery, and analysis of publicly available pages.

You are responsible for ensuring that your use complies with copyright law, source website terms, robots.txt, and applicable regulations. The Actor is not a piracy tool and does not bypass access controls.

Support

If a title does not return the expected script, try a more exact movie title. If a source changes or a result looks wrong, rerun with a narrower query and review the source, scriptUrl, errorType, and errorMessage fields in the dataset.
