# YouTube Video transcript scraper (`codenest/youtube-video-transcript-scraper`) Actor

Extract YouTube video transcripts with millisecond timestamps and complete video metadata. Output is available as structured JSON with timestamps and as plain text arrays for content analysis.

- **URL**: https://apify.com/codenest/youtube-video-transcript-scraper.md
- **Developed by:** [CodeNest](https://apify.com/codenest) (community)
- **Categories:** Developer tools, Automation
- **Stats:** 8 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$5.00/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper on higher Apify subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## YouTube Video Transcript Scraper - Professional Text Extraction & Analysis Tool

**Extract YouTube video transcripts accurately and reliably with the YouTube Video Transcript Scraper. This Apify Actor batch-scrapes video transcripts while preserving timestamps, structure, and text formatting for content analysis and repurposing.**

---

### 📋 Overview
Need to analyze video content, create subtitles, or extract valuable text data? This **YouTube Video Transcript Scraper** delivers:
  - **Timed transcripts**: Get text with precise timestamps
  - **Structured data**: Clean JSON format for easy processing
  - **Multiple output formats**: Timestamped array and plain text
  - **Video metadata**: Titles and descriptions included
  - **Bulk processing**: Handle multiple videos simultaneously

Perfect for researchers 🔬, content creators ✍️, SEO specialists 🔍, educators 🎓, and developers 💻!

### ⚡ Core Capabilities/Key Features

#### 📄 Transcript Extraction
  - **Dual Formats**: Timestamped transcripts + plain text arrays
  - **Precision Timing**: Millisecond-accurate timestamps (HH:MM:SS.mmm)
  - **Complete Coverage**: Captures all spoken content, including music notations
  - **Bulk Mode**: Process hundreds of YouTube URLs per run

#### 🏷️ Metadata Mastery
  - **Video Titles**: Full YouTube video titles
  - **Descriptions**: Complete video descriptions
  - **Source Tracking**: Original URL preservation
  - **Structured Output**: Clean, parseable JSON data

#### 🔧 Advanced Technical Features
  - **Multi-format Support**: Handles various YouTube URL formats
  - **Error Handling**: Graceful fallback for unavailable transcripts
  - **Scalable Architecture**: Efficient processing for large batches
  - **Rate Limiting**: Intelligent request management

---

### 📥 Input Configuration
Enter the YouTube video URLs in the Input section, click **Start**, and wait for the results. The **YouTube Video Transcript Scraper** accepts several URL formats (youtu.be, youtube.com, and so on), each wrapped in an object with a `url` key:

```json
{
  "video_urls": [
    {
      "url": "https://youtu.be/yPYZpwSpKmA?si=NS85AIIvc2XSiVpr"
    }
  ]
}
````

#### ⚙️ Input Specifications

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `video_urls` | Array | Yes | YouTube video URLs to process |
| `url` | String | Yes | Valid YouTube video URL (youtu.be, youtube.com, etc.) |

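The input shape described above can be assembled from a plain list of URL strings. The following is a minimal sketch, not part of the Actor itself: the helper name `build_actor_input` and the `ALLOWED_HOSTS` set are assumptions made for illustration, and the host list only covers the URL styles that appear in this README.

```python
from urllib.parse import urlparse

# Hosts accepted by this sketch. This list is an assumption based on the
# example URLs in this README, not an official list from the Actor.
ALLOWED_HOSTS = {"youtu.be", "youtube.com", "www.youtube.com", "www.youtubekids.com"}


def build_actor_input(urls):
    """Wrap plain URL strings in the {"video_urls": [{"url": ...}]} shape."""
    seen = set()
    video_urls = []
    for raw in urls:
        url = raw.strip()
        host = (urlparse(url).hostname or "").lower()
        if host not in ALLOWED_HOSTS:
            raise ValueError(f"Not a recognised YouTube URL: {raw!r}")
        if url not in seen:  # drop exact duplicates, keep first-seen order
            seen.add(url)
            video_urls.append({"url": url})
    return {"video_urls": video_urls}
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.
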
***

### 📤 Output Structure

The **YouTube Video Transcript Scraper** generates comprehensive output:

```json
[
  {
    "original_url": "https://youtu.be/yPYZpwSpKmA?si=NS85AIIvc2XSiVpr",
    "Title": "Rick Astley - Together Forever (Official Video) [4K Remaster]",
    "description": "The official video for \"Together Forever\" by Rick Astley...",
    "transcript": [
      {
        "timestamp": "00:00:00.080",
        "text": "[♪♪♪]"
      },
      {
        "timestamp": "00:00:19.400",
        "text": "♪ If there's anything you need ♪"
      }
    ],
    "transcript_text": [
      "[♪♪♪]",
      "♪ If there's anything you need ♪",
      "♪ All you have to do is say ♪"
    ]
  }
]
```

#### 📋 Output Field Documentation

**Video Metadata Section**

| Field | Description |
|-------|-------------|
| `original_url` | Source YouTube video URL |
| `Title` | Full YouTube video title |
| `description` | Complete video description text |

**Transcript Data**

| Field | Description |
|-------|-------------|
| `transcript` | Array of timestamped transcript segments |
| `timestamp` | Start time of a segment, inside each `transcript` entry (HH:MM:SS.mmm format) |
| `text` | Text of the segment, inside each `transcript` entry |
| `transcript_text` | Plain text array of the same segments, without timestamps |

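As an illustration of how these fields might be consumed, the sketch below converts each `HH:MM:SS.mmm` timestamp to seconds and joins `transcript_text` into one string. The function names and the keys of the returned dictionary (`title`, `segments`, `full_text`) are hypothetical; only the input field names come from the output example above.

```python
def timestamp_to_seconds(timestamp):
    """Convert an 'HH:MM:SS.mmm' string to seconds as a float."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def flatten_item(item):
    """Reduce one dataset item to numeric start times plus a single text string."""
    segments = [
        {"start": timestamp_to_seconds(seg["timestamp"]), "text": seg["text"]}
        for seg in item.get("transcript", [])
    ]
    return {
        "title": item.get("Title"),  # note the capital "T" in the Actor output
        "segments": segments,
        "full_text": " ".join(item.get("transcript_text", [])),
    }
```

Numeric start times make it easier to sort, filter, or align segments with other data than the string form does.
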
***

### 🔧 Technical Features

#### 🎯 Precision Extraction

- **Accurate Timestamping**: Time markers with millisecond resolution
- **Text Segmentation**: Logical breakdown of speech segments
- **Symbol Preservation**: Maintains musical notations, punctuation
- **Format Consistency**: Structured, clean output format

#### 🔄 Processing Capabilities

- **Multi-URL Support**: Batch processing of video lists
- **Error Resilience**: Falls back gracefully when a transcript is private, deleted, or otherwise unavailable
- **Format Detection**: Auto-detects available transcript types
- **Performance Optimization**: Fast processing even for long videos

#### 📊 Data Quality

- **Complete Coverage**: Captures all available transcript content
- **Structured Format**: Ready for database import or analysis
- **Clean Output**: Removes unnecessary formatting artifacts
- **Standard Compliance**: Follows JSON best practices

***

### 🎯 Use Cases

The **YouTube Video Transcript Scraper** is ideal for:

#### 📚 Content Analysis & Research

- **Academic Research**: Analyze speech patterns, content trends
- **Market Research**: Study product reviews, tutorials
- **Linguistic Analysis**: Process spoken language data
- **Trend Analysis**: Track topic frequency over time

#### 🎬 Media Production

- **Subtitle Creation**: Generate SRT files for videos
- **Content Repurposing**: Convert videos to blog posts
- **Script Analysis**: Compare spoken vs. written content
- **Accessibility**: Create text versions for viewers who are deaf or hard of hearing

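For the subtitle use case, the `transcript` array can be turned into SRT text. This is a minimal sketch under one important assumption: the Actor output shown in this README contains only start timestamps, so each cue here ends where the next one starts, and the last cue gets a fixed duration (`last_duration_ms`, a made-up parameter). All function names are hypothetical.

```python
def to_millis(timestamp):
    """Convert 'HH:MM:SS.mmm' to integer milliseconds."""
    hours, minutes, seconds = timestamp.split(":")
    whole, _, frac = seconds.partition(".")
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(whole)
    return total_seconds * 1000 + int(frac.ljust(3, "0")[:3])


def srt_time(millis):
    """Format milliseconds as the 'HH:MM:SS,mmm' form used by SRT."""
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


def transcript_to_srt(transcript, last_duration_ms=2000):
    """Build SRT text; each cue ends at the start of the following cue."""
    starts = [to_millis(seg["timestamp"]) for seg in transcript]
    blocks = []
    for index, seg in enumerate(transcript):
        start = starts[index]
        # Assumption: no end times are available, so borrow the next start.
        end = starts[index + 1] if index + 1 < len(starts) else start + last_duration_ms
        blocks.append(f"{index + 1}\n{srt_time(start)} --> {srt_time(end)}\n{seg['text']}\n")
    return "\n".join(blocks)
```

Borrowing the next start time keeps cues from overlapping, but long silent gaps will leave a caption on screen longer than it was spoken, so a maximum cue length may be worth adding for real use.
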
#### 🔍 SEO & Digital Marketing

- **Keyword Analysis**: Extract terms from video content
- **Content Optimization**: Use transcripts for SEO improvements
- **Competitor Analysis**: Study competitor video strategies
- **Content Planning**: Identify topics from transcript data

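A simple form of keyword analysis is counting word frequencies in `transcript_text`. The sketch below is only one possible approach: the stopword list is a tiny illustrative sample, the three-letter minimum is an arbitrary choice, and the tokenizer handles plain ASCII letters only.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"and", "the", "you", "all", "for", "are", "that", "this", "with", "have"}


def top_keywords(transcript_text, limit=10):
    """Return the most frequent words across all transcript lines."""
    text = " ".join(transcript_text).lower()
    tokens = (token.strip("'") for token in re.findall(r"[a-z']+", text))
    counts = Counter(
        token for token in tokens if len(token) >= 3 and token not in STOPWORDS
    )
    return counts.most_common(limit)
```

Music markers such as `♪` are dropped automatically because the pattern matches only letters and apostrophes.
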
#### 💻 Technical Applications

- **Training Data**: For ML/NLP model training
- **Data Pipelines**: Integrate with analysis workflows
- **Application Backends**: Power subtitle features
- **Archival Systems**: Preserve video content as text

***

#### 🤔 Why Choose Our YouTube Video Transcript Scraper?

- **Reliable Accuracy**: High precision in transcript extraction
- **User-Friendly**: Simple interface for both beginners and advanced users
- **Regular Updates**: Maintained to ensure compatibility with YouTube changes
- **Comprehensive Data**: Get all transcript information in structured formats
- **Professional Grade**: Built for enterprise and research applications

***

#### ⚠️ Limitations

- Only works with YouTube videos that have available transcripts
- Requires videos to be publicly accessible
- Subject to YouTube's terms of service and rate limits
- May not capture live stream transcripts in real-time

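Because some videos have no transcript, it can help to separate usable results from the rest before further processing. This README does not show what an item looks like when a transcript is unavailable, so the sketch below assumes such an item has a missing or empty `transcript` field. That assumption, and the function name, should be checked against real run output.

```python
def split_by_transcript(items):
    """Separate items that carry transcript segments from those that do not."""
    with_transcript = []
    without_transcript = []
    for item in items:
        # Assumption: a missing or empty "transcript" means none was available.
        if item.get("transcript"):
            with_transcript.append(item)
        else:
            without_transcript.append(item.get("original_url"))
    return with_transcript, without_transcript
```

The second list holds the source URLs, which makes it straightforward to log or retry them later.
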
***

### 📧 Need Customization?

Want **automated transcript processing**, **real-time transcript monitoring**, or **custom export formats**?

✉ Email *<codenest2.0@gmail.com>* for tailored solutions!  


# Actor input Schema

## `video_urls` (type: `array`):

List of YouTube or YouTube Kids video URLs.

## Actor input object example

```json
{
  "video_urls": [
    {
      "url": "https://www.youtube.com/watch?v=cmGo8hM_nhs"
    },
    {
      "url": "https://www.youtubekids.com/watch?v=k85mRPqvMbE"
    }
  ]
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "video_urls": [
        {
            "url": "https://www.youtube.com/watch?v=cmGo8hM_nhs"
        },
        {
            "url": "https://www.youtubekids.com/watch?v=k85mRPqvMbE"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("codenest/youtube-video-transcript-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "video_urls": [
        {"url": "https://www.youtube.com/watch?v=cmGo8hM_nhs"},
        {"url": "https://www.youtubekids.com/watch?v=k85mRPqvMbE"},
    ]
}

# Run the Actor and wait for it to finish
run = client.actor("codenest/youtube-video-transcript-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "video_urls": [
    {
      "url": "https://www.youtube.com/watch?v=cmGo8hM_nhs"
    },
    {
      "url": "https://www.youtubekids.com/watch?v=k85mRPqvMbE"
    }
  ]
}' |
apify call codenest/youtube-video-transcript-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=codenest/youtube-video-transcript-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "YouTube Video transcript scraper",
        "description": "Easily extract precise YouTube video transcripts with millisecond timestamps, complete video metadata, and multiple output formats including structured JSON with timestamps and plain text arrays for professional content analysis. ❤️YouTube Video transcript scraper❤️.",
        "version": "0.0",
        "x-build-id": "lq9cxPZ9o3YiFWZtt"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/codenest~youtube-video-transcript-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-codenest-youtube-video-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/codenest~youtube-video-transcript-scraper/runs": {
            "post": {
                "operationId": "runs-sync-codenest-youtube-video-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/codenest~youtube-video-transcript-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-codenest-youtube-video-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "video_urls"
                ],
                "properties": {
                    "video_urls": {
                        "title": "YouTube URLs",
                        "type": "array",
                        "description": "List of YouTube video or Kids URLs.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```

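For projects that cannot use the client libraries, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library. The function names are hypothetical, and it assumes the response body is a JSON array of dataset items, which the specification does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "codenest~youtube-video-transcript-scraper"


def build_sync_request(token, video_urls):
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps({"video_urls": [{"url": u} for u in video_urls]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_transcripts(token, video_urls, timeout=300):
    """Send the request and decode the response (assumed to be a JSON array)."""
    request = build_sync_request(token, video_urls)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the URL and payload easy to inspect without making a network call.
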