# Youtube Highlights Hooks Analyzer (`coregent/youtube-highlights-hooks-analyzer`) Actor

Advanced YouTube analytics that extracts chapters, intro pacing, and hook suggestions for editors and creators. Analyze Shorts and long videos to find viral moments, engagement patterns, and optimal clip timestamps with an API-first design for blazing-fast performance.

- **URL**: https://apify.com/coregent/youtube-highlights-hooks-analyzer.md
- **Developed by:** [Delowar Munna](https://apify.com/coregent) (community)
- **Categories:** Videos, Social media, Developer tools
- **Stats:** 25 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $3.20 / 1,000 video analysis results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## YouTube Highlights & Hooks Analyzer 🎣

**Advanced YouTube analytics tool** that extracts chapters, intro pacing metrics, and actionable hook suggestions for video editors and content creators. Analyze both Shorts and long-form videos to identify viral moments, engagement patterns, and optimal clip timestamps using API-first architecture for blazing-fast performance.

---

<p align="center">
  <img src="https://raw.githubusercontent.com/coregentdevspace/youtube-highlights-hooks-analyzer--assets/main/youtube-highlights-hooks-analyzer-thumbnail.png" alt="YouTube Highlights & Hooks Analyzer" width="100%" style="max-width: 100%; height: auto;">
</p>

---

### 🚀 Key Features

- 🎯 **Hook Suggestions**: AI-powered hook recommendations with 3-7s clip titles based on transcript analysis
- 📑 **Chapter Intelligence**: Auto-detect creator chapters and description timestamps
- ⚡ **Intro Pacing Analysis**: First 15-second retention metrics and dialogue change detection
- 📝 **Reliable Transcripts**: Multi-source extraction (Supadata API + youtube-transcript fallback)
- 🔍 **Smart Discovery**: Search, channels, or direct URLs with advanced filtering
- 📊 **Comprehensive Output**: JSON export with thumbnails and engagement metrics
- 🎬 **Shorts & Long-Form**: Optimized analysis for both video formats
- ⚡ **Lightning Fast**: API-only architecture processes 50 videos in ~2 minutes

> **Perfect for**: Video editors finding viral moments, content creators optimizing retention, agencies analyzing competitors, YouTube strategists benchmarking hooks

---

### 🎯 What Makes This Unique?

Unlike basic YouTube scrapers that only extract metadata, this Actor provides **actionable editing insights**:

| Feature | This Actor | Generic Scrapers |
|---------|------------|------------------|
| **Hook Suggestions** | ✅ Timestamp + titles | ❌ Manual work required |
| **Intro Pacing** | ✅ 15s retention + dialogue analysis | ❌ Not analyzed |
| **Chapter Extraction** | ✅ Multiple fallback methods | ⚠️ Limited support |
| **Transcript Analysis** | ✅ Multi-source fallback | ⚠️ Single source |
| **Shorts Detection** | ✅ Auto-detected | ⚠️ Manual filtering |
| **Thumbnails** | ✅ Highest quality (maxres/high) | ⚠️ Basic only |
| **Performance** | ✅ 2-3 sec per video | ❌ 8-10+ sec per video |

**Competitive Advantage**: Only analyzer on Apify combining transcript-based hooks, intro pacing metrics, and API-first architecture for maximum speed and reliability.

---

### 📋 Input Parameters

| Field | Key | Type | Default | Description |
|-------|-----|------|---------|-------------|
| **Start URLs** | `startUrls` | Array<string> | `[]` | Video URLs, channel URLs, or search result URLs (e.g., `youtube.com/watch?v=...`, `youtube.com/@mrbeast`) |
| **Search Query** | `searchQuery` | string | `null` | YouTube search query to find videos (e.g., `"AI tutorial"`, `"viral marketing"`) |
| **Max Videos** | `maxVideos` | integer | `50` | Maximum number of videos to analyze (1-500) |
| **From Date** | `since` | string | `null` | Filter videos published on/after this date (ISO 8601: YYYY-MM-DD) |
| **To Date** | `until` | string | `null` | Filter videos published on/before this date (ISO 8601: YYYY-MM-DD) |
| **Min Views** | `minViews` | integer | `0` | Minimum view count threshold |
| **Max Views** | `maxViews` | integer | `null` | Maximum view count threshold |
| **Duration Filter** | `durationFilter` | string | `"any"` | Filter by video length: `"shorts"` (≤60s), `"under_4m"`, `"4_to_20m"`, `"over_20m"`, `"any"` |
| **Sort By** | `sortBy` | string | `"relevance"` | Sort order: `"relevance"`, `"date"`, `"viewCount"`, `"rating"` |
| **Max Hooks Per Video** | `maxHooksPerVideo` | integer | `10` | Maximum hook suggestions to generate per video (1-25) |
| **Hook Length** | `hookLengthSec` | integer | `7` | Hook clip length in seconds (3-15) |
| **Fetch Transcript** | `fetchTranscript` | boolean | `true` | Extract video transcripts/captions for hook generation |
| **Compute Intro Pacing** | `computeIntroPacing` | boolean | `true` | Analyze first 15 seconds for retention metrics |
| **Dry Run** | `dryRun` | boolean | `false` | Metadata-only mode (skips deep analysis for faster discovery) |

**Important Notes:**
- 🎬 **Multi-Format Support**: Analyzes both Shorts (≤60s) and long-form videos (up to hours)
- 🔍 **Flexible Discovery**: Combine `startUrls`, `searchQuery`, and filters for powerful video discovery
- 📅 **Date Filtering**:
  - Use `since` only to get videos after a date
  - Use `until` only to get videos before a date
  - Use both for a specific date range
  - Leave both empty to get all available videos
- 🔄 **Smart Sorting**: Sort by relevance, date, view count, or rating
- 📝 **Transcripts**: Extracted via Supadata API + youtube-transcript fallback (70-80% availability)
- ⚡ **API Keys**: Pre-configured with rotation (no setup required)

---

### 📤 Output Schema

#### 25+ Fields with Comprehensive Video Intelligence


| # | Field | Type | Description | Source |
|---|-------|------|-------------|--------|
| 1 | **video_id** | String | YouTube video ID (e.g., `dQw4w9WgXcQ`) | YouTube API |
| 2 | **video_url** | String | Full video URL (`youtube.com/watch?v=...`) | Constructed |
| 3 | **thumbnail_url** | String \| null | Highest quality thumbnail URL (maxres → high → medium → default) | YouTube API |
| 4 | **title** | String | Video title | YouTube API |
| 5 | **published_at** | String | Publish date in ISO 8601 format (e.g., `2025-11-04T10:30:00Z`) | YouTube API |
| 6 | **duration_sec** | Integer | Video duration in seconds | YouTube API |
| 7 | **is_shorts** | Boolean | Auto-detected Shorts flag (≤60 seconds) | Duration Check |
| 8 | **view_count** | Integer \| null | Total view count | YouTube API |
| 9 | **like_count** | Integer \| null | Total like count | YouTube API |
| 10 | **comment_count** | Integer \| null | Total comment count | YouTube API |
| 11 | **channel_id** | String | Channel ID (e.g., `UCuAXFkgsw1L7xaCfnd5JJOw`) | YouTube API |
| 12 | **channel_title** | String | Channel display name | YouTube API |
| 13 | **channel_url** | String | Full channel URL | Constructed |
| 14 | **chapters** | Array<Object> | Chapter markers with `title`, `start_sec`, `end_sec`, `type` | Description Parsing |
| 15 | **hooks** | Array<Object> | Hook suggestions with `rank`, `ts`, `hook_title`, `confidence`, `source`, `transcript_excerpt` | AI Analysis |
| 16 | **intro_pacing** | Object | First 15s metrics: `dialogue_changes_first_15s`, `words_per_second_first_15s`, `first_15s_retention_score` | Transcript Analysis |
| 17 | **transcript.available** | Boolean | Whether transcript was extracted | Supadata/Fallback |
| 18 | **transcript.language** | String \| null | Transcript language code (e.g., `en`) | Supadata/Fallback |
| 19 | **transcript.source** | String \| null | Transcript source: `supadata` or `youtube-transcript` | API Detection |
| 20 | **transcript.word_count** | Integer \| null | Total word count in transcript | Computed |
| 21 | **transcript.duration_covered_sec** | Integer \| null | Duration covered by transcript | Computed |
| 22 | **transcript.segments** | Array<Object> \| null | Full transcript with timestamps (`text`, `start`, `duration`) | Supadata/Fallback |
| 23 | **replay_heat** | Array | Empty (not available in API-only mode) | N/A |
| 24 | **replay_max_score** | Integer | Always 0 (not available in API-only mode) | N/A |
| 25 | **replay_peaks** | Array | Empty (not available in API-only mode) | N/A |
| 26 | **analysis_metadata** | Object | Processing notes, timing, features analyzed, actor version, mode | Internal |
| 27 | **billing_events** | Array<Object> | API usage tracking (YouTube API quota consumption) | Internal |

**Video-Specific Characteristics:**
- ✅ **is_shorts** automatically detects Shorts (≤60s vertical videos)
- ✅ **thumbnail_url** returns highest quality available (maxres 1280x720 preferred)
- ✅ **chapters** extracted from description timestamps (30-40% availability)
- ✅ **hooks** ranked by confidence score (0-1 scale) with timestamp + title
- ✅ **intro_pacing** analyzes first 15s dialogue changes and retention indicators
- ✅ **transcript** full text + timestamped segments (70-80% availability)
- ✅ **statistics** complete engagement metrics (views, likes, comments)

**Note**: Heat map fields (`replay_heat`, `replay_max_score`, `replay_peaks`) are unavailable in API-only mode. Trade-off: 74% faster processing (2-3s per video vs 8-10s with browser automation).

---

### 📥 Input Configuration

#### Basic Setup
```json
{
  "startUrls": ["https://www.youtube.com/watch?v=VIDEO_ID"],
  "maxVideos": 50
}
````

#### Advanced Discovery

```json
{
  "startUrls": [
    "https://www.youtube.com/@mrbeast",
    "https://www.youtube.com/results?search_query=viral+marketing"
  ],
  "searchQuery": "AI tutorial",
  "maxVideos": 100,
  "since": "2025-01-01",
  "until": "2025-11-03",
  "minViews": 10000,
  "durationFilter": "4_to_20m",
  "sortBy": "viewCount"
}
```

#### Analysis Options

```json
{
  "maxHooksPerVideo": 15,
  "hookLengthSec": 10,
  "fetchTranscript": true,
  "computeIntroPacing": true,
  "dryRun": false
}
```

***

### 📤 Output Structure

#### Table View - Overview

The Actor provides multiple dataset views in Apify. Here's the **Overview** table view showing key video metrics:

![Output Table - Overview](https://raw.githubusercontent.com/coregentdevspace/youtube-highlights-hooks-analyzer--assets/main/youtube-highlights-hooks-analyzer-output-table-overview.png)

*Quick summary with video ID, URL, thumbnail, title, publish date, duration, shorts flag, chapters count, hooks count, and channel info*

***

#### Table View - All Fields (Complete Data)

The **All Fields** view displays the complete dataset with all 27+ fields:

![Output Table - All Fields](https://raw.githubusercontent.com/coregentdevspace/youtube-highlights-hooks-analyzer--assets/main/youtube-highlights-hooks-analyzer-output-table-allfields.png)

*Comprehensive data export including transcripts, processing metadata, intro pacing metrics, and full hook analysis*

***

#### Real Output Example (Complete)

```json
[{
  "video_id": "KF1Mk1XUxI8",
  "video_url": "https://www.youtube.com/watch?v=KF1Mk1XUxI8",
  "thumbnail_url": "https://i.ytimg.com/vi/KF1Mk1XUxI8/maxresdefault.jpg",
  "title": "Internet Breaks w/ Elon Musk's Announcement",
  "published_at": "2025-11-14T19:47:35Z",
  "duration_sec": 2275,
  "is_shorts": false,
  "statistics": {
    "view_count": 105143,
    "like_count": 4579,
    "comment_count": 464
  },
  "chapters": [
    {
      "start_sec": 0,
      "title": "Advancements in Neuralink and Robotics",
      "source": "description",
      "duration_sec": 260
    },
    {
      "start_sec": 260,
      "title": "The Vision Behind X and Data Utilization",
      "source": "description",
      "duration_sec": 194
    }
    // ... 5 more chapters
  ],
  "hooks": [
    {
      "ts": 659,
      "length_sec": 7,
      "hook_title": "XAI: Competing in the AI Race",
      "transcript_text": "of shares because it didn't seem morally or legally sensible with with XAI we are starting late with XAI you know we're only two and a half years old",
      "confidence": 0.2146,
      "source": "chapter_boundary",
      "rank": 1
    },
    {
      "ts": 260,
      "length_sec": 7,
      "hook_title": "The Vision Behind X and Data Utilization",
      "transcript_text": "of a sudden we have a business that has incredible data...",
      "confidence": 0.2109,
      "source": "chapter_boundary",
      "rank": 2
    }
    // ... 7 more hooks
  ],
  "intro_pacing": {
    "first_15s_dialogue_changes": 7,
    "first_15s_retention_score": null,
    "first_cta_ts": 38.48,
    "first_cta_text": "companies. One is being Neurolink and",
    "hook_detected": false,
    "hook_type": null
  },
  "transcript": {
    "available": true,
    "language": "en",
    "source": "supadata",
    "word_count": 5751,
    "duration_covered_sec": 2276,
    "entries": [
      {
        "start": 0.08,
        "duration": 4.719,
        "text": "I sent you a year or two ago an article"
      },
      {
        "start": 2.48,
        "duration": 4.799,
        "text": "about a young man was an interview in"
      },
      {
        "start": 4.799,
        "duration": 4.88,
        "text": "Baronss and he was 33 at the time and"
      }
      // ... 543+ more transcript entries
    ]
  },
  "analysis_metadata": {
    "processing_notes": ["HEAT_SKIPPED_API_MODE"],
    "processing_time_sec": 3.09,
    "features_analyzed": ["chapters", "transcript", "intro_pacing", "hooks"],
    "features_unavailable": ["replay_heat"],
    "actor_version": "1.0.0",
    "processed_at": "2025-11-16T01:16:20.950Z",
    "mode": "api_only"
  },
  "channel_id": "UCgbyN_o-Guwpyqfuuz3pyIw",
  "channel_title": "Farzad",
  "channel_url": "https://www.youtube.com/channel/UCgbyN_o-Guwpyqfuuz3pyIw"
}]
```

**📥 [View Full JSON Example (All Fields)](https://raw.githubusercontent.com/coregentdevspace/youtube-highlights-hooks-analyzer--assets/main/youtube-highlights-hooks-analyzer-output-json-allfields.json)**
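
To show how the `hooks` array might be consumed, here is a small Python sketch that turns one dataset item into a rank-ordered list of clips with deep links. The function name `hook_clip_links` is hypothetical. The field names (`hooks`, `ts`, `length_sec`, `hook_title`, `confidence`, `rank`, `video_url`) are taken from the real output example above, and the `&t=<seconds>s` suffix is the standard YouTube start-time parameter for watch URLs.

```python
def hook_clip_links(item: dict, min_confidence: float = 0.0) -> list[dict]:
    """Turn an item's hooks into rank-ordered clips with deep links (hypothetical helper)."""
    clips = []
    for hook in sorted(item.get("hooks", []), key=lambda h: h["rank"]):
        if hook["confidence"] < min_confidence:
            continue
        start = int(hook["ts"])
        clips.append({
            "title": hook["hook_title"],
            "start_sec": start,
            "end_sec": start + hook["length_sec"],
            # video_url already contains "?v=...", so the timestamp is appended with "&".
            "url": f"{item['video_url']}&t={start}s",
        })
    return clips
```

Raising `min_confidence` narrows the list to the strongest suggestions; in the example above both hooks score around 0.21, so a threshold has to be chosen relative to the scores actually observed.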

#### Complete Data Fields

- **video\_id**: YouTube video ID
- **video\_url**: Full video URL
- **thumbnail\_url**: Highest quality thumbnail (maxres → high → medium → default)
- **title**: Video title
- **published\_at**: ISO 8601 publish date
- **duration\_sec**: Video duration in seconds
- **is\_shorts**: Boolean flag for Shorts detection
- **statistics**: View count, like count, comment count
- **replay\_heat**: Empty array (not available in API-only mode)
- **replay\_max\_score**: 0 (not available in API-only mode)
- **replay\_peaks**: Empty array (not available in API-only mode)
- **chapters**: Extracted from description timestamps
- **intro\_pacing**: Dialogue changes, words per second, retention score
- **hooks**: Ranked hook suggestions with timestamps
- **transcript**: Full transcript with timestamps and metadata
- **analysis\_metadata**: Processing notes, timing, features analyzed
- **billing\_events**: API usage tracking (YouTube API quota consumption)
- **channel\_id**: Channel ID
- **channel\_title**: Channel name
- **channel\_url**: Channel URL

***

### 🎬 Use Cases

#### 1. Content Repurposing

- **Goal**: Extract 5-10 viral hooks from a long-form video for TikTok/Shorts
- **Input**: Single video URL
- **Output**: Ranked hook suggestions with timestamps and 7s clip titles

#### 2. Competitor Analysis

- **Goal**: Analyze the top 30 videos from a competitor channel
- **Input**: Channel URL + filters (minViews: 10000, last 30 days)
- **Output**: Aggregated intro pacing benchmarks, common hook patterns

#### 3. Trend Research

- **Goal**: Find which hooks work in the "AI tutorial" niche
- **Input**: Search query + duration filter (4-20 min)
- **Output**: Hook patterns, chapter structures, intro pacing data

#### 4. Shorts Optimization

- **Goal**: Identify retention patterns in Shorts
- **Input**: Shorts URLs (batch of 20)
- **Output**: First 15s retention scores, hook types, pacing metrics
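
The "aggregated intro pacing benchmarks" mentioned in the competitor-analysis use case are not produced as a separate output; they would be computed from the per-video items. A minimal sketch of such an aggregation follows. The helper name `intro_pacing_benchmark` is hypothetical, and the field `intro_pacing.first_15s_dialogue_changes` is taken from the real output example above.

```python
from statistics import mean, median


def intro_pacing_benchmark(items: list[dict]) -> dict:
    """Summarize first-15s dialogue changes across dataset items (hypothetical helper)."""
    changes = [
        item["intro_pacing"]["first_15s_dialogue_changes"]
        for item in items
        # Skip items without intro pacing data (for example, no transcript).
        if item.get("intro_pacing")
        and item["intro_pacing"].get("first_15s_dialogue_changes") is not None
    ]
    if not changes:
        return {"videos": 0, "mean_dialogue_changes": None, "median_dialogue_changes": None}
    return {
        "videos": len(changes),
        "mean_dialogue_changes": mean(changes),
        "median_dialogue_changes": median(changes),
    }
```

Comparing the result for a competitor channel against your own channel gives a rough pacing baseline; videos without intro pacing data are left out of the count.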

***

### 📊 Output Views

The Actor provides **4 dataset views** for different use cases:

#### 1. Overview (Default)

Quick summary with video ID, title, thumbnail, statistics, hooks count, processing time

#### 2. Hooks & Highlights

Focused on actionable insights: hook suggestions, chapters, intro pacing metrics

#### 3. Engagement & Stats

Statistical view: view counts, retention, transcript word counts, publish dates

#### 4. All Fields (Complete Data)

Complete data export with full transcripts, processing notes, all metadata

***

### 🎯 Hook Generation Algorithm

The Actor generates hooks using a **multi-strategy scoring system**:

#### Strategy 1: Chapter Boundaries (30% weight)

Extracts hooks at chapter starts (natural content transitions)

#### Strategy 2: Early Moments (25% weight)

Prioritizes moments in the first 30 seconds (higher retention)

#### Strategy 3: Keyword Boost (35% weight)

Scores transcript text for:

- **Action words**: show, look, watch, learn, discover
- **Emotion words**: amazing, shocking, incredible, wow
- **Questions**: what, why, how, when
- **Engagement**: you, your, we, let's
- **Urgency**: now, today, quick, fast
- **Value**: free, best, top, ultimate

#### De-duplication (10% weight)

Removes similar hooks using Jaccard similarity (60% threshold)

**Result**: Top N hooks ranked by confidence score (0-1 scale)
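
The de-duplication step can be illustrated with a short Python sketch. This is not the Actor's code: it assumes that similarity is measured on lowercase word sets of each hook's `transcript_text`, that a hook is dropped when its similarity to an already kept hook reaches the 60% threshold, and that higher-confidence hooks are kept first. The function names are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts: |A ∩ B| / |A ∪ B| over their word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def dedupe_hooks(hooks: list[dict], threshold: float = 0.6) -> list[dict]:
    """Keep the highest-confidence hook from each group of similar hooks."""
    kept = []
    for hook in sorted(hooks, key=lambda h: h["confidence"], reverse=True):
        # Assumption: "similar" means similarity at or above the threshold.
        if all(jaccard(hook["transcript_text"], k["transcript_text"]) < threshold for k in kept):
            kept.append(hook)
    return kept
```

For example, two excerpts that share six of eight distinct words have a similarity of 0.75, so only the one with the higher confidence would survive under these assumptions.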

***

### 🚀 Advanced Features

#### Dry Run Mode

Set `dryRun: true` to test discovery and filtering without deep analysis

- Useful for validating search queries
- Returns basic metadata only (video ID, title, stats, thumbnail)
- 10x faster than full analysis
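
One way to use dry-run mode is a two-stage workflow: discover candidates cheaply, then run the full analysis only on the videos you care about. The sketch below only builds the two input objects. Both helper names are hypothetical, and it assumes that dry-run items contain `video_id`, as stated in the list above.

```python
def dry_run_input(search_query: str, max_videos: int = 100, min_views: int = 0) -> dict:
    """Stage 1: metadata-only discovery input."""
    return {
        "searchQuery": search_query,
        "maxVideos": max_videos,
        "minViews": min_views,
        "dryRun": True,
    }


def full_analysis_input(dry_run_items: list[dict], keep: int = 10, **options) -> dict:
    """Stage 2: full analysis input for the first `keep` videos from stage 1."""
    urls = [
        f"https://www.youtube.com/watch?v={item['video_id']}"
        for item in dry_run_items[:keep]
    ]
    if not urls:
        raise ValueError("no videos were discovered in the dry run")
    # Extra keyword arguments (e.g. maxHooksPerVideo) are passed through unchanged.
    return {"startUrls": urls, "maxVideos": len(urls), "dryRun": False, **options}
```

Each dictionary can be passed as `run_input` to `client.actor("coregent/youtube-highlights-hooks-analyzer").call(...)`, as in the Python example in the API section.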

#### Date Pickers

Use visual date pickers for `since` and `until` parameters:

- Format: YYYY-MM-DD
- Filters videos by publish date
- Works with search queries and channels

#### Shorts Detection

Automatically detects YouTube Shorts (≤60 seconds):

- Sets `is_shorts: true` flag
- Reduces max hooks to 5 (optimized for short content)
- Adjusts hook generation strategy

***

### 🔍 Filtering Options

#### Duration Filters

- `shorts`: Videos ≤60 seconds
- `under_4m`: Videos <4 minutes
- `4_to_20m`: Videos between 4-20 minutes
- `over_20m`: Videos >20 minutes
- `any`: No duration filter (default)
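
The duration buckets can be expressed as simple predicates over `duration_sec`. The following sketch mirrors the list above; the function name is hypothetical, and treating exactly 4 and exactly 20 minutes as part of `4_to_20m` is an assumption, since the boundaries are not spelled out.

```python
def matches_duration_filter(duration_sec: int, duration_filter: str = "any") -> bool:
    """Check whether a video length falls into a documented duration bucket."""
    checks = {
        "any": lambda s: True,
        "shorts": lambda s: s <= 60,
        "under_4m": lambda s: s < 240,
        # Assumption: both boundaries (240 s and 1200 s) are inclusive.
        "4_to_20m": lambda s: 240 <= s <= 1200,
        "over_20m": lambda s: s > 1200,
    }
    if duration_filter not in checks:
        raise ValueError(f"unknown durationFilter: {duration_filter!r}")
    return checks[duration_filter](duration_sec)
```

Note that `shorts` and `under_4m` overlap: a 45-second video satisfies both predicates.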

#### Date Filters

- `since`: Only videos published after this date (YYYY-MM-DD)
- `until`: Only videos published before this date (YYYY-MM-DD)

#### View Filters

- `minViews`: Minimum view count threshold
- `maxViews`: Maximum view count threshold

#### Sort Options

- `relevance`: Best match for search query (default)
- `date`: Newest videos first
- `viewCount`: Most viewed first
- `rating`: Highest rated first

***

### ⚠️ Limitations & Known Issues

#### Heat Map Availability

- **Not available in API-only mode** (requires browser automation)
- Fields `replay_heat`, `replay_max_score`, `replay_peaks` will be empty
- Trade-off: 74% faster performance without heat maps

#### Transcript Availability

- \~70-80% of videos have transcripts
- Age-restricted videos: transcript extraction fails
- Auto-generated captions may have errors
- Fallback chain: Supadata API → youtube-transcript library

#### API Quotas

- YouTube Data API: 10,000 units/day (hardcoded keys with rotation)
- Search costs 100 units, video details cost 1 unit per 50 videos
- The Actor uses round-robin key rotation and monitors quota usage against a 95% threshold
- Automatic fallback to a billing-enabled key when quota is low
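
The quota figures above allow a rough back-of-the-envelope estimate of how many units a run consumes. This sketch assumes that each search request returns up to 50 results (the page-size limit of the YouTube Data API `search.list` endpoint) and that one search request and one details request are made per page; the Actor's actual request pattern is not documented here and may differ.

```python
import math

DAILY_QUOTA_UNITS = 10_000   # per key, from the list above
SEARCH_COST_UNITS = 100      # per search request
DETAILS_COST_UNITS = 1       # per batch of up to 50 videos
PAGE_SIZE = 50               # assumption: results per search page


def estimate_quota_units(num_videos: int, uses_search: bool = True) -> int:
    """Estimate YouTube Data API units for one run (rough, assumption-based)."""
    pages = math.ceil(num_videos / PAGE_SIZE)
    search_units = pages * SEARCH_COST_UNITS if uses_search else 0
    return search_units + pages * DETAILS_COST_UNITS


# Under these assumptions a 50-video search run costs 101 units,
# so a single key covers roughly 99 such runs per day.
runs_per_day = DAILY_QUOTA_UNITS // estimate_quota_units(50)
```

Runs that start from direct video URLs skip the search cost, which is why they are far cheaper in quota terms.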

***

### 📖 Input Parameters Reference

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `startUrls` | Array | `[]` | Video/channel/search URLs |
| `searchQuery` | String | `null` | YouTube search query |
| `maxVideos` | Integer | `50` | Max videos to analyze |
| `since` | String (Date) | `null` | Publish date filter (after) |
| `until` | String (Date) | `null` | Publish date filter (before) |
| `minViews` | Integer | `0` | Minimum view count |
| `maxViews` | Integer | `null` | Maximum view count |
| `durationFilter` | Enum | `any` | Duration filter |
| `sortBy` | Enum | `relevance` | Sort order |
| `maxHooksPerVideo` | Integer | `10` | Max hooks per video |
| `hookLengthSec` | Integer | `7` | Hook clip length (seconds) |
| `fetchTranscript` | Boolean | `true` | Extract transcripts |
| `computeIntroPacing` | Boolean | `true` | Analyze intro (first 15s) |
| `dryRun` | Boolean | `false` | Metadata only (no analysis) |

**Removed Parameters** (now hardcoded):

- ~~`youtubeApiKey`~~ - Pre-configured with rotation
- ~~`supadataApiKey`~~ - Pre-configured
- ~~`respectApiOnly`~~ - Always true (API-only mode)
- ~~`transcriptSource`~~ - Auto-managed (Supadata → youtube-transcript)
- ~~`concurrency`~~ - Optimized internally
- ~~`outputFormat`~~ - JSON only
- ~~`proxyConfiguration`~~ - Not needed (API-only)

***

### 🛠️ Troubleshooting

#### Issue: No heat map data extracted

- **Reason**: Heat maps require browser automation (not available in API-only mode)
- **Solution**: This is expected. The Actor prioritizes speed (2-3s per video) over heat maps.

#### Issue: No transcripts available

- **Reason**: Video has captions disabled or is age-restricted
- **Solution**: Check the `transcript.available` field. Set `fetchTranscript: false` to skip.

#### Issue: No chapters found

- **Reason**: Video description doesn't contain timestamp markers
- **Solution**: This is expected. Only ~30% of videos have description chapters.

#### Issue: Few hooks generated

- **Reason**: No transcript or chapters available
- **Solution**: The Actor needs a transcript or chapters to generate hooks. Check `processing_notes`.
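
The checks described in this section can be automated per dataset item. The sketch below is a hypothetical helper, not part of the Actor; it reads `transcript.available`, `chapters`, and `analysis_metadata.processing_notes`, using the field names from the real output example.

```python
def diagnose_item(item: dict) -> list[str]:
    """Collect likely reasons for missing or sparse analysis results."""
    findings = []
    transcript = item.get("transcript") or {}
    has_transcript = bool(transcript.get("available"))
    has_chapters = bool(item.get("chapters"))
    if not has_transcript:
        findings.append("no transcript (captions disabled or age-restricted video)")
    if not has_chapters:
        findings.append("no chapters (description has no timestamp markers)")
    if not has_transcript and not has_chapters:
        findings.append("hooks need a transcript or chapters")
    # Surface the Actor's own notes, e.g. HEAT_SKIPPED_API_MODE.
    notes = (item.get("analysis_metadata") or {}).get("processing_notes", [])
    findings.extend(f"processing note: {note}" for note in notes)
    return findings
```

For the real output example above, this returns only `processing note: HEAT_SKIPPED_API_MODE`, which matches the expected absence of heat map data.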

***

### 📝 Examples

#### Example 1: Single Video Analysis

```json
{
  "startUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
  "maxHooksPerVideo": 15,
  "hookLengthSec": 10
}
```

#### Example 2: Channel Analysis (Last 30 Days)

```json
{
  "startUrls": ["https://www.youtube.com/@mrbeast"],
  "maxVideos": 30,
  "since": "2025-10-01",
  "minViews": 100000
}
```

#### Example 3: Trend Research

```json
{
  "searchQuery": "AI tutorial 2025",
  "maxVideos": 50,
  "durationFilter": "4_to_20m",
  "sortBy": "viewCount",
  "minViews": 50000
}
```

#### Example 4: Shorts Batch Analysis

```json
{
  "startUrls": [
    "https://www.youtube.com/shorts/VIDEO_ID_1",
    "https://www.youtube.com/shorts/VIDEO_ID_2"
  ],
  "durationFilter": "shorts",
  "maxHooksPerVideo": 5
}
```

#### Example 5: Dry Run (Discovery Only)

```json
{
  "searchQuery": "mr beast",
  "maxVideos": 100,
  "dryRun": true,
  "minViews": 1000000
}
```

***

### 🔧 Local Development

#### Setup

```bash
cd actor
npm install
```

#### Edit Test Configuration

Edit `INPUT.json` with your test parameters:

```json
{
  "startUrls": [],
  "searchQuery": "mr beast",
  "maxVideos": 5,
  "fetchTranscript": true,
  "dryRun": false
}
```

#### Run Locally

```bash
npm start
```

#### Check Output

```bash
cat storage/datasets/default/000000001.json
```

***

### 🚀 Deployment

#### Deploy to Apify

```bash
apify login
apify push
```

**Note**: API keys are hardcoded in the Actor, so no environment variables are needed.

***

### 🤝 Support & Feedback

- **Issues**: Report bugs on GitHub Issues
- **Feature Requests**: Submit via GitHub Discussions
- **Documentation**: Full API docs at [docs.apify.com](https://docs.apify.com)

***

### 📜 License

ISC License - Free to use for commercial and personal projects

***

### 🏗️ Architecture

**Built with**:

- Apify SDK 3.4+ (Actor framework)
- YouTube Data API v3 (Video metadata)
- Supadata API (Transcript extraction)
- youtube-transcript (Fallback transcript source)
- axios (HTTP client)

**Architecture Type**: API-first (no browser automation)

- **Version**: 1.0.0
- **Last Updated**: November 2025

***

### 🎯 Performance Benchmarks

| Scenario | Videos | Time | Speed |
|----------|--------|------|-------|
| Single video | 1 | 2.5s | 2.5s per video |
| Small batch | 10 | 25s | 2.5s per video |
| Medium batch | 50 | 115s (~2 min) | 2.3s per video |
| Large batch | 100 | 230s (~4 min) | 2.3s per video |

**74% faster** than previous Puppeteer-based approach (was 8.8s per video)
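
The 74% figure follows from the two per-video timings quoted here: (8.8 - 2.3) / 8.8 ≈ 0.74. The short sketch below reproduces that arithmetic and extrapolates batch times from the benchmark table. The helper names are hypothetical, and the linear extrapolation is an assumption that ignores start-up overhead and rate limits.

```python
def speedup_percent(old_sec_per_video: float, new_sec_per_video: float) -> int:
    """Percentage reduction in per-video processing time, rounded to a whole percent."""
    return round((old_sec_per_video - new_sec_per_video) / old_sec_per_video * 100)


def estimate_batch_seconds(num_videos: int, sec_per_video: float = 2.3) -> int:
    """Linear estimate of total run time, rounded to whole seconds."""
    return round(num_videos * sec_per_video)


print(speedup_percent(8.8, 2.3))    # reduction from the Puppeteer-based timing
print(estimate_batch_seconds(50))   # medium batch from the table
print(estimate_batch_seconds(100))  # large batch from the table
```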

# Actor input Schema

## `startUrls` (type: `array`):

YouTube video URLs, channel URLs (@handle or /channel/ID), or search result URLs. Mix multiple types for comprehensive analysis.

## `searchQuery` (type: `string`):

Search YouTube for videos by keywords (e.g., 'viral marketing tips', 'coding tutorial'). Leave empty if using startUrls only.

## `maxVideos` (type: `integer`):

Maximum number of videos to analyze across all sources.

## `since` (type: `string`):

Only analyze videos published after this date

## `until` (type: `string`):

Only analyze videos published before this date

## `minViews` (type: `integer`):

Only analyze videos with at least this many views (0 = no minimum)

## `maxViews` (type: `integer`):

Only analyze videos with at most this many views (leave empty for no maximum)

## `durationFilter` (type: `string`):

Filter videos by length. Shorts are 60 seconds or shorter.

## `sortBy` (type: `string`):

Sort search results by relevance, date, views, or rating

## `maxHooksPerVideo` (type: `integer`):

Maximum number of hook suggestions to generate per video (recommended: 5-15)

## `hookLengthSec` (type: `integer`):

Suggested length for hook clips in seconds (3-15s recommended for Shorts/TikTok)

## `fetchTranscript` (type: `boolean`):

Fetch video transcripts/captions for hook generation and intro pacing analysis. Adds ~1-2s per video.

## `computeIntroPacing` (type: `boolean`):

Compute first 15-second metrics (dialogue density, CTA timing, retention benchmarks). Requires transcript extraction.

## `dryRun` (type: `boolean`):

Only fetch basic metadata without deep analysis. Useful for testing discovery/filtering logic.

## Actor input object example

```json
{
  "startUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  ],
  "maxVideos": 50,
  "minViews": 0,
  "durationFilter": "any",
  "sortBy": "relevance",
  "maxHooksPerVideo": 10,
  "hookLengthSec": 7,
  "fetchTranscript": true,
  "computeIntroPacing": true,
  "dryRun": false
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("coregent/youtube-highlights-hooks-analyzer").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"] }

# Run the Actor and wait for it to finish
run = client.actor("coregent/youtube-highlights-hooks-analyzer").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  ]
}' |
apify call coregent/youtube-highlights-hooks-analyzer --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=coregent/youtube-highlights-hooks-analyzer",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Youtube Highlights Hooks Analyzer",
        "description": "Advanced YouTube analytics that extracts chapters, intro pacing, and hook suggestions for editors and creators. Analyze Shorts and long videos to find viral moments, engagement patterns, and optimal clip timestamps with an API-first design for blazing-fast performance.",
        "version": "1.0",
        "x-build-id": "ZoEPmzQdxporaf8Th"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/coregent~youtube-highlights-hooks-analyzer/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-coregent-youtube-highlights-hooks-analyzer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/coregent~youtube-highlights-hooks-analyzer/runs": {
            "post": {
                "operationId": "runs-sync-coregent-youtube-highlights-hooks-analyzer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/coregent~youtube-highlights-hooks-analyzer/run-sync": {
            "post": {
                "operationId": "run-sync-coregent-youtube-highlights-hooks-analyzer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from the Key-value store in the response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Video/Channel/Search URLs",
                        "type": "array",
                        "description": "YouTube video URLs, channel URLs (@handle or /channel/ID), or search result URLs. Mix multiple types for comprehensive analysis.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search YouTube for videos by keywords (e.g., 'viral marketing tips', 'coding tutorial'). Leave empty if using startUrls only."
                    },
                    "maxVideos": {
                        "title": "Max Videos to Analyze",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Maximum number of videos to analyze across all sources.",
                        "default": 50
                    },
                    "since": {
                        "title": "Published After",
                        "type": "string",
                        "description": "Only analyze videos published after this date"
                    },
                    "until": {
                        "title": "Published Before",
                        "type": "string",
                        "description": "Only analyze videos published before this date"
                    },
                    "minViews": {
                        "title": "Minimum Views",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Only analyze videos with at least this many views (0 = no minimum)",
                        "default": 0
                    },
                    "maxViews": {
                        "title": "Maximum Views (optional)",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Only analyze videos with at most this many views (leave empty for no maximum)"
                    },
                    "durationFilter": {
                        "title": "Video Duration",
                        "enum": [
                            "any",
                            "shorts",
                            "under_4m",
                            "4_to_20m",
                            "over_20m"
                        ],
                        "type": "string",
                        "description": "Filter videos by length. Shorts are under 60 seconds.",
                        "default": "any"
                    },
                    "sortBy": {
                        "title": "Sort Results By",
                        "enum": [
                            "relevance",
                            "date",
                            "viewCount",
                            "rating"
                        ],
                        "type": "string",
                        "description": "Sort search results by relevance, date, views, or rating",
                        "default": "relevance"
                    },
                    "maxHooksPerVideo": {
                        "title": "Max Hooks Per Video",
                        "minimum": 1,
                        "maximum": 50,
                        "type": "integer",
                        "description": "Maximum number of hook suggestions to generate per video (recommended: 5-15)",
                        "default": 10
                    },
                    "hookLengthSec": {
                        "title": "Hook Clip Length (seconds)",
                        "minimum": 3,
                        "maximum": 30,
                        "type": "integer",
                        "description": "Suggested length for hook clips in seconds (3-15s recommended for Shorts/TikTok)",
                        "default": 7
                    },
                    "fetchTranscript": {
                        "title": "Extract Transcripts",
                        "type": "boolean",
                        "description": "Fetch video transcripts/captions for hook generation and intro pacing analysis. Adds ~1-2s per video.",
                        "default": true
                    },
                    "computeIntroPacing": {
                        "title": "Analyze Intro Pacing",
                        "type": "boolean",
                        "description": "Compute metrics for the first 15 seconds (dialogue density, CTA timing, retention benchmarks). Requires transcript extraction.",
                        "default": true
                    },
                    "dryRun": {
                        "title": "Dry Run (Metadata Only)",
                        "type": "boolean",
                        "description": "Only fetch basic metadata without deep analysis. Useful for testing discovery/filtering logic.",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
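
The specification can also be used directly, without a client library. The following is a sketch under stated assumptions: it checks an input object against the numeric bounds and enums listed in `inputSchema`, builds the URL for one of the three POST paths, and reads `data.defaultDatasetId` from a `/runs` response. All function names are hypothetical, and `run_sync_get_items` assumes the endpoint returns the dataset items as a JSON body.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "coregent~youtube-highlights-hooks-analyzer"

# Bounds and enums copied from inputSchema above (None means no upper limit).
INT_BOUNDS = {
    "maxVideos": (1, 5000),
    "minViews": (0, None),
    "maxViews": (0, None),
    "maxHooksPerVideo": (1, 50),
    "hookLengthSec": (3, 30),
}
ENUMS = {
    "durationFilter": {"any", "shorts", "under_4m", "4_to_20m", "over_20m"},
    "sortBy": {"relevance", "date", "viewCount", "rating"},
}


def validate_input(actor_input):
    """Return a list of problems; an empty list means the input passes these checks."""
    problems = []
    for name, (low, high) in INT_BOUNDS.items():
        if name not in actor_input:
            continue
        value = actor_input[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an integer")
        elif value < low or (high is not None and value > high):
            problems.append(f"{name}={value} is outside the allowed range")
    for name, allowed in ENUMS.items():
        if name in actor_input and actor_input[name] not in allowed:
            problems.append(f"{name}={actor_input[name]!r} is not one of {sorted(allowed)}")
    return problems


def endpoint_url(operation, token):
    """Build the URL for one of the three POST paths listed in the spec."""
    if operation not in {"run-sync-get-dataset-items", "runs", "run-sync"}:
        raise ValueError(f"unknown operation: {operation}")
    return f"{API_BASE}/acts/{ACTOR_PATH}/{operation}?{urlencode({'token': token})}"


def dataset_id_from_run(response_body):
    """Extract data.defaultDatasetId from a /runs response shaped like runsResponseSchema."""
    return json.loads(response_body)["data"]["defaultDatasetId"]


def run_sync_get_items(actor_input, token):
    """Validate, POST the input, and return the parsed response (performs a network call)."""
    problems = validate_input(actor_input)
    if problems:
        raise ValueError("; ".join(problems))
    request = urllib.request.Request(
        endpoint_url("run-sync-get-dataset-items", token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Validating locally before sending catches out-of-range values such as `hookLengthSec: 31` without starting a run. The checks cover only the constraints stated in the schema; the Actor may apply further validation of its own.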
