# Udacity Course Scraper (`shahidirfan/udacity-course-scraper`) Actor

Extract structured course data from Udacity's catalog: titles, descriptions, ratings, difficulty levels, durations, categories, and free/paid status. Ideal for market research and content aggregation.

- **URL**: https://apify.com/shahidirfan/udacity-course-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** Developer tools, Automation, Other
- **Stats:** 7 total users, 4 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Udacity Course Scraper

Extract comprehensive course data from Udacity's catalog at scale. Collect course information including titles, descriptions, ratings, difficulty levels, durations, and categories for education research, market analysis, and competitive intelligence.

### Features

- **Complete Course Data** — Extract all available course information including metadata, ratings, and descriptions
- **Smart Filtering** — Filter courses by keyword search and difficulty level (Beginner, Intermediate, Advanced)
- **Fast & Efficient** — Retrieve hundreds of courses in seconds using optimized data collection
- **Structured Output** — Get clean, normalized JSON data ready for analysis
- **Flexible Configuration** — Control the number of results and customize search parameters
- **Proxy Support** — Built-in proxy configuration for reliable, uninterrupted data collection

### Use Cases

#### Educational Research
Analyze trends in online education offerings, course difficulty distributions, and subject matter coverage. Build comprehensive datasets for academic research on e-learning platforms and skill development programs.

#### Competitive Analysis
Track Udacity's course catalog to understand market positioning, identify gaps in your own offerings, and monitor pricing strategies. Stay informed about new course launches and program updates.

#### Career Planning
Research available courses to plan learning paths, compare programs, and identify the most highly-rated courses in specific skill areas. Make data-driven decisions about professional development investments.

#### Market Intelligence
Monitor trends in tech education, identify emerging skills and technologies, and understand demand patterns. Build dashboards to track the evolution of online learning in specific domains.

#### Content Aggregation
Create comprehensive course databases, build course comparison platforms, or develop educational content recommendation systems using structured Udacity course data.

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `keyword` | String | No | `""` | Search courses by keyword (e.g., "python", "machine learning") |
| `difficulty` | String | No | `""` | Filter by difficulty level: `Beginner`, `Intermediate`, `Advanced`, or empty for all |
| `results_wanted` | Integer | No | `20` | Maximum number of courses to collect (1-647) |
| `max_requests` | Integer | No | `10` | Safety limit for API requests (each request fetches up to 100 courses) |
| `proxyConfiguration` | Object | No | `{"useApifyProxy": false}` | Proxy settings for reliable data collection |

---

### Output Data

Each course in the dataset contains:

| Field | Type | Description |
|-------|------|-------------|
| `title` | String | Course title |
| `url` | String | Direct link to the course page |
| `slug` | String | URL-friendly course identifier |
| `summary` | String | Brief course description |
| `description` | String | Full course description |
| `difficulty` | String | Difficulty level (Beginner/Intermediate/Advanced) |
| `duration` | Number | Course duration in seconds |
| `durationFormatted` | String | Human-readable duration (e.g., "7h 56m") |
| `rating` | Number | Average user rating (0-5 scale) |
| `isFree` | Boolean | Whether the course is free to access |
| `imageUrl` | String | Course thumbnail image URL |
| `school` | String | Course category or school |
| `programType` | String | Type of program (COURSE, NANODEGREE, etc.) |
| `categoryKeys` | Array | Course category identifiers |
| `skillNames` | Array | Skills covered in the course |

---

### Usage Examples

#### Basic Course Collection

Extract the first 20 courses from Udacity's catalog:

```json
{
  "results_wanted": 20
}
````

#### Search for Python Courses

Find all Python-related courses:

```json
{
  "keyword": "python",
  "results_wanted": 50
}
```

#### Beginner-Friendly Courses

Collect beginner-level courses only:

```json
{
  "difficulty": "Beginner",
  "results_wanted": 100
}
```

#### Advanced Data Science Courses

Search for advanced data science courses:

```json
{
  "keyword": "data science",
  "difficulty": "Advanced",
  "results_wanted": 30
}
```

#### Large-Scale Collection

Collect all available courses (647 total):

```json
{
  "results_wanted": 647,
  "max_requests": 10,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

***

### Sample Output

```json
{
  "title": "Introduction to Python Programming",
  "url": "https://www.udacity.com/course/introduction-to-python--ud1110",
  "slug": "introduction-to-python--ud1110",
  "summary": "Learn Python programming from scratch with Udacity's beginner-friendly course...",
  "description": "Learn Python programming from scratch with Udacity's beginner-friendly course...",
  "difficulty": "Beginner",
  "duration": 28560,
  "durationFormatted": "7h 56m",
  "rating": 4.7,
  "isFree": true,
  "imageUrl": "https://video.udacity-data.com/topher/2024/October/6709883a_ud1110/ud1110.jpg",
  "school": "Programming & Development",
  "programType": "COURSE",
  "categoryKeys": ["programming", "python"],
  "skillNames": ["Python", "Programming Fundamentals"]
}
```
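
The relationship between `duration` (seconds) and `durationFormatted` in the sample can be reproduced with a few lines of Python. This is a minimal sketch that matches the sample values above; the Actor's own formatting rule is not documented here, so the helper name `format_duration` and the choice to drop leftover seconds are assumptions.

```python
def format_duration(seconds: int) -> str:
    """Render a duration in seconds as hours and minutes."""
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    hours, remainder = divmod(seconds, 3600)
    minutes = remainder // 60  # assumption: leftover seconds are dropped, not rounded
    return f"{hours}h {minutes}m"


# The sample record has duration 28560 and durationFormatted "7h 56m".
print(format_duration(28560))  # 7h 56m
```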

***

### Tips for Best Results

#### Choose the Right Parameters

- Start with smaller result sets (20-50) for testing and exploration
- Use specific keywords for targeted searches (e.g., "machine learning" vs "AI")
- Combine keyword search with difficulty filters for precise results
- Raise `max_requests` only if a run stops before reaching `results_wanted`; at up to 100 courses per request, the default of 10 allows up to 1,000 courses

#### Optimize Your Searches

- Use a single keyword for broader results (e.g., "python" instead of "python programming")
- Test different difficulty levels to understand course distribution
- Leave keyword empty to collect all courses in a difficulty category
- Monitor your run statistics to understand collection efficiency

#### Handle Large Datasets

- For collecting all 647 courses, set `results_wanted` to 647 and `max_requests` to 10
- Use Apify's scheduling feature for regular catalog updates
- Enable residential proxies for reliable large-scale collection
- Export data incrementally if processing very large result sets
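
The arithmetic behind these settings is simple enough to check in code. The sketch below assumes every request returns a full page of 100 courses, which is the upper bound stated for `max_requests`; real pages may be smaller, so treat the result as a lower bound. The helper name `min_requests` is hypothetical.

```python
import math

COURSES_PER_REQUEST = 100  # upper bound per request, per the input parameter description


def min_requests(results_wanted: int, per_request: int = COURSES_PER_REQUEST) -> int:
    """Smallest number of requests that could deliver `results_wanted` courses."""
    if results_wanted < 1:
        raise ValueError("results_wanted must be at least 1")
    return math.ceil(results_wanted / per_request)


print(min_requests(647))  # 7
print(min_requests(20))   # 1
```

Under that assumption, the full catalog of 647 courses needs at least 7 requests, so the default `max_requests` of 10 leaves some headroom.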

***

### Integrations

Connect your Udacity course data with:

- **Google Sheets** — Export for analysis and visualization
- **Airtable** — Build searchable course databases
- **Make** — Create automated workflows and notifications
- **Zapier** — Trigger actions based on new courses
- **Slack** — Get notifications about catalog updates
- **Webhooks** — Send data to custom endpoints
- **Power BI** — Build comprehensive dashboards

#### Export Formats

Download your data in multiple formats:

- **JSON** — For developers and API integrations
- **CSV** — For spreadsheet analysis in Excel or Google Sheets
- **Excel** — For business reporting and pivot tables
- **XML** — For system integrations and data pipelines
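
If the items are already in memory (for example, after fetching them with the Python client), a local CSV can also be produced with the standard library. This is only a sketch: the column selection in `FIELDS`, the helper name `courses_to_csv`, and joining array values with `"; "` are illustrative choices, not behavior of the Actor or the platform export.

```python
import csv
import io

# Illustrative subset of the output fields documented above.
FIELDS = ["title", "url", "difficulty", "duration", "rating", "isFree", "skillNames"]


def courses_to_csv(items: list[dict]) -> str:
    """Serialize course items to CSV text, one row per course."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {name: item.get(name) for name in FIELDS}
        # Arrays do not fit in a single CSV cell, so join them into one string.
        row["skillNames"] = "; ".join(item.get("skillNames") or [])
        writer.writerow(row)  # missing or null values become empty cells
    return buffer.getvalue()
```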

***

### Frequently Asked Questions

#### How many courses can I collect?

You can collect all available courses on Udacity (currently 647 courses). The Actor automatically handles pagination and data collection across multiple requests.

#### Can I filter courses by price?

Not at input time, and only by free versus paid. The output includes an `isFree` field that indicates whether each course is free, so you can filter the results on this field after collection.
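
Filtering on `isFree` after collection might look like the following sketch. It works on a plain list of item dictionaries; the helper name `split_by_price` is hypothetical, and items whose `isFree` value is missing or `null` are kept in a separate bucket rather than guessed at.

```python
def split_by_price(items: list[dict]) -> dict[str, list[dict]]:
    """Group course items into free, paid, and unknown based on `isFree`."""
    groups: dict[str, list[dict]] = {"free": [], "paid": [], "unknown": []}
    for item in items:
        flag = item.get("isFree")
        if flag is True:
            groups["free"].append(item)
        elif flag is False:
            groups["paid"].append(item)
        else:
            groups["unknown"].append(item)  # field missing or null
    return groups
```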

#### How often is the course data updated?

The Actor fetches data directly from Udacity's catalog at run time, so each run reflects the catalog as it stands at that moment. Udacity updates its catalog periodically with new courses and program changes.

#### What if I need more than 647 courses?

The current Udacity catalog contains 647 courses. If you need historical data or archived courses, you'll need to schedule regular runs and maintain your own historical database.
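
One way to build that history is to compare the items from two scheduled runs. The sketch below assumes `slug` is a stable, unique identifier for a course, which this document does not guarantee; the helper name `diff_snapshots` is hypothetical.

```python
def diff_snapshots(previous: list[dict], current: list[dict]) -> dict[str, list[str]]:
    """Return slugs added to and removed from the catalog between two runs."""
    # Assumption: `slug` uniquely identifies a course across runs.
    before = {item["slug"] for item in previous if item.get("slug")}
    after = {item["slug"] for item in current if item.get("slug")}
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }
```

Storing each run's items alongside the diff keeps the archived course details available after a course leaves the live catalog.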

#### Does the scraper handle pagination automatically?

Yes, the Actor automatically handles pagination. Simply specify the number of courses you want in `results_wanted`, and the Actor will make the necessary requests to collect that data.

#### Can I search for specific topics or skills?

Yes, use the `keyword` parameter to search for specific topics (e.g., "blockchain", "React", "data science"). The search works across course titles and descriptions.

#### What's the recommended proxy configuration?

For reliable results, especially when collecting large datasets, use Apify Proxy with residential IPs. The input schema's default sets `useApifyProxy` to `false`, so enable the proxy explicitly in `proxyConfiguration`.

#### How long does it take to collect courses?

Collection speed depends on the number of courses requested. Typically:

- 20 courses: ~3 seconds
- 100 courses: ~5 seconds
- 647 courses (all): ~10-15 seconds

#### Can I get course syllabi or detailed content?

This Actor collects catalog-level data including summaries and metadata. For detailed syllabi and lesson content, you would need to visit individual course pages.

#### What happens if a course is unavailable?

The Actor gracefully handles missing data. If a field is not available for a particular course, it will be set to `null` or an empty value depending on the field type.
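
Because a missing field may arrive as `null` or as an empty value, downstream code is easier to write if each item is normalized first. This is a sketch under assumptions: the defaults in `DEFAULTS` are illustrative choices for a few of the documented fields, and `normalize_course` is a hypothetical helper, not part of the Actor.

```python
# Illustrative defaults for some documented output fields.
DEFAULTS = {
    "title": "",
    "summary": "",
    "description": "",
    "categoryKeys": [],
    "skillNames": [],
}


def normalize_course(item: dict) -> dict:
    """Return a copy of `item` with null or absent fields replaced by defaults."""
    normalized = dict(item)
    for name, default in DEFAULTS.items():
        if normalized.get(name) is None:
            # Copy list defaults so items never share one mutable list.
            normalized[name] = list(default) if isinstance(default, list) else default
    return normalized
```

Numeric fields such as `rating` and `duration` are left untouched here, since replacing a missing rating with `0` would be indistinguishable from a real value.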

***

### Support

For issues, feature requests, or questions, contact support through the Apify Console.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [API Reference](https://docs.apify.com/api/v2)
- [Scheduling Runs](https://docs.apify.com/schedules)
- [Proxy Configuration](https://docs.apify.com/proxy)
- [Dataset Storage](https://docs.apify.com/storage/dataset)

***

### Legal Notice

This Actor is designed for legitimate data collection purposes including research, analysis, and educational use. Users are responsible for ensuring compliance with Udacity's Terms of Service and applicable laws. Use data responsibly and respect rate limits. This tool is intended for publicly available catalog information only.

# Actor input Schema

## `keyword` (type: `string`):

Search courses by keyword (e.g., 'python', 'data science', 'machine learning'). Searches in course title and description.

## `difficulty` (type: `string`):

Filter courses by difficulty level.

## `results_wanted` (type: `integer`):

Maximum number of courses to collect. Leave at 20 for testing.

## `max_requests` (type: `integer`):

Safety limit for API requests to prevent overuse. Each request can fetch up to 100 courses.

## `proxyConfiguration` (type: `object`):

Use Apify Proxy for reliable scraping and to avoid rate limits.

## Actor input object example

```json
{
  "results_wanted": 20,
  "max_requests": 10,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```
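
Before starting a run, it can help to check an input object locally against the constraints this document lists: the `difficulty` values `Beginner`, `Intermediate`, `Advanced`, or empty, and a minimum of 1 for `results_wanted` and `max_requests`. The sketch below is a hypothetical helper (`build_input` is not part of any Apify library), and the platform performs its own validation regardless.

```python
ALLOWED_DIFFICULTIES = ("", "Beginner", "Intermediate", "Advanced")


def build_input(keyword: str = "", difficulty: str = "",
                results_wanted: int = 20, max_requests: int = 10) -> dict:
    """Return an Actor input dict, rejecting values outside the documented limits."""
    if difficulty not in ALLOWED_DIFFICULTIES:
        raise ValueError(f"difficulty must be one of {ALLOWED_DIFFICULTIES!r}")
    for name, value in (("results_wanted", results_wanted),
                        ("max_requests", max_requests)):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value < 1:
            raise ValueError(f"{name} must be an integer >= 1")
    return {
        "keyword": keyword,
        "difficulty": difficulty,
        "results_wanted": results_wanted,
        "max_requests": max_requests,
    }
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.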

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keyword": "",
    "results_wanted": 20,
    "max_requests": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/udacity-course-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keyword": "",
    "results_wanted": 20,
    "max_requests": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/udacity-course-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keyword": "",
  "results_wanted": 20,
  "max_requests": 10
}' |
apify call shahidirfan/udacity-course-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/udacity-course-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Udacity Course Scraper",
        "description": "Unlock the ultimate database of online learning! This powerful tool efficiently extracts detailed course information, syllabi, reviews, and pricing from Udacity. Ideal for market research and content aggregation. Elevate your e-learning strategy with precise, structured data at your fingertips.",
        "version": "1.0",
        "x-build-id": "i0VHRAivPD4sbVYbq"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~udacity-course-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-udacity-course-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~udacity-course-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-udacity-course-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~udacity-course-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-udacity-course-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "keyword": {
                        "title": "Search Keyword",
                        "type": "string",
                        "description": "Search courses by keyword (e.g., 'python', 'data science', 'machine learning'). Searches in course title and description."
                    },
                    "difficulty": {
                        "title": "Difficulty Level",
                        "enum": [
                            "",
                            "Beginner",
                            "Intermediate",
                            "Advanced"
                        ],
                        "type": "string",
                        "description": "Filter courses by difficulty level."
                    },
                    "results_wanted": {
                        "title": "Maximum Courses to Scrape",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of courses to collect. Leave at 20 for testing.",
                        "default": 20
                    },
                    "max_requests": {
                        "title": "Maximum API Requests",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety limit for API requests to prevent overuse. Each request can fetch up to 100 courses.",
                        "default": 10
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Use Apify Proxy for reliable scraping and to avoid rate limits.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
