# Github Email Scraper (`louisdeconinck/github-email-scraper`) Actor

Instantly extract contributor emails and detailed profiles from any public GitHub repository or organization to supercharge your developer outreach and recruiting.

- **URL**: https://apify.com/louisdeconinck/github-email-scraper.md
- **Developed by:** [Louis Deconinck](https://apify.com/louisdeconinck) (community)
- **Categories:** Lead generation, Developer tools, Other
- **Stats:** 94 total users, 19 monthly users, 100.0% runs succeeded, 3 bookmarks
- **User rating**: 1.00 out of 5 stars

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
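
As a rough illustration of the arithmetic, the sketch below estimates a run's cost from the number of results. It assumes the listed starting rate of $1.00 per 1,000 results and that each result counts as one billed event; the function name `estimate_cost_usd` is hypothetical, and your actual rate may be lower depending on your plan.

```python
from decimal import Decimal

# Assumption: the listed starting rate of $1.00 per 1,000 results,
# with every dataset item billed as exactly one event.
PRICE_PER_1000_RESULTS = Decimal("1.00")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Return an estimated cost in USD for the given number of results."""
    if result_count < 0:
        raise ValueError("result_count must not be negative")
    cost = price_per_1000 * result_count / 1000
    # Keep three decimal places so small runs do not round down to zero.
    return cost.quantize(Decimal("0.001"))


if __name__ == "__main__":
    print(estimate_cost_usd(2500))
```

Check the "Pricing" tab for the exact events and prices that apply to your account.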

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

GitHub Email Scraper is an Apify Actor that extracts contributor emails and author information from GitHub repositories.

### 🎯 Why scrape GitHub emails?

Use cases:

- **Developer recruiting** - Find talented developers contributing to open-source projects
- **Open-source outreach** - Contact project maintainers for collaboration or sponsorship
- **Security research** - Identify contributors for responsible disclosure

### ✨ What can GitHub Email Scraper do?

This Actor allows you to:

- **Scrape emails from repositories** - Extract all contributor emails from any public GitHub repository
- **Scrape entire organizations** - Provide a GitHub username or organization to scrape all their repositories at once
- **Aggregate across multiple repos** - Process multiple repositories in one run and see which contributors work on multiple projects
- **Get rich author profiles** - Extract name, email, GitHub login, avatar, and profile URL for each contributor

### 🚀 How to scrape GitHub emails?

1. Make a free Apify account here: [https://console.apify.com/sign-up](https://console.apify.com/sign-up?fpr=7p4wu)
2. Click on "Try for free"
3. Enter GitHub IDs - these can be:
   - **Users/Organizations** (e.g., `apify` or `https://github.com/apify`)
   - **Repositories** (e.g., `apify/crawlee` or `https://github.com/apify/crawlee`)
4. Click "Start" and wait for the Actor to complete
5. Download your data in JSON, CSV or Excel format

### 💡 What data will you receive?

GitHub Email Scraper extracts detailed information about each contributor:

| Field | Description |
|-------|-------------|
| **name** | The contributor's full name from their commits |
| **email** | The contributor's email address from their commits |
| **login** | GitHub username (if linked to GitHub account) |
| **id** | GitHub user ID (if linked to GitHub account) |
| **avatar** | URL to the contributor's GitHub avatar |
| **url** | Link to the contributor's GitHub profile |

#### Output example

```json
{
    "name": "Jaroslav Hejlek",
    "email": "hejlekjaroslav@gmail.com",
    "login": "gippy",
    "id": 3171028,
    "avatar": "https://avatars.githubusercontent.com/u/3171028?v=4",
    "url": "https://github.com/gippy"
}
````
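
If you process the results in Python, a small typed record can make the fields above easier to work with. This is a minimal sketch, not part of the Actor: the `Contributor` class and its `from_item` helper are hypothetical, and it assumes that `login`, `id`, `avatar`, and `url` may be missing or `null` when a commit author is not linked to a GitHub account.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Contributor:
    """One dataset item, using the field names from the table above."""
    name: str
    email: str
    # Assumption: these are absent or null for authors without a linked account.
    login: Optional[str] = None
    id: Optional[int] = None
    avatar: Optional[str] = None
    url: Optional[str] = None

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "Contributor":
        """Build a Contributor from a raw dataset item (a plain dict)."""
        return cls(
            name=item.get("name", ""),
            email=item.get("email", ""),
            login=item.get("login"),
            id=item.get("id"),
            avatar=item.get("avatar"),
            url=item.get("url"),
        )


if __name__ == "__main__":
    sample = {
        "name": "Jaroslav Hejlek",
        "email": "hejlekjaroslav@gmail.com",
        "login": "gippy",
        "id": 3171028,
        "avatar": "https://avatars.githubusercontent.com/u/3171028?v=4",
        "url": "https://github.com/gippy",
    }
    print(Contributor.from_item(sample))
```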

### 📥 Input

The Actor accepts these input parameters:

- `githubIds` (array, required): List of GitHub identifiers to scrape. Can be:
  - **Users/Organizations**: `apify` or `https://github.com/apify`
  - **Repositories**: `apify/crawlee` or `https://github.com/apify/crawlee`

#### Input example

```json
{
    "githubIds": [
        "apify",
        "apify/crawlee",
        "https://github.com/facebook/react"
    ]
}
```
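
Before starting a run, you may want to check your identifiers locally. The sketch below classifies each value as a user/organization or a repository, based only on the two formats described above. The helper `classify_github_id` is hypothetical and is not how the Actor itself parses input; it is just a client-side sanity check.

```python
from urllib.parse import urlparse


def classify_github_id(value: str) -> tuple[str, str]:
    """Return ("owner", "apify") or ("repo", "apify/crawlee") for a GitHub ID."""
    text = value.strip()
    if text.startswith(("http://", "https://")):
        parsed = urlparse(text)
        if parsed.netloc.lower() not in ("github.com", "www.github.com"):
            raise ValueError(f"Not a GitHub URL: {value!r}")
        text = parsed.path
    # Drop empty segments caused by leading or trailing slashes.
    parts = [part for part in text.split("/") if part]
    if len(parts) == 1:
        return ("owner", parts[0])
    if len(parts) == 2:
        return ("repo", f"{parts[0]}/{parts[1]}")
    raise ValueError(f"Unrecognized GitHub identifier: {value!r}")


if __name__ == "__main__":
    for github_id in ["apify", "apify/crawlee", "https://github.com/facebook/react"]:
        print(github_id, "->", classify_github_id(github_id))
```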

### 💰 How much does it cost to scrape GitHub?

This Actor is extremely cost-effective. Check the "Pricing" tab for more details.

With Apify's **free tier**, you get $5 of platform credits every month, which you can use to test this Actor at no cost.

Do you need to scrape more? [Upgrade to a paid plan](https://apify.com/pricing?fpr=7p4wu) which includes more platform credits and discounted pricing.

**Tips**:

- Provide multiple repositories in your input to get more value from each run.
- When scraping organizations with many repositories, consider increasing the RAM to speed up processing.

### 🔗 Integrate with your workflows

This Actor integrates seamlessly with:

- **Automation platforms** - Build no-code workflows with [Make.com](https://www.make.com/en/register?pc=louisdeconinck), n8n, and Zapier
- **Webhooks** - Trigger actions when scraping completes through [webhooks](https://docs.apify.com/platform/integrations/webhooks?fpr=7p4wu)
- **Schedulers** - Run daily/weekly to track new contributors with Apify's [Scheduler](https://docs.apify.com/schedules?fpr=7p4wu)
- **API** - Start runs and access data programmatically with the [Apify API](https://docs.apify.com/api/v2#/reference/actors/run-collection/run-actor?fpr=7p4wu)
- **Google Sheets** - Export directly to spreadsheets

### 👥 Who made this Actor?

Gordian is a specialised Apify web scraping agency founded by Louis Deconinck.

Louis is a top 1% Apify developer, Oxford University IT graduate, and creator of 70+ scrapers used by 1,000+ data professionals every month. He has scraped 10,000,000+ pages bypassing the most advanced anti-scraping protections.

- Apify AI Agent Hackathon Winner
- 300+ contributions in Apify Discord
- Former senior data engineer in EU banking

Looking for a custom data solution? Get in touch.

### ❓ FAQ

#### Is it legal to scrape GitHub?

Yes, scraping publicly available data from GitHub is legal. This scraper only extracts information that is publicly visible through GitHub's API and web interface. Email addresses found in commits are publicly available as part of the git commit history.

For more information on web scraping legality, read this blog post: [Is web scraping legal?](https://blog.apify.com/is-web-scraping-legal?fpr=7p4wu)

#### Can I export data to CSV or Excel?

Yes, Apify supports exporting dataset results in multiple formats: JSON, CSV, Excel (XLSX), HTML, XML and RSS.
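
If you prefer to download an export programmatically, the Apify API serves dataset items in a chosen format through the `format` query parameter. The sketch below only builds the download URL; the helper name `dataset_export_url` is hypothetical, the format set mirrors the list above, and private datasets will additionally need your API token (for example in an `Authorization: Bearer` header).

```python
from urllib.parse import urlencode

# Formats named in the answer above, as values of the `format` query parameter.
SUPPORTED_FORMATS = {"json", "csv", "xlsx", "html", "xml", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "csv") -> str:
    """Build the URL that returns a dataset's items in the requested format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


if __name__ == "__main__":
    # "<DATASET_ID>" is a placeholder for the run's defaultDatasetId.
    print(dataset_export_url("<DATASET_ID>", "xlsx"))
```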

#### What about private repositories?

This Actor only works with public repositories. Private repositories require authentication which is not currently supported.

#### What if contributors use different emails?

The Actor deduplicates contributors by email address. If a contributor uses different emails across commits, they will appear as separate entries in the output. The `login` field can help identify if they're the same GitHub user.
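
To merge such entries after a run, you can group the output by `login`. This is a minimal post-processing sketch, not something the Actor does for you; the function name `emails_by_login` and the sample records are hypothetical, and entries without a `login` are skipped because they cannot be matched to a GitHub account.

```python
from collections import defaultdict
from typing import Any, Iterable


def emails_by_login(items: Iterable[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each GitHub login to the distinct emails seen for it, in input order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for item in items:
        login = item.get("login")
        email = item.get("email")
        if not login or not email:
            continue  # cannot link this entry to a GitHub account
        if email not in grouped[login]:
            grouped[login].append(email)
    return dict(grouped)


if __name__ == "__main__":
    # Made-up records for illustration only.
    records = [
        {"name": "Ada", "email": "ada@example.com", "login": "ada"},
        {"name": "Ada L.", "email": "ada@work.example", "login": "ada"},
        {"name": "Unlinked", "email": "someone@example.org"},
    ]
    print(emails_by_login(records))
```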

#### How do I get started?

[Make a free Apify account](https://console.apify.com/sign-up?fpr=7p4wu) to claim your free $5 usage and start scraping today by clicking "Try for free".

# Actor input Schema

## `githubIds` (type: `array`):

List of GitHub identifiers. Can be users/orgs (e.g., 'apify' or 'https://github.com/apify') or repositories (e.g., 'apify/crawlee' or 'https://github.com/apify/crawlee')

## Actor input object example

```json
{
  "githubIds": [
    "apify"
  ]
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "githubIds": [
        "apify"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("louisdeconinck/github-email-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "githubIds": ["apify"] }

# Run the Actor and wait for it to finish
run = client.actor("louisdeconinck/github-email-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "githubIds": [
    "apify"
  ]
}' |
apify call louisdeconinck/github-email-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=louisdeconinck/github-email-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Github Email Scraper",
        "description": "Instantly extract contributor emails and detailed profiles from any public GitHub repository or organization to supercharge your developer outreach and recruiting.",
        "version": "1.0",
        "x-build-id": "iypm91yQPIU4lWNAU"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/louisdeconinck~github-email-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-louisdeconinck-github-email-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/louisdeconinck~github-email-scraper/runs": {
            "post": {
                "operationId": "runs-sync-louisdeconinck-github-email-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/louisdeconinck~github-email-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-louisdeconinck-github-email-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "githubIds"
                ],
                "properties": {
                    "githubIds": {
                        "title": "GitHub IDs",
                        "type": "array",
                        "description": "List of GitHub identifiers. Can be users/orgs (e.g., 'apify' or 'https://github.com/apify') or repositories (e.g., 'apify/crawlee' or 'https://github.com/apify/crawlee')",
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
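
For languages or environments without an Apify client, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal sketch using only the Python standard library. The function names `build_run_request` and `run_actor` are hypothetical, and it assumes the response body is a JSON array of dataset items, which the specification itself does not spell out.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/louisdeconinck~github-email-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, github_ids: list[str]) -> urllib.request.Request:
    """Prepare the POST request described by the specification, without sending it."""
    url = f"{BASE_URL}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps({"githubIds": github_ids}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, github_ids: list[str]):
    """Send the request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(build_run_request(token, github_ids)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only inspects the prepared request; call run_actor() with a real token to execute.
    request = build_run_request("<YOUR_API_TOKEN>", ["apify"])
    print(request.get_method(), request.full_url)
```

Because this endpoint waits for the run to finish, long runs over large organizations may be better served by the `/runs` endpoint and polling.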
