# Instagram Hashtag Scraper (`datapilot/instagram-hashtag-scraper`) Actor

Instagram Hashtag Scraper
Just provide the hashtag. Post ID, caption, likes, comments, and user details are collected and stored directly in your Apify dataset.
Works with residential proxies for stable and reliable scraping.
Fast, accurate, and simple; optimized for hashtag-based data.

- **URL**: https://apify.com/datapilot/instagram-hashtag-scraper.md
- **Developed by:** [Data Pilot](https://apify.com/datapilot) (community)
- **Categories:** Other
- **Stats:** 14 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$8.00/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Instagram Hashtag Scraper

🚀 **Instagram Hashtag Scraper** is a powerful Apify Actor designed to extract public posts from **Instagram** by hashtag, without using the official Instagram API. It leverages **residential proxies** to avoid IP blocks and delivers rich, structured data – perfect for **hashtag analytics**, influencer discovery, trend monitoring, and social media research.



### 🔥 Features

- **No Official API Required** – scrapes public **Instagram** content directly, serving as a true **Instagram API alternative**.
- **Smart Proxy Integration** – uses **Apify residential proxies** to reduce IP blocks and rate limiting, for more reliable **Instagram data extraction**.
- **Hashtag‑Based Search** – enter one or more hashtags (comma‑separated or as an array) and get sample posts for each.
- **Rich Post Metadata** – extracts **post ID**, **code**, **taken_at** timestamp, **media_type** (image/carousel), **caption**, **user details** (pk, username, full_name, profile_pic_url), **like_count**, **comments_count**, **product_type**, **hashtags** used, and more.
- **Summary Statistics** – generates a summary with total posts, likes, comments, images, carousels, and averages.
- **Apify Dataset Ready** – each post is pushed as a separate dataset item for easy export (JSON, CSV, XML).
- **Async Architecture** – fast, non‑blocking **async Python scraper** built with asyncio.
- **Lightweight & Extensible** – sample data generation can be replaced with real scraping logic using tools like `instaloader`, `playwright`, or custom HTTP requests.

---

### ⚙️ How It Works

1. **Input** – Provide one or more **Instagram hashtags** (e.g., `"travel"`, `"food"`). The Actor accepts comma‑separated strings or an array.
2. **Proxy** – The Actor initialises a **residential proxy** via Apify Proxy (recommended to reduce the chance of Instagram blocking requests).
3. **Scrape** – For each hashtag, the Actor generates sample posts (or you can replace the logic with real scraping). The current implementation demonstrates the data structure and proxy integration.
4. **Output** – Each post's data is pushed to the Apify Dataset – a perfect **Instagram data export** solution. A summary object is also pushed at the end.
5. **Finish** – Logs total scraped posts, likes, comments, and exits.

---

### 📥 Input

The Actor accepts a JSON input with the following fields:

| Field                 | Type            | Default   | Description |
|-----------------------|-----------------|-----------|-------------|
| `hashtags`            | string / array  | required  | One or more **Instagram hashtags** (e.g., `"travel, food"` or `["travel", "food"]`). |
| `useResidentialProxy` | boolean         | `true`    | Enable Apify residential proxy – recommended for **Instagram scraping**. |
| `proxyCountry`        | string          | `"US"`    | Country code for proxy (e.g., `"US"`, `"GB"`). |
| `posts_per_hashtag`   | integer         | `10`      | Number of posts to scrape per hashtag. |
| `upload_to_dataset`   | boolean         | `true`    | Whether to push results to the Apify dataset. |

**Example input:**

```json
{
  "hashtags": "travel, food",
  "posts_per_hashtag": 5,
  "useResidentialProxy": true,
  "proxyCountry": "US"
}
````
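
Since `hashtags` may arrive as a comma-separated string or as an array, with or without a leading `#`, a consumer or a replacement scraper will probably want to normalise it first. The following is a minimal sketch of that step, not the Actor's actual code; the function name `normalize_hashtags` and the choice to drop case-insensitive duplicates are assumptions made for illustration.

```python
from typing import Iterable, Union


def normalize_hashtags(raw: Union[str, Iterable[str]]) -> list[str]:
    """Return clean hashtag names without '#', dropping blanks and repeats.

    Hypothetical helper: the Actor's real parsing logic is not documented.
    """
    # A comma-separated string is split into parts; an array is used as-is.
    parts = raw.split(",") if isinstance(raw, str) else list(raw)
    seen: set[str] = set()
    result: list[str] = []
    for part in parts:
        tag = part.strip().lstrip("#").strip()
        # Skip empty entries (e.g. from a trailing comma) and repeated tags.
        if tag and tag.lower() not in seen:
            seen.add(tag.lower())
            result.append(tag)
    return result


print(normalize_hashtags("travel, #food, travel"))  # ['travel', 'food']
print(normalize_hashtags(["#Travel", "food"]))      # ['Travel', 'food']
```

Both accepted input shapes end up as the same kind of list, so the rest of a pipeline only has to handle one form.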

***

### 📤 Output

Each dataset item corresponds to one Instagram post from a hashtag search:

| Field                 | Type    | Description |
|-----------------------|---------|-------------|
| `id`                  | string  | Unique Instagram post ID (format: `media_id_user_id`). |
| `code`                | string  | Shortcode of the post (used in URLs). |
| `taken_at`            | string  | ISO timestamp of when the post was published. |
| `media_type`          | int     | 1 = image, 8 = carousel (album). |
| `caption`             | string  | Post caption text. |
| `user`                | object  | Nested object containing: `pk` (user ID), `username`, `full_name`, `is_private`, `profile_pic_url`. |
| `like_count`          | int     | Number of likes on the post. |
| `has_liked`           | bool    | Always false (public data). |
| `product_type`        | string  | `"feed"` or `"carousel_container"`. |
| `is_paid_partnership` | bool    | Indicates if the post is a paid partnership. |
| `comments_count`      | int     | Number of comments on the post. |
| `hashtags`            | array   | List of hashtags found in the caption. |

Additionally, a final summary item is pushed with the following fields:

| Field                      | Type    | Description |
|----------------------------|---------|-------------|
| `hashtags_scraped`         | array   | List of hashtags processed. |
| `total_hashtags`           | int     | Number of hashtags. |
| `total_posts`              | int     | Total posts scraped. |
| `total_likes`              | int     | Sum of likes across all posts. |
| `total_comments`           | int     | Sum of comments across all posts. |
| `image_count`              | int     | Number of image posts. |
| `carousel_count`           | int     | Number of carousel posts. |
| `average_likes_per_post`   | int     | Average likes per post. |
| `average_comments_per_post`| int     | Average comments per post. |
| `completed_at`             | string  | ISO timestamp of completion. |

**Example output item (post):**

```json
{
  "id": "1234567890123456789_9876543210",
  "code": "AbCdEfGhIjK",
  "taken_at": "2025-02-14T12:34:56Z",
  "media_type": 1,
  "caption": "Amazing content about #travel! 🔥\n\n#travel #instagram #explore",
  "user": {
    "pk": "9876543210",
    "username": "creator_travel_1",
    "full_name": "travel Creator 1",
    "is_private": false,
    "profile_pic_url": "https://scontent-iad3-2.cdninstagram.com/v/t51.2885-19/default.jpg"
  },
  "like_count": 123456,
  "has_liked": false,
  "product_type": "feed",
  "is_paid_partnership": false,
  "comments_count": 7890,
  "hashtags": ["travel", "instagram", "explore"]
}
```
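
Two of the fields above can be derived from others: `hashtags` comes from the caption, and `code` is the shortcode used in post URLs. The sketch below shows one way to do both. It is an illustration under assumptions: the regular expression, the order-preserving de-duplication, and the helper names `extract_hashtags` and `post_url` are not taken from the Actor's source, and the URL pattern is the commonly seen public post link format.

```python
import re

# One or more word characters following '#'. An assumed pattern for illustration.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(caption: str) -> list[str]:
    """Return hashtags found in a caption, without '#', in first-seen order."""
    seen: set[str] = set()
    tags: list[str] = []
    for tag in HASHTAG_RE.findall(caption):
        if tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags


def post_url(code: str) -> str:
    """Build a public post link from a shortcode (assumed URL pattern)."""
    return f"https://www.instagram.com/p/{code}/"


caption = "Amazing content about #travel! 🔥\n\n#travel #instagram #explore"
print(extract_hashtags(caption))  # ['travel', 'instagram', 'explore']
print(post_url("AbCdEfGhIjK"))    # https://www.instagram.com/p/AbCdEfGhIjK/
```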

**Example output item (summary):**

```json
{
  "hashtags_scraped": ["travel", "food"],
  "total_hashtags": 2,
  "total_posts": 10,
  "total_likes": 1250000,
  "total_comments": 45000,
  "image_count": 7,
  "carousel_count": 3,
  "average_likes_per_post": 125000,
  "average_comments_per_post": 4500,
  "completed_at": "2025-02-14T12:35:00Z"
}
```
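
The summary fields can be recomputed from the post items, which is handy for checking a run or for summarising a filtered subset. This is a sketch under assumptions rather than the Actor's implementation: the function name `build_summary` is hypothetical, and the averages use integer division only because the table types them as `int` (the Actor's actual rounding is not documented).

```python
from datetime import datetime, timezone


def build_summary(hashtags: list[str], posts: list[dict]) -> dict:
    """Aggregate post items into a summary shaped like the example above."""
    total_posts = len(posts)
    total_likes = sum(p.get("like_count", 0) for p in posts)
    total_comments = sum(p.get("comments_count", 0) for p in posts)
    return {
        "hashtags_scraped": hashtags,
        "total_hashtags": len(hashtags),
        "total_posts": total_posts,
        "total_likes": total_likes,
        "total_comments": total_comments,
        # media_type 1 = image, 8 = carousel, per the output table.
        "image_count": sum(1 for p in posts if p.get("media_type") == 1),
        "carousel_count": sum(1 for p in posts if p.get("media_type") == 8),
        # Assumption: averages are truncated to integers; guard against no posts.
        "average_likes_per_post": total_likes // total_posts if total_posts else 0,
        "average_comments_per_post": total_comments // total_posts if total_posts else 0,
        "completed_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

With the example figures above, 1,250,000 likes over 10 posts gives an average of 125,000, which matches `average_likes_per_post`.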

***

### 🧰 Technical Stack

- **Language:** Python 3.11+ (async/await)
- **Core Scraper:** `instaloader`, `playwright`, or custom HTTP requests – flexible integration for Instagram data extraction.
- **Proxy:** Apify Proxy with RESIDENTIAL group – real peer IPs, high anonymity.
- **Platform:** Apify Actor – serverless, scalable, integrated with Dataset and Key‑Value Store.
- **Deployment:** One‑click run on Apify Console or via REST API.
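
For readers replacing the sample logic with real HTTP requests, the sketch below shows how the `useResidentialProxy` and `proxyCountry` inputs might map onto an Apify Proxy connection URL. It assumes the standard Apify Proxy URL scheme (`proxy.apify.com` on port 8000, with the group and country encoded in the username); the helper name `build_proxy_url` is hypothetical, and the Actor itself may build its proxy configuration differently, for example through the Apify SDK.

```python
from typing import Optional


def build_proxy_url(
    password: str, country: str = "US", use_residential: bool = True
) -> Optional[str]:
    """Return an Apify Proxy URL for the RESIDENTIAL group, or None if disabled.

    Hypothetical helper: `password` is your Apify Proxy password.
    """
    if not use_residential:
        return None
    # Proxy parameters are passed as a comma-separated username.
    username = f"groups-RESIDENTIAL,country-{country.upper()}"
    return f"http://{username}:{password}@proxy.apify.com:8000"


print(build_proxy_url("<YOUR_PROXY_PASSWORD>", "GB"))
```

The resulting URL can be handed to any HTTP client that accepts a proxy URL.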

***

### 🎯 Use Cases

- **Hashtag Analytics** – track the popularity and sentiment of specific hashtags on Instagram.
- **Trend Monitoring** – identify emerging topics and viral content by analysing posts under trending hashtags.
- **Influencer Discovery** – find top creators who frequently use certain hashtags.
- **Brand Monitoring** – see how your branded hashtag is being used by the public.
- **Competitor Research** – analyse which hashtags your competitors are targeting.
- **Content Strategy** – understand which hashtags drive the most engagement (likes, comments).
- **Academic Research** – collect datasets of Instagram posts by hashtag for social science studies.
- **Campaign Analysis** – measure the reach and engagement of marketing campaigns using specific hashtags.
- **Niche Exploration** – discover popular accounts and content in specific niches (fitness, fashion, beauty, etc.).
- **Social Listening** – monitor public conversations around your industry or products.
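
As an illustration of the Content Strategy use case, the sketch below averages engagement (likes plus comments) per hashtag across post items. The function name `engagement_by_hashtag` and this definition of engagement are assumptions chosen for the example; the field names come from the output table above.

```python
from collections import defaultdict


def engagement_by_hashtag(posts: list[dict]) -> dict[str, float]:
    """Return the mean of like_count + comments_count for each hashtag."""
    totals: dict[str, int] = defaultdict(int)
    counts: dict[str, int] = defaultdict(int)
    for post in posts:
        engagement = post.get("like_count", 0) + post.get("comments_count", 0)
        # set() ensures a tag repeated within one post is counted once.
        for tag in set(post.get("hashtags", [])):
            totals[tag] += engagement
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


posts = [
    {"like_count": 100, "comments_count": 10, "hashtags": ["travel", "food"]},
    {"like_count": 50, "comments_count": 0, "hashtags": ["travel"]},
]
print(engagement_by_hashtag(posts))
```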

***

### 🚀 Quick Start

1. **Open in Apify Console** – visit the Actor page and click Try for free.
2. **Enter one or more hashtags** in the input field (e.g., `"travel, food"`).
3. **(Optional) Adjust proxy settings** – residential proxies are enabled by default.
4. **Click Start** – the Actor will generate sample posts for each hashtag.
5. **Export** – download the results as JSON, CSV, or Excel.

You can also call this Actor programmatically via the Apify API clients or the REST API, which suits automated pipelines. Note that the current implementation generates sample posts; once you replace the sample logic with real scraping, the same input handling, proxy setup, and dataset output can serve a production Instagram hashtag scraper.

***

### 💎 Why This Actor?

| Feature | Benefit |
|---------|---------|
| ✅ No Instagram API quota | Collect posts by hashtag without official Instagram API access or its quotas – an Instagram API alternative. |
| ✅ Residential proxies | Bypass Instagram bot detection – high success rate with Instagram residential proxy. |
| ✅ Rich post details | Get nested user info, like/comment counts, media type, captions, hashtags – complete Instagram post metrics. |
| ✅ Hashtag‑focused | Specifically designed for hashtag‑based searches – perfect for Instagram trend research. |
| ✅ Summary statistics | Automatically generates insights like total posts, likes, comments, and averages. |
| ✅ Extensible design | Easy to add real scraping logic (e.g., using `instaloader`). |
| ✅ Apify ecosystem | Seamless integration with other Actors, triggers, and webhooks. |

### 📦 Changelog

#### v1.0.0 (February 2025)

- Initial release with residential proxy support.
- Hashtag-based search functionality.
- Extracts comprehensive post metadata (user info, engagement metrics, media type, captions).
- Summary statistics with total posts, likes, comments, and averages.
- Support for single or multiple hashtags.
- Sample data generation for demo purposes.
- Easily extensible for real scraping integration.
- Full Apify Actor integration.

***

### 🧑‍💻 Support & Feedback

- **Issues & Ideas:** Open a ticket on the Apify Actor issue tracker.
- **Contributions:** Pull requests are welcome via the GitHub repository.
- **Documentation:** Visit Apify Docs for platform guides.
- **Community:** Join the Apify community forum for discussions and support.

***

# Actor input Schema

## `hashtags` (type: `string`):

Enter hashtags (separated by comma, with or without #)

## `posts_per_hashtag` (type: `integer`):

Number of posts to scrape per hashtag

## `useResidentialProxy` (type: `boolean`):

Enable Apify Residential Proxy (default: enabled)

## `proxyCountry` (type: `string`):

Select country for proxy routing

## `upload_to_dataset` (type: `boolean`):

Save results to Apify dataset

## Actor input object example

```json
{
  "hashtags": "instagram,photography,love",
  "posts_per_hashtag": 10,
  "useResidentialProxy": true,
  "proxyCountry": "US",
  "upload_to_dataset": true
}
```
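
Before starting a run, it can help to check an input object locally. The sketch below applies the constraints listed in the OpenAPI specification further down this page (`hashtags` required, `posts_per_hashtag` between 1 and 100, `proxyCountry` limited to the listed codes) and fills in the documented defaults. The function name `validate_input` is hypothetical, and it follows the schema in treating `hashtags` as a string.

```python
ALLOWED_COUNTRIES = {"US", "CA", "GB", "DE", "NL", "FR", "AU", "SG"}


def validate_input(actor_input: dict) -> dict:
    """Check an input object against the documented schema and fill defaults."""
    hashtags = actor_input.get("hashtags")
    if not isinstance(hashtags, str) or not hashtags.strip():
        raise ValueError("'hashtags' is required and must be a non-empty string")

    posts = actor_input.get("posts_per_hashtag", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(posts, bool) or not isinstance(posts, int) or not 1 <= posts <= 100:
        raise ValueError("'posts_per_hashtag' must be an integer from 1 to 100")

    country = actor_input.get("proxyCountry", "US")
    if country not in ALLOWED_COUNTRIES:
        raise ValueError(f"'proxyCountry' must be one of {sorted(ALLOWED_COUNTRIES)}")

    return {
        "hashtags": hashtags,
        "posts_per_hashtag": posts,
        "useResidentialProxy": bool(actor_input.get("useResidentialProxy", True)),
        "proxyCountry": country,
        "upload_to_dataset": bool(actor_input.get("upload_to_dataset", True)),
    }


print(validate_input({"hashtags": "travel, food", "posts_per_hashtag": 5}))
```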

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

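// Prepare Actor input (`hashtags` is required by the input schema)
const input = {
    "hashtags": "instagram,photography,love",
    "posts_per_hashtag": 10,
    "useResidentialProxy": true,
    "proxyCountry": "US",
    "upload_to_dataset": true
};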

// Run the Actor and wait for it to finish
const run = await client.actor("datapilot/instagram-hashtag-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

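# Prepare the Actor input (`hashtags` is required by the input schema)
run_input = {
    "hashtags": "instagram,photography,love",
    "posts_per_hashtag": 10,
    "useResidentialProxy": True,
    "proxyCountry": "US",
    "upload_to_dataset": True,
}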

# Run the Actor and wait for it to finish
run = client.actor("datapilot/instagram-hashtag-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
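
Because the final summary object is pushed to the same dataset as the posts, downstream code usually needs to tell the two apart. The sketch below does that by looking for summary-only keys. This is an assumption: the README does not document a dedicated marker field, and `split_items` is a hypothetical helper name.

```python
from typing import Iterable, Optional


def split_items(items: Iterable[dict]) -> tuple[list[dict], Optional[dict]]:
    """Separate post items from the final summary item of a run."""
    posts: list[dict] = []
    summary: Optional[dict] = None
    for item in items:
        # Assumption: only the summary carries both of these keys.
        if "total_posts" in item and "hashtags_scraped" in item:
            summary = item
        else:
            posts.append(item)
    return posts, summary
```

With the `client` and `run` objects from the Python example, this could be called as `posts, summary = split_items(client.dataset(run["defaultDatasetId"]).iterate_items())`.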

## CLI example

```bash
echo '{"hashtags": "instagram,photography,love", "posts_per_hashtag": 10}' |
apify call datapilot/instagram-hashtag-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=datapilot/instagram-hashtag-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Instagram Hashtag Scraper",
        "description": "Instagram Hashtag Scraper\nJust provide the hashtag. Post ID, caption, likes, comments, user details — all data will be collected and stored directly in your Apify dataset.\nWorks with residential proxies for stable and reliable scraping.\nFast, accurate, and simple optimized for hashtag-based data",
        "version": "0.0",
        "x-build-id": "zIxMEGWJqtBAcUe18"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/datapilot~instagram-hashtag-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-datapilot-instagram-hashtag-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/datapilot~instagram-hashtag-scraper/runs": {
            "post": {
                "operationId": "runs-sync-datapilot-instagram-hashtag-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/datapilot~instagram-hashtag-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-datapilot-instagram-hashtag-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "hashtags"
                ],
                "properties": {
                    "hashtags": {
                        "title": "Hashtags",
                        "type": "string",
                        "description": "Enter hashtags (separated by comma, with or without #)",
                        "default": "instagram,photography,love"
                    },
                    "posts_per_hashtag": {
                        "title": "Posts Per Hashtag",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Number of posts to scrape per hashtag",
                        "default": 10
                    },
                    "useResidentialProxy": {
                        "title": "Use Residential Proxy",
                        "type": "boolean",
                        "description": "Enable Apify Residential Proxy (default: enabled)",
                        "default": true
                    },
                    "proxyCountry": {
                        "title": "Proxy Country",
                        "enum": [
                            "US",
                            "CA",
                            "GB",
                            "DE",
                            "NL",
                            "FR",
                            "AU",
                            "SG"
                        ],
                        "type": "string",
                        "description": "Select country for proxy routing",
                        "default": "US"
                    },
                    "upload_to_dataset": {
                        "title": "Save to Dataset",
                        "type": "boolean",
                        "description": "Save results to Apify dataset",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
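
For projects that cannot use a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called directly. The following is a minimal sketch using only the Python standard library. The helper names `build_run_sync_url` and `run_sync` are hypothetical, the 300-second timeout is an arbitrary choice, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "datapilot~instagram-hashtag-scraper"


def build_run_sync_url(token: str) -> str:
    """Build the endpoint URL, passing the API token as a query parameter."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"


def run_sync(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """POST the input, wait for the run to finish, and return the parsed body."""
    request = urllib.request.Request(
        build_run_sync_url(token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a valid token and performs a real, billable run):
# items = run_sync("<YOUR_API_TOKEN>", {"hashtags": "travel, food"})
```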
