# Advanced Amazon Product Scraper (`scrapeai/advanced-amazon-product-scraper`) Actor

The scraper collects detailed product information including product title, price, rating, number of reviews, product URL, image URL, brand, availability status, and other key details from the product page, and exports the data in structured JSON format.

- **URL:** https://apify.com/scrapeai/advanced-amazon-product-scraper.md
- **Developed by:** [ScrapeAI](https://apify.com/scrapeai) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 4 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating:** 5.00 out of 5 stars

## Pricing

$4.99/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Amazon Product Scraper
This alternate scraper is a lightweight fork of the main Apify Actor. It includes simplified configuration and some experimental features, and is intended for quick deployments or for testing variations of the original scraping logic.

### Features

- **Multiple Scraping Modes**:
  - Search products by keyword
  - Scrape specific product URLs directly
  - Use custom search URLs with filters
  - Scrape bestseller/category pages
- **Deep Product Scraping**: Get detailed product information including specifications, features, variants, and availability
- **Product Reviews**: Extract customer reviews with ratings, dates, and helpful votes
- **Parallel Tab Processing**: Scrapes multiple products simultaneously using browser tabs for faster execution
- **Anti-Detection**: Stealth mode with proxy support and user agent rotation
- **Pagination Support**: Automatically navigates through multiple pages

### Input

This variant of the Actor uses a more concise set of parameters, optimized for quick tests.

| Field            | Type    | Description                                                                 |
|------------------|---------|-----------------------------------------------------------------------------|
| mode             | string  | One of `search`, `productUrl` or `categoryUrl`.                            |
| keywords         | string  | Search keywords (used when mode is `search`).                              |
| urls             | array   | List of full product or category URLs (mode `productUrl` or `categoryUrl`).|
| maxItems         | integer | Cap the number of results (default 25).                                    |
| scrapeDepth      | integer | How many follow‑ups to visit per product (default 1).                       |
| useProxies       | boolean | Toggle proxy rotation (default false).                                     |
| headless         | boolean | Run without a visible browser (default true).                              |

### Output

The Actor returns a flat JSON object per item with only the most commonly used attributes, making downstream processing easier.

#### Fields included

- `asin` – Amazon ASIN identifier
- `title` – Product title
- `price` – Current price (numeric)
- `currency` – Currency symbol (e.g. ₹)
- `rating` – Average star rating
- `reviewsCount` – Total number of reviews
- `url` – Canonical product URL
- `thumbnail` – Small image URL
- `inStock` – Boolean availability flag

> For the alternate scraper we **do not** output deep specifications, features, or review bodies; those were removed to keep the output lightweight.

### Examples

Below are sample inputs using the **concise parameters** described above.

#### Search Mode
```json
{
  "mode": "search",
  "keywords": "wireless earbuds",
  "maxItems": 25,
  "scrapeDepth": 2,
  "useProxies": false
}
````

#### Product URL Mode

```json
{
  "mode": "productUrl",
  "urls": [
    "https://www.amazon.in/dp/B0D3DH8TSC",
    "https://www.amazon.in/dp/B09BFV96TS"
  ],
  "maxItems": 10
}
```

#### Category/Bestsellers Mode

```json
{
  "mode": "categoryUrl",
  "urls": [
    "https://www.amazon.in/gp/bestsellers/electronics/1389401031"
  ],
  "maxItems": 30,
  "useProxies": true
}
```

### Example Output JSON

```json
{
  "title": "Apple iPhone 15 (128 GB) - Black",
  "url": "https://www.amazon.in/Apple-iPhone-15-128-GB/dp/B0CHX1W1XY",
  "asin": "B0CHX1W1XY",
  "price": {
    "value": 38999,
    "currency": "₹"
  },
  "listPrice": {
    "value": 79900,
    "currency": "₹"
  },
  "inStock": true,
  "inStockText": "In stock",
  "brand": "Apple",
  "stars": 4.5,
  "reviewsCount": 10307,
  "breadCrumbs": "Electronics > Mobiles & Accessories > Smartphones",
  "thumbnailImage": "https://m.media-amazon.com/images/I/71657TiFeHL._SX679_.jpg",
  "highResolutionImages": [
    "https://m.media-amazon.com/images/I/71657TiFeHL._SL1500_.jpg"
  ],
  "features": [
    "DYNAMIC ISLAND COMES TO IPHONE 15 — Dynamic Island bubbles up alerts...",
    "INNOVATIVE DESIGN — iPhone 15 features a durable color-infused glass...",
    "48MP MAIN CAMERA WITH 2X TELEPHOTO — The 48MP Main camera shoots..."
  ],
  "attributes": [
    { "key": "OS", "value": "iOS" },
    { "key": "RAM", "value": "6 GB" },
    { "key": "Resolution", "value": "2556x1179" }
  ],
  "reviews": [
    {
      "reviewerName": "John Doe",
      "rating": 5,
      "title": "Amazing phone!",
      "body": "Best iPhone I've ever owned...",
      "date": "Reviewed in India on 15 January 2025",
      "verifiedPurchase": true,
      "helpfulVotes": 42
    }
  ]
}
```
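
The sample above nests `price` and uses `stars` and `thumbnailImage`, while the list under "Fields included" names a flat `price`, `currency`, `rating`, and `thumbnail`. A consumer that wants to tolerate both shapes could normalize items along the following lines. This is a minimal sketch: `flatten_item` is a hypothetical helper and not part of the Actor, and which shape a given run emits is not confirmed here.

```python
def flatten_item(item: dict) -> dict:
    """Hypothetical helper: map either output shape onto the flat field list."""
    price = item.get("price")
    if isinstance(price, dict):
        # Nested shape, as in the sample: {"value": ..., "currency": ...}
        value, currency = price.get("value"), price.get("currency")
    else:
        # Flat shape, as in the field list: numeric price plus a currency field
        value, currency = price, item.get("currency")
    return {
        "asin": item.get("asin"),
        "title": item.get("title"),
        "price": value,
        "currency": currency,
        # Fall back to the alternative key names used in the sample
        "rating": item.get("rating", item.get("stars")),
        "reviewsCount": item.get("reviewsCount"),
        "url": item.get("url"),
        "thumbnail": item.get("thumbnail", item.get("thumbnailImage")),
        "inStock": item.get("inStock"),
    }
```

Missing fields come back as `None`, so downstream code can treat the result as having a fixed set of keys.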

### Why This Scraper?

This fork is designed to be **lightweight, fast, and easy to configure**. It removes unnecessary complexity while still delivering the core data most users need, making it ideal for:

- **Prototyping new data pipelines** without a long setup
- **Running quick ad‑hoc queries** on Amazon product lists
- **Testing new features or selectors** before merging back to the main Actor

### Typical Use Scenarios

- 🔎 **Quick keyword searches** where only basic product info is required
- 📦 **Bulk URL processing** when a list of ASINs is already known
- ⚡ **Low‑cost, low‑latency scraping** with minimal event charges
- 🧪 **Development & QA** for teams iterating on scraping logic

### Want More Control?

If you need something beyond this simple variant, here are options:

1. **Modify the Actor yourself** – it’s intentionally small and well‑documented.
2. **Fork it on Apify** and add custom scraping steps or output fields.
3. **Reach out for paid consulting** if you’d like tailored solutions or dedicated infrastructure.

# Actor input Schema

## `mode` (type: `string`):

Type of input to scrape from: one of `search`, `productUrl`, `searchUrl`, or `categoryUrl`. Defaults to `search`.

## `searchQuery` (type: `string`):

Search term to find products on Amazon India (e.g., 'laptop', 'headphones'). Used when mode is 'search'.

## `productUrls` (type: `array`):

Array of Amazon India product URLs to scrape (e.g., https://www.amazon.in/dp/B0ASIN123). Used when mode is 'productUrl'.

## `searchUrl` (type: `string`):

Full Amazon India search URL (e.g., https://www.amazon.in/s?k=laptop). Used when mode is 'searchUrl'.

## `categoryUrl` (type: `string`):

Amazon India category page URL to scrape products from. Used when mode is 'categoryUrl'.

## `maxItems` (type: `integer`):

Maximum number of products to scrape (1 to 1000, default 5).

## `headless` (type: `boolean`):

Run browser in headless mode (no visible UI). Set to false for debugging.

## `deepProductScraping` (type: `boolean`):

Enable deep scraping to extract detailed product information including product specifications, features, variants, stocks, seller info, and more. If disabled, only basic info from search results is scraped.

## `productReviews` (type: `boolean`):

Extract customer reviews for each product including review text, rating, reviewer info, date, and helpful votes. Requires Deep Product Scraping to be enabled.

## `proxyConfiguration` (type: `object`):

Use Apify Proxy or custom HTTP proxy to avoid IP blocking.

## Actor input object example

```json
{
  "mode": "search",
  "searchQuery": "laptop",
  "maxItems": 5,
  "headless": true,
  "deepProductScraping": true,
  "productReviews": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "IN"
  }
}
```
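
Each mode reads its target from a different field, so it can be convenient to assemble the input in one place and catch mistakes before starting a run. The sketch below is one possible way to do that. `build_input` and `MODE_FIELD` are hypothetical names, not part of the Actor or the Apify client; the checks mirror the schema descriptions above and the 1 to 1000 range for `maxItems` given in the OpenAPI specification, and they run on the client side only.

```python
from typing import Any

# Mode -> the input field that carries its target, per the schema above.
MODE_FIELD = {
    "search": "searchQuery",
    "productUrl": "productUrls",
    "searchUrl": "searchUrl",
    "categoryUrl": "categoryUrl",
}


def build_input(mode: str, target: Any, max_items: int = 5,
                deep: bool = True, reviews: bool = False) -> dict:
    """Hypothetical helper: assemble and sanity-check an Actor input dict."""
    if mode not in MODE_FIELD:
        raise ValueError(f"unknown mode: {mode!r}")
    if not target:
        raise ValueError(f"mode {mode!r} needs a value for {MODE_FIELD[mode]}")
    if mode == "productUrl" and not isinstance(target, list):
        raise ValueError("productUrls must be a list of URLs")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    if reviews and not deep:
        # The schema states that reviews require deep product scraping.
        raise ValueError("productReviews requires deepProductScraping")
    return {
        "mode": mode,
        MODE_FIELD[mode]: target,
        "maxItems": max_items,
        "deepProductScraping": deep,
        "productReviews": reviews,
    }
```

For example, `build_input("search", "laptop")` yields an input with `mode` set to `search` and `searchQuery` set to `laptop`; `proxyConfiguration` is left out so the Actor's default applies.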

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQuery": "laptop",
    "maxItems": 5,
    "headless": true,
    "deepProductScraping": true,
    "productReviews": false
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapeai/advanced-amazon-product-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQuery": "laptop",
    "maxItems": 5,
    "headless": True,
    "deepProductScraping": True,
    "productReviews": False,
}

# Run the Actor and wait for it to finish
run = client.actor("scrapeai/advanced-amazon-product-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
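
After `call` returns, it can help to confirm that the run succeeded before reading the dataset. The sketch below is a hypothetical `summarize_run` helper, not part of `apify-client`. It assumes the run dictionary carries the `status`, `defaultDatasetId`, and `usageTotalUsd` fields listed in the run response schema of this Actor's OpenAPI specification, and that a successfully finished run reports the status `SUCCEEDED`.

```python
def summarize_run(run: dict) -> dict:
    """Hypothetical helper: pick the fields most callers check after a run."""
    dataset_id = run.get("defaultDatasetId")
    return {
        # True only when the platform reports a successful finish
        "ok": run.get("status") == "SUCCEEDED",
        "status": run.get("status"),
        # Same console URL pattern as printed in the example above
        "dataset_url": (
            f"https://console.apify.com/storage/datasets/{dataset_id}"
            if dataset_id else None
        ),
        "cost_usd": run.get("usageTotalUsd"),
    }
```

A caller might check `summarize_run(run)["ok"]` and skip the dataset read when it is false.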

## CLI example

```bash
echo '{
  "searchQuery": "laptop",
  "maxItems": 5,
  "headless": true,
  "deepProductScraping": true,
  "productReviews": false
}' |
apify call scrapeai/advanced-amazon-product-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapeai/advanced-amazon-product-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Advanced Amazon Product Scraper",
        "description": "The scraper collects detailed product information including product title, price, rating, number of reviews, product URL, image URL, brand, availability status, and other key details from the product page, and exports the data in structured JSON format.",
        "version": "1.0",
        "x-build-id": "uaCOFfiOxaeNwLvqb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapeai~advanced-amazon-product-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapeai-advanced-amazon-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapeai~advanced-amazon-product-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scrapeai-advanced-amazon-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapeai~advanced-amazon-product-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scrapeai-advanced-amazon-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "mode": {
                        "title": "Scraping Mode",
                        "enum": [
                            "search",
                            "productUrl",
                            "searchUrl",
                            "categoryUrl"
                        ],
                        "type": "string",
                        "description": "Select the type of input you want to scrape from",
                        "default": "search"
                    },
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search term to find products on Amazon India (e.g., 'laptop', 'headphones'). Used when mode is 'search'.",
                        "default": "laptop"
                    },
                    "productUrls": {
                        "title": "Product URLs",
                        "type": "array",
                        "description": "Array of Amazon India product URLs to scrape (e.g., https://www.amazon.in/dp/B0ASIN123). Used when mode is 'productUrl'.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchUrl": {
                        "title": "Search URL",
                        "type": "string",
                        "description": "Full Amazon India search URL (e.g., https://www.amazon.in/s?k=laptop). Used when mode is 'searchUrl'."
                    },
                    "categoryUrl": {
                        "title": "Category URL",
                        "type": "string",
                        "description": "Amazon India category page URL to scrape products from. Used when mode is 'categoryUrl'."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of products to scrape",
                        "default": 5
                    },
                    "headless": {
                        "title": "Headless Mode",
                        "type": "boolean",
                        "description": "Run browser in headless mode (no visible UI). Set to false for debugging.",
                        "default": true
                    },
                    "deepProductScraping": {
                        "title": "Deep Product Scraping",
                        "type": "boolean",
                        "description": "Enable deep scraping to extract detailed product information including product specifications, features, variants, stocks, seller info, and more. If disabled, only basic info from search results is scraped.",
                        "default": true
                    },
                    "productReviews": {
                        "title": "Product Reviews",
                        "type": "boolean",
                        "description": "Extract customer reviews for each product including review text, rating, reviewer info, date, and helpful votes. Requires Deep Product Scraping to be enabled.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Use Apify Proxy or custom HTTP proxy to avoid IP blocking.",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ],
                            "apifyProxyCountry": "IN"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
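
The `run-sync-get-dataset-items` path in the specification can also be called without a client library. The following is a minimal sketch using only the Python standard library. `build_sync_request` and `run_sync` are hypothetical helper names, and the sketch assumes the endpoint returns the dataset items as a JSON array, which is its default response format. The token is passed as the `token` query parameter, as the specification describes.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "scrapeai~advanced-amazon-product-scraper"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Hypothetical helper: prepare the POST request without sending it."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict) -> list:
    """Send the request and decode the returned dataset items (assumed JSON array)."""
    with urllib.request.urlopen(build_sync_request(token, actor_input)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `run_sync("<YOUR_API_TOKEN>", {"searchQuery": "laptop", "maxItems": 5})` blocks until the run finishes, so for long runs the asynchronous `/runs` path may be a better fit.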
