# AmazonPageScraper (`devlory/amazonpagescraper`) Actor

The best Amazon page scraper!
Use this scraper to get all the products on an Amazon page!

- **URL**: https://apify.com/devlory/amazonpagescraper.md
- **Developed by:** [Lorenzo Cerqua](https://apify.com/devlory) (community)
- **Categories:** E-commerce, SEO tools, Developer tools
- **Stats:** 52 total users, 1 monthly users, 59.5% runs succeeded, 2 bookmarks
- **User rating**: No ratings yet

## Pricing

from $25.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
"Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation, available as a [Markdown index](https://docs.apify.com/llms.txt) and as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## Amazon Pages Search Scraper

### What does Amazon Pages Search Scraper do?

**Amazon Pages Search Scraper** extracts structured data from Amazon search result pages across multiple marketplaces, keywords, pages, sort modes, departments, price ranges, and raw Amazon filters. It returns one clean JSON record per Amazon search/result page, including the products shown on that page, sponsored vs. organic counts, pagination details, breadcrumbs, related searches, meta tags, marketplace information, and request diagnostics.

Use it when you need to monitor Amazon SERPs, research competitors, compare product visibility across countries, collect search-page product lists, analyze sponsored placements, or build repeatable Amazon keyword research workflows on Apify.

The Actor uses lightweight HTTP requests instead of browser automation, making it fast and cost-efficient for search page extraction while still returning rich product-level data from each results page.

### Why use Amazon Pages Search Scraper?

Amazon search pages change depending on country, language, search term, department, filters, sort order, price range, proxy location, and sponsored inventory. This Actor is built to make those variables explicit in the output, so you can understand what Amazon returned for each requested page.

Use it for:

- **Amazon keyword research** by scraping search result pages for one or many keywords.
- **Marketplace comparison** across Amazon domains such as `amazon.it`, `amazon.com`, `amazon.de`, `amazon.fr`, `amazon.co.uk`, `amazon.es`, `amazon.co.jp`, and more.
- **SERP monitoring** for ranking position, sponsored placement, Prime visibility, badges, coupons, prices, ratings, and review counts.
- **Competitor discovery** by collecting products shown for commercial search terms.
- **Sponsored vs. organic analysis** using `sponsored_count`, `organic_count`, and product-level `is_sponsored`.
- **Price-filtered research** with Amazon's `low-price` and `high-price` query parameters.
- **Advanced Amazon filtering** by passing raw `rh` filters or marketplace-specific `p_*` query parameters copied from Amazon URLs.
- **Data pipelines** that need repeatable JSON output through Apify API, schedules, webhooks, integrations, or dataset exports.

### How to use Amazon Pages Search Scraper

1. Open the Actor in Apify Console.
2. Choose whether to scrape by **search terms** or by existing **Amazon search/result URLs**.
3. Add one or more keywords in `search_terms`, for example `phone`, `3d printer`, or `wireless earbuds`.
4. Select one or more Amazon marketplaces in `countries`, for example `IT`, `US`, `DE`, or `GB`.
5. Set `number_of_pages` to control how many result pages are requested per keyword and country.
6. Optionally choose a sort mode, department, price range, raw `rh` filter, or extra query parameters.
7. Enable `use_proxy` if you want the Actor to try Apify Residential Proxy for Amazon requests.
8. Start the Actor.
9. Download the dataset as JSON, CSV, Excel, HTML, or another Apify-supported format.

You can also provide existing Amazon result URLs through `start_urls`. This is useful when you already built the exact Amazon filter combination in your browser and want the Actor to scrape it repeatedly.

### Input

Configure the Actor from the **Input** tab.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `start_urls` | Array of URLs | No | Empty | Existing Amazon search/result page URLs. If provided, these take priority over generated keyword searches. Each URL is expanded using `number_of_pages`. |
| `search_terms` | Array of strings | No | `["phone"]` | Keywords to search on Amazon. One page record is produced for every keyword, country, and page combination. |
| `countries` | Array of strings | No | `["IT"]` | Amazon marketplaces by ISO country code. Supported values are listed below. |
| `number_of_pages` | Integer | No | `1` | Number of Amazon result pages to request for each keyword/country combination or each start URL. Minimum `1`, maximum `20` in the Apify input schema. |
| `sort_by` | String | No | `relevance` | Search sort option. Supported values: `relevance`, `featured`, `price_low_to_high`, `price_high_to_low`, `average_customer_review`, `newest_arrivals`, `best_sellers`. |
| `department` | String | No | Empty | Optional Amazon search index, for example `electronics`, `computers`, `beauty`, `fashion`, or `grocery`. Availability varies by marketplace. |
| `min_price` | String | No | Empty | Optional Amazon `low-price` query value, for example `10` or `10.50`. |
| `max_price` | String | No | Empty | Optional Amazon `high-price` query value, for example `100` or `100.00`. |
| `rh` | String | No | Empty | Raw Amazon `rh` filter copied from a filtered Amazon URL. Useful for marketplace-specific filters. |
| `extra_query` | Object | No | `{}` | Extra URL query parameters added to generated search URLs, for example `{"p_72":"1318476031"}`. |
| `use_proxy` | Boolean | No | `false` | Tries to use Apify Residential Proxy. If proxy permission is unavailable, the Actor continues without proxy unless `proxy_required` is enabled. |
| `proxy_required` | Boolean | No | `false` | Skips requests when Apify proxy is unavailable. Enable only when you prefer no data over direct requests. |
| `proxy_country` | String | No | Empty | Optional proxy country override. Leave empty to auto-select from each Amazon marketplace country. |
| `max_retries` | Integer | No | `5` | Maximum retry attempts per Amazon page. |
| `max_concurrency` | Integer | No | `3` | Number of Amazon pages processed in parallel. Lower values are gentler and often more stable for Amazon. |

#### Supported Amazon marketplaces

| Country code | Marketplace | Currency | Language hint |
| --- | --- | --- | --- |
| `US` | `amazon.com` | `USD` | English, United States |
| `CA` | `amazon.ca` | `CAD` | English/French, Canada |
| `MX` | `amazon.com.mx` | `MXN` | Spanish, Mexico |
| `BR` | `amazon.com.br` | `BRL` | Portuguese, Brazil |
| `GB` | `amazon.co.uk` | `GBP` | English, United Kingdom |
| `DE` | `amazon.de` | `EUR` | German |
| `FR` | `amazon.fr` | `EUR` | French |
| `IT` | `amazon.it` | `EUR` | Italian |
| `ES` | `amazon.es` | `EUR` | Spanish, Spain |
| `NL` | `amazon.nl` | `EUR` | Dutch |
| `SE` | `amazon.se` | `SEK` | Swedish |
| `PL` | `amazon.pl` | `PLN` | Polish |
| `TR` | `amazon.com.tr` | `TRY` | Turkish |
| `AE` | `amazon.ae` | `AED` | English/Arabic, UAE |
| `SA` | `amazon.sa` | `SAR` | Arabic/English, Saudi Arabia |
| `EG` | `amazon.eg` | `EGP` | Arabic/English, Egypt |
| `IN` | `amazon.in` | `INR` | English/Hindi, India |
| `JP` | `amazon.co.jp` | `JPY` | Japanese |
| `SG` | `amazon.sg` | `SGD` | English, Singapore |
| `AU` | `amazon.com.au` | `AUD` | English, Australia |
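
The constraints in the input and marketplace tables can be checked before starting a run. The following is a minimal sketch, not part of the Actor: `validate_input` is a hypothetical helper, and the defaults it assumes are the ones listed in the input table.

```python
# Values copied from the input table and the marketplace table in this README.
SUPPORTED_COUNTRIES = {
    "US", "CA", "MX", "BR", "GB", "DE", "FR", "IT", "ES", "NL",
    "SE", "PL", "TR", "AE", "SA", "EG", "IN", "JP", "SG", "AU",
}
SUPPORTED_SORTS = {
    "relevance", "featured", "price_low_to_high", "price_high_to_low",
    "average_customer_review", "newest_arrivals", "best_sellers",
}


def validate_input(actor_input):
    """Hypothetical pre-flight check; returns a list of problem descriptions."""
    problems = []
    for code in actor_input.get("countries", ["IT"]):
        if code not in SUPPORTED_COUNTRIES:
            problems.append(f"unsupported country: {code}")
    pages = actor_input.get("number_of_pages", 1)
    if not isinstance(pages, int) or not 1 <= pages <= 20:
        problems.append("number_of_pages must be an integer from 1 to 20")
    sort_by = actor_input.get("sort_by", "relevance")
    if sort_by not in SUPPORTED_SORTS:
        problems.append(f"unsupported sort_by: {sort_by}")
    return problems


print(validate_input({"countries": ["IT", "US"], "number_of_pages": 2}))  # []
```

An empty list means the input passes these checks; the Apify input schema remains the authoritative validation.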

### Example inputs

#### Keyword search across multiple countries

```json
{
  "search_terms": ["phone", "3d printer"],
  "countries": ["IT", "US", "DE"],
  "number_of_pages": 2,
  "sort_by": "price_low_to_high",
  "min_price": "10",
  "max_price": "300",
  "use_proxy": true,
  "proxy_required": false,
  "max_retries": 5,
  "max_concurrency": 3
}
````

This input requests:

- `phone` on Amazon Italy, United States, and Germany, pages 1-2.
- `3d printer` on Amazon Italy, United States, and Germany, pages 1-2.
- A total of `2 keywords x 3 countries x 2 pages = 12` page records.
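
The same count can be reproduced with a short sketch. `plan_requests` is a hypothetical helper used only for illustration; the order in which the Actor actually processes combinations is not documented here.

```python
from itertools import product


def plan_requests(search_terms, countries, number_of_pages):
    """Return one (keyword, country, page) tuple per expected page record."""
    pages = range(1, number_of_pages + 1)
    return list(product(search_terms, countries, pages))


plan = plan_requests(["phone", "3d printer"], ["IT", "US", "DE"], 2)
print(len(plan))  # 12
print(plan[0])    # ('phone', 'IT', 1)
```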

#### Existing filtered Amazon URL

```json
{
  "start_urls": [
    {
      "url": "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031&s=review-rank"
    }
  ],
  "number_of_pages": 3,
  "use_proxy": true
}
```

When `start_urls` is provided, the Actor keeps the query parameters from the URL and expands only the `page` parameter. The example above requests pages 1-3 of the same filtered search.
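
As a rough illustration of this expansion, the sketch below rewrites only the `page` parameter and keeps every other query parameter. `expand_pages` is a hypothetical helper, not the Actor's code. It assumes that `page` is appended even for the first page and that an existing `page` value is used as the starting page, as described in the FAQ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def expand_pages(url, number_of_pages):
    """Return one URL per requested page, changing only the `page` parameter."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Start from the page already present in the URL, otherwise from page 1.
    start = 1
    for key, value in pairs:
        if key == "page" and value.isdigit():
            start = int(value)
    kept = [(key, value) for key, value in pairs if key != "page"]
    urls = []
    for page in range(start, start + number_of_pages):
        query = urlencode(kept + [("page", str(page))])
        urls.append(urlunsplit(parts._replace(query=query)))
    return urls


for page_url in expand_pages("https://www.amazon.it/s?k=phone&page=2", 3):
    print(page_url)
```

With the input above, the sketch yields the same search for pages 2, 3, and 4.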

#### Generated search with raw Amazon filters

```json
{
  "search_terms": ["wireless earbuds"],
  "countries": ["US"],
  "number_of_pages": 1,
  "sort_by": "average_customer_review",
  "rh": "p_72:1248879011",
  "extra_query": {
    "p_36": "1253503011"
  }
}
```

Use `rh` and `extra_query` when Amazon exposes useful filters as marketplace-specific IDs. The easiest workflow is to apply filters manually on Amazon, copy the resulting URL, and reuse its `rh` or `p_*` parameters.

### Output

The Actor stores one JSON object per scraped Amazon search/result page in the default Apify dataset. Each page item contains page-level metadata and a nested `products` array.

Simplified output example:

```json
{
  "scraped_at": "2026-06-17T10:30:00.000000+00:00",
  "country": "IT",
  "marketplace": "amazon.it",
  "currency": "EUR",
  "source_url": "https://www.amazon.it/s?k=phone",
  "final_url": "https://www.amazon.it/s?k=phone",
  "blocked_or_captcha_detected": false,
  "search_term": "phone",
  "page_number": 1,
  "requested_pages": 2,
  "sort_by": "relevance",
  "filters": {
    "search_term": "phone",
    "page": 1,
    "raw_query": {
      "k": "phone"
    }
  },
  "title": "Amazon.it : phone",
  "result_count_text": "1-48 of over 10,000 results for phone",
  "result_count_estimate": 10000,
  "products_count": 48,
  "sponsored_count": 8,
  "organic_count": 40,
  "products": [
    {
      "position": 1,
      "asin": "B0EXAMPLE1",
      "title": "Example Smartphone 128GB",
      "url": "https://www.amazon.it/dp/B0EXAMPLE1",
      "image": {
        "url": "https://m.media-amazon.com/images/I/example.jpg",
        "alt": "Example Smartphone 128GB"
      },
      "price": {
        "current": "199,99 EUR",
        "raw": "199,99 EUR",
        "symbol": "EUR"
      },
      "rating": {
        "text": "4.5 out of 5 stars",
        "value": 4.5
      },
      "reviews": {
        "text": "1,234",
        "count": 1234
      },
      "badges": ["Amazon's Choice"],
      "coupon": "Save 10%",
      "delivery": {
        "messages": ["FREE delivery Tomorrow with Prime"]
      },
      "is_sponsored": true,
      "is_prime": true,
      "is_amazon_choice": true,
      "is_best_seller": false
    }
  ],
  "pagination": {
    "current_page": 1,
    "known_pages": [1, 2, 3],
    "last_known_page": 3,
    "next_page_url": "https://www.amazon.it/s?k=phone&page=2"
  },
  "breadcrumbs": ["Electronics", "Mobile Phones"],
  "related_searches": ["phone case", "smartphone", "iphone"],
  "meta": {
    "description": "Amazon search results"
  },
  "captured_request": {
    "body_length": 523841,
    "status": 200,
    "content_type": "text/html",
    "proxy_used": true
  },
  "requested_proxy_country": "IT"
}
```

### Page-level data fields

| Field | Description |
| --- | --- |
| `scraped_at` | UTC timestamp when the page was scraped. |
| `country` | Amazon marketplace country code inferred from input or URL. |
| `marketplace` | Amazon domain, such as `amazon.it`, `amazon.com`, or `amazon.de`. |
| `currency` | Expected marketplace currency. |
| `source_url` | URL requested by the Actor. |
| `final_url` | Final URL after redirects. |
| `blocked_or_captcha_detected` | Whether the returned HTML appears to be blocked or CAPTCHA-protected. |
| `search_term` | Keyword from input or parsed from the URL query parameter `k`. |
| `page_number` | Amazon result page number. |
| `requested_pages` | Number of pages requested for the keyword/country or start URL. |
| `sort_by` | Sort option requested in the Actor input when generated from keywords. |
| `filters` | Parsed URL filters such as keyword, department, sort, page, price range, `rh`, and raw query parameters. |
| `title` | HTML page title returned by Amazon. |
| `result_count_text` | Raw result count text shown by Amazon when available. |
| `result_count_estimate` | Best numeric estimate parsed from `result_count_text`. |
| `products_count` | Number of product cards parsed from the page. |
| `sponsored_count` | Number of parsed products detected as sponsored. |
| `organic_count` | Number of parsed products not detected as sponsored. |
| `products` | Product cards extracted from the search page. |
| `pagination` | Current page, visible page numbers, last known page, and next page URL when visible. |
| `breadcrumbs` | Category or department breadcrumb text detected on the page. |
| `related_searches` | Related search suggestions detected on the page. |
| `meta` | Page meta tags. |
| `captured_request` | Request diagnostics such as response status, content type, body length, and whether proxy was used. |
| `requested_proxy_country` | Proxy country selected or requested for the page. |

### Product-level data fields

Each page record contains a `products` array. Each product object may include:

| Field | Description |
| --- | --- |
| `position` | Product position among parsed result cards on that page. |
| `asin` | Amazon Standard Identification Number detected from the result card. |
| `title` | Product title shown in the search result. |
| `url` | Absolute Amazon product URL. |
| `image` | Product image URL, alt text, original `srcset`, and high-resolution candidates when available. |
| `price.current` | Best visible current price detected in the result card. |
| `price.raw` | Raw visible price text. |
| `price.symbol` | Currency symbol when available. |
| `price.whole` | Whole price component when Amazon renders split price markup. |
| `price.fraction` | Fractional price component when Amazon renders split price markup. |
| `price.list_price` | Crossed-out/list price when visible. |
| `rating.text` | Raw rating text. |
| `rating.value` | Numeric rating value parsed from the rating text. |
| `reviews.text` | Raw review count text. |
| `reviews.count` | Numeric review count. |
| `badges` | Badges such as Amazon's Choice, Best Seller, or deal labels when visible. |
| `coupon` | Coupon text when visible. |
| `delivery.messages` | Delivery, shipping, Prime, and arrival messages detected in the card. |
| `is_sponsored` | Whether the card appears sponsored. |
| `is_prime` | Whether Prime appears on the card. |
| `is_amazon_choice` | Whether Amazon's Choice appears on the card. |
| `is_best_seller` | Whether Best Seller/Bestseller appears on the card. |
| `availability_text` | Limited stock text when visible, such as "Only 3 left". |
| `data_component_type` | Amazon result card component type attribute. |
| `data_uuid` | Amazon card UUID attribute when available. |
| `raw_classes` | Raw HTML classes from the result card, useful for debugging layout variants. |
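
Because products are nested inside page records, tabular analysis usually needs one row per product. The sketch below shows one way to flatten dataset items using the field names from the tables above. `flatten_products` and the flat column names are illustrative assumptions, not part of the Actor's output.

```python
def flatten_products(page_records):
    """Turn page records into one flat dict per product card."""
    rows = []
    for page in page_records:
        for product in page.get("products", []):
            # Nested objects may be missing when Amazon omits the markup.
            price = product.get("price") or {}
            rating = product.get("rating") or {}
            rows.append({
                "country": page.get("country"),
                "search_term": page.get("search_term"),
                "page_number": page.get("page_number"),
                "position": product.get("position"),
                "asin": product.get("asin"),
                "title": product.get("title"),
                "price_current": price.get("current"),
                "rating_value": rating.get("value"),
                "is_sponsored": bool(product.get("is_sponsored")),
            })
    return rows


sample = [{
    "country": "IT",
    "search_term": "phone",
    "page_number": 1,
    "products": [
        {"position": 1, "asin": "B0EXAMPLE1", "title": "Example Smartphone 128GB",
         "price": {"current": "199,99 EUR"}, "rating": {"value": 4.5},
         "is_sponsored": True},
        {"position": 2, "asin": "B0EXAMPLE2", "title": "Card without price"},
    ],
}]
rows = flatten_products(sample)
print(len(rows))  # 2
```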

### Sorting options

| `sort_by` value | Amazon query value | Meaning |
| --- | --- | --- |
| `relevance` | None | Amazon default relevance order. |
| `featured` | `relevanceblender` | Featured/relevance blend. |
| `price_low_to_high` | `price-asc-rank` | Sort by price ascending. |
| `price_high_to_low` | `price-desc-rank` | Sort by price descending. |
| `average_customer_review` | `review-rank` | Sort by average customer review. |
| `newest_arrivals` | `date-desc-rank` | Sort by newest arrivals. |
| `best_sellers` | `exact-aware-popularity-rank` | Sort by popularity/best sellers. |

### Working with Amazon filters

Amazon uses a mix of portable query parameters and marketplace-specific filter IDs.

Portable filters generated by the Actor:

- `k`: Search keyword.
- `page`: Result page number.
- `s`: Sort mode.
- `i`: Department/search index.
- `low-price`: Minimum price.
- `high-price`: Maximum price.
- `rh`: Raw filter expression copied from Amazon.

Marketplace-specific filters:

- Amazon often uses parameters like `p_72`, `p_36`, `p_n_feature_browse-bin`, or encoded `rh` values.
- These IDs can differ by marketplace, category, and language.
- The safest approach is to apply the filters manually on Amazon, copy the final URL, and either use it as `start_urls` or pass the relevant values through `rh` and `extra_query`.

Example:

```json
{
  "search_terms": ["filament pla"],
  "countries": ["IT"],
  "rh": "p_72:1318476031",
  "extra_query": {
    "p_36": "1631630031"
  }
}
```
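
To make the mapping from input fields to Amazon query parameters concrete, here is a minimal sketch of how such a search URL could be assembled. `build_search_url` is a hypothetical helper. The parameter order and the omission of `page` on the first page are assumptions; the sort values come from the sorting options table.

```python
from urllib.parse import urlencode

# sort_by value -> Amazon `s` query value, from the sorting options table.
SORT_VALUES = {
    "relevance": None,
    "featured": "relevanceblender",
    "price_low_to_high": "price-asc-rank",
    "price_high_to_low": "price-desc-rank",
    "average_customer_review": "review-rank",
    "newest_arrivals": "date-desc-rank",
    "best_sellers": "exact-aware-popularity-rank",
}


def build_search_url(marketplace, keyword, page=1, sort_by="relevance",
                     department=None, min_price=None, max_price=None,
                     rh=None, extra_query=None):
    """Assemble an Amazon search URL from the portable filters."""
    params = {"k": keyword}
    if page > 1:
        params["page"] = page
    if SORT_VALUES[sort_by]:
        params["s"] = SORT_VALUES[sort_by]
    if department:
        params["i"] = department
    if min_price:
        params["low-price"] = min_price
    if max_price:
        params["high-price"] = max_price
    if rh:
        params["rh"] = rh
    # Marketplace-specific parameters such as p_36 are passed through as-is.
    params.update(extra_query or {})
    return f"https://www.{marketplace}/s?{urlencode(params)}"


print(build_search_url(
    "amazon.it", "filament pla",
    rh="p_72:1318476031", extra_query={"p_36": "1631630031"},
))
# https://www.amazon.it/s?k=filament+pla&rh=p_72%3A1318476031&p_36=1631630031
```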

### Proxy behavior

Amazon pages can vary significantly based on visitor country and can occasionally return blocked pages. For better stability, enable `use_proxy` and use Apify Residential Proxy when available.

When `proxy_country` is empty, the Actor chooses a proxy country from the Amazon marketplace:

- `amazon.it` uses `IT`
- `amazon.com` uses `US`
- `amazon.co.uk` uses `GB`
- `amazon.de` uses `DE`
- `amazon.fr` uses `FR`
- `amazon.es` uses `ES`
- `amazon.co.jp` uses `JP`
- And similarly for the other supported marketplaces
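
A sketch of this auto-selection, assuming each marketplace maps to the country code shown in the supported marketplaces table. `pick_proxy_country` is a hypothetical helper (Python 3.9+ for `str.removeprefix`); the Actor's own lookup may differ.

```python
# Marketplace domain -> country code, from the supported marketplaces table.
MARKETPLACE_COUNTRY = {
    "amazon.com": "US", "amazon.ca": "CA", "amazon.com.mx": "MX",
    "amazon.com.br": "BR", "amazon.co.uk": "GB", "amazon.de": "DE",
    "amazon.fr": "FR", "amazon.it": "IT", "amazon.es": "ES",
    "amazon.nl": "NL", "amazon.se": "SE", "amazon.pl": "PL",
    "amazon.com.tr": "TR", "amazon.ae": "AE", "amazon.sa": "SA",
    "amazon.eg": "EG", "amazon.in": "IN", "amazon.co.jp": "JP",
    "amazon.sg": "SG", "amazon.com.au": "AU",
}


def pick_proxy_country(marketplace, proxy_country=""):
    """Use the explicit override if set, otherwise derive it from the domain."""
    if proxy_country:
        return proxy_country.upper()
    domain = marketplace.lower().removeprefix("www.")
    return MARKETPLACE_COUNTRY.get(domain)  # None for unknown domains


print(pick_proxy_country("amazon.co.uk"))     # GB
print(pick_proxy_country("amazon.it", "de"))  # DE
```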

If the Actor logs an `Insufficient permissions` proxy error, the run likely does not have access to Apify Proxy, often because of token permissions or a limited-permissions run. With `use_proxy=true` and `proxy_required=false`, the scraper tries the proxy first and then falls back to direct requests. Set `proxy_required=true` when direct requests are not acceptable.
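
The interaction between the two flags can be summarized as a small decision function. This is a sketch of the behavior described above, not the Actor's code. `choose_transport` is a hypothetical name, and treating `proxy_required=true` as implying a proxy attempt even when `use_proxy` is `false` is an assumption.

```python
def choose_transport(use_proxy, proxy_required, proxy_available):
    """Return "proxy", "direct", or "skip" for a single request."""
    wants_proxy = use_proxy or proxy_required  # assumption, see lead-in
    if wants_proxy and proxy_available:
        return "proxy"
    if proxy_required:
        return "skip"    # no data is preferred over a direct request
    return "direct"      # fall back to a request without proxy


print(choose_transport(True, False, False))  # direct
print(choose_transport(True, True, False))   # skip
```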

### Pricing and cost estimation

Costs depend on the number of pages requested, retries, concurrency, proxy usage, page size, and Amazon response stability. This Actor uses HTTP requests rather than browser automation, so it is generally more compute-efficient than Playwright or Puppeteer-based scraping.

Main cost drivers:

- **More keywords, countries, and pages** increase the total number of requested pages.
- **Higher retries** can improve completion rate but increase runtime.
- **Residential proxies** may add proxy usage cost but are recommended for Amazon stability.
- **Higher concurrency** may finish faster, but overly aggressive concurrency can produce more unstable Amazon responses.
- **Large result pages** increase dataset size because every page contains a nested `products` array.

For testing, start with one keyword, one country, `number_of_pages: 1`, `max_retries: 5`, and `max_concurrency: 1` or `2`. Once the output looks stable, increase the scope gradually.

### Local development

The Actor can also be run locally for smoke testing.

Install dependencies:

```bash
pip install -r requirements.txt
```

Run a local keyword scrape:

```bash
python -m src --keyword "phone" --country IT --max-pages 1 --output amazon_pages.json
```

Run multiple keywords and countries:

```bash
python -m src \
  --keyword "phone" \
  --keyword "3d printer" \
  --country IT \
  --country DE \
  --max-pages 2 \
  --sort-by price_low_to_high \
  --min-price 10 \
  --max-price 300 \
  --output amazon_pages.json
```

Run from an existing Amazon search URL:

```bash
python -m src \
  --url "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031" \
  --max-pages 2 \
  --output amazon_pages.json
```

Local mode writes a JSON file to the path passed with `--output`.

### Tips and best practices

- Use `start_urls` when you need exact Amazon filters that are difficult to reproduce with the Actor's input fields.
- Keep `max_concurrency` modest for Amazon. Values between `1` and `3` are usually a good starting point.
- Enable `use_proxy` for more marketplace-consistent results.
- Leave `proxy_country` empty when scraping mixed marketplaces so the Actor can pick the country from each Amazon domain.
- Set `proxy_country` manually when you need all requests to come from a single country.
- Check `blocked_or_captcha_detected` before trusting a page with zero products.
- Check `captured_request.proxy_used` and `requested_proxy_country` when results differ from your browser.
- Use `result_count_estimate` as an estimate only. Amazon's displayed result counts are often rounded, localized, or approximate.
- Do not assume `position` equals absolute Amazon ranking across all pages. It is the parsed position among product cards on the current result page.
- Sponsored products, carousels, editorial widgets, and layout variants may affect visible position and product counts.
- For recurring monitoring, use Apify schedules and compare datasets over time.

### Troubleshooting

#### The dataset has no products for a page

Check `blocked_or_captcha_detected`, `captured_request.status`, and `final_url`. Amazon may have returned a CAPTCHA, an empty result page, a redirect, or a layout variant. Try enabling proxy, lowering concurrency, increasing retries, or using a more specific marketplace URL.
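
These checks can be automated when post-processing a dataset. The sketch below classifies a page record using only fields documented in the output section. `diagnose_empty_page` and its labels are illustrative assumptions, and the order of the checks is a judgment call.

```python
def diagnose_empty_page(page):
    """Return a short label explaining why a page record has no products."""
    if page.get("products_count", 0) > 0:
        return "ok"
    if page.get("blocked_or_captcha_detected"):
        return "blocked"
    status = (page.get("captured_request") or {}).get("status")
    if status != 200:
        return "http_error"
    if page.get("final_url") != page.get("source_url"):
        return "redirected"
    return "empty_or_layout_variant"


blocked_page = {
    "products_count": 0,
    "blocked_or_captcha_detected": True,
    "captured_request": {"status": 200},
}
print(diagnose_empty_page(blocked_page))  # blocked
```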

#### Results differ from my browser

Amazon personalizes search pages by country, language, cookies, delivery location, Prime state, availability, and A/B tests. Compare `marketplace`, `requested_proxy_country`, `filters`, and `final_url`. If you need a specific country view, use the matching marketplace and proxy country.

#### Proxy fails with insufficient permissions

Your Apify run may not have permission to use Apify Proxy. With `proxy_required=false`, the Actor falls back to direct requests. With `proxy_required=true`, requests are skipped when proxy configuration is unavailable.

#### Amazon filters do not behave as expected

Some filters are marketplace-specific. Build the filter manually on Amazon, copy the resulting URL, and use it in `start_urls`. This preserves Amazon's own filter query.

#### Prices are missing for some products

Amazon search result cards do not always show prices. Missing prices can happen when products are unavailable, have multiple offers, require variant selection, are sponsored widgets with reduced markup, or are rendered differently for the request location.

#### Sponsored detection is not perfect

The Actor detects common sponsored labels in multiple languages and Amazon markup variants, but Amazon changes labels and layouts frequently. Use `is_sponsored`, `sponsored_count`, and `organic_count` as strong practical signals, not legal-grade classification.
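
If you want to cross-check the page-level counts against the product list, a sketch like the following recomputes them from `is_sponsored`. `sponsored_summary` is a hypothetical helper; cards without the flag are counted as organic, which is an assumption.

```python
def sponsored_summary(page):
    """Recompute sponsored and organic counts from the products array."""
    products = page.get("products", [])
    sponsored = sum(1 for product in products if product.get("is_sponsored"))
    total = len(products)
    return {
        "sponsored": sponsored,
        "organic": total - sponsored,
        "sponsored_share": sponsored / total if total else 0.0,
    }


page = {"products": [
    {"asin": "B0EXAMPLE1", "is_sponsored": True},
    {"asin": "B0EXAMPLE2", "is_sponsored": False},
    {"asin": "B0EXAMPLE3"},
    {"asin": "B0EXAMPLE4", "is_sponsored": False},
]}
print(sponsored_summary(page))
# {'sponsored': 1, 'organic': 3, 'sponsored_share': 0.25}
```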

### FAQ

#### Does this Actor scrape product detail pages?

No. This Actor scrapes Amazon search/result pages and returns product cards from those pages. For deep product detail data such as full bullet points, descriptions, seller details, variations, and buy box diagnostics, use a product detail page scraper.

#### Does it scrape reviews?

No. It extracts rating and review count when visible in the search result card, but it does not open review pages or collect individual reviews.

#### Does it use a browser?

No. It uses HTTP requests through `httpx` and parses HTML with `selectolax`. This keeps runs lightweight and fast, but JavaScript-only page states may not be available.

#### Can I scrape multiple countries in one run?

Yes. Pass multiple country codes in `countries`. The Actor creates one request for every keyword, country, and page combination.

#### Can I scrape pages 2, 3, and 4 from an existing URL?

Yes. Put the filtered Amazon URL in `start_urls` and set `number_of_pages` to `3`. If the URL already contains `page=2`, the Actor starts from page 2 and expands to pages 2, 3, and 4.

#### Can I pass raw Amazon sort values?

The input schema exposes supported `sort_by` options. For advanced or marketplace-specific sort behavior, use `start_urls` with the exact Amazon URL or pass query parameters through `extra_query`.

#### Is it legal to scrape Amazon?

Scraping publicly available web pages may be legal in many contexts, but you are responsible for how you use this Actor. Review Amazon's Terms of Service, applicable laws, privacy rules, and any contractual obligations before scraping or storing data. Do not scrape personal or sensitive data unless you have a lawful basis and permission where required.

#### Where can I report problems or request changes?

Use the Actor's **Issues** tab on Apify to report bugs, missing fields, marketplace-specific issues, or feature requests. Include the input, marketplace, URL, and a small sample of the output whenever possible.

# Actor input Schema

## `start_urls` (type: `array`):

Optional Amazon search/result page URLs. If provided, the scraper expands each URL using Number of pages to scrape, e.g. page 1 through page N.

## `search_terms` (type: `array`):

Keywords to search on Amazon. One page JSON item is produced for every keyword/country/page combination.

## `countries` (type: `array`):

Amazon marketplaces by ISO country code. Supported: US, CA, MX, BR, GB, DE, FR, IT, ES, NL, SE, PL, TR, AE, SA, EG, IN, JP, SG, AU.

## `number_of_pages` (type: `integer`):

How many Amazon result pages to request for each search term and country.

## `sort_by` (type: `string`):

Amazon search sort option. Raw Amazon sort values can be passed with `extra_query` if needed.

## `department` (type: `string`):

Optional Amazon department/search index, for example electronics, fashion, computers, beauty, grocery. Availability varies by country.

## `min_price` (type: `string`):

Optional Amazon low-price query value. Use the marketplace's visible price format, e.g. 10 or 10.50.

## `max_price` (type: `string`):

Optional Amazon high-price query value. Use the marketplace's visible price format, e.g. 100 or 100.00.

## `rh` (type: `string`):

Optional raw Amazon rh filter, copied from a filtered Amazon URL. Useful for filters Amazon exposes only as marketplace-specific IDs.

## `extra_query` (type: `object`):

Optional extra URL query parameters to add to generated search URLs, e.g. `{"p_72":"1318476031"}`. Use for Amazon filters that are possible but marketplace-specific.

## `use_proxy` (type: `boolean`):

Try to use Apify Residential Proxy for Amazon requests. If the run does not have proxy permission, the scraper continues without proxy unless Proxy required is enabled.

## `proxy_required` (type: `boolean`):

Fail/skip requests when Apify proxy is unavailable. Enable this only when you prefer no data over direct requests.

## `proxy_country` (type: `string`):

Optional ISO country code for the Apify Residential Proxy. Leave empty to auto-select from each Amazon marketplace country.

## `max_retries` (type: `integer`):

Maximum retries per Amazon page.

## `max_concurrency` (type: `integer`):

How many Amazon pages to scrape in parallel.

## Actor input object example

```json
{
  "start_urls": [
    {
      "url": "https://www.amazon.it/s?k=phone"
    }
  ],
  "search_terms": [
    "phone",
    "3d printer"
  ],
  "countries": [
    "IT",
    "US",
    "DE"
  ],
  "number_of_pages": 1,
  "sort_by": "relevance",
  "department": "",
  "min_price": "",
  "max_price": "",
  "rh": "",
  "extra_query": {},
  "use_proxy": false,
  "proxy_required": false,
  "proxy_country": "",
  "max_retries": 5,
  "max_concurrency": 3
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "start_urls": [
        {
            "url": "https://www.amazon.it/s?k=phone"
        }
    ],
    "search_terms": [
        "phone",
        "3d printer"
    ],
    "countries": [
        "IT",
        "US",
        "DE"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("devlory/amazonpagescraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "start_urls": [{ "url": "https://www.amazon.it/s?k=phone" }],
    "search_terms": [
        "phone",
        "3d printer",
    ],
    "countries": [
        "IT",
        "US",
        "DE",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("devlory/amazonpagescraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
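
If you prefer not to install the client library, the synchronous REST endpoint listed in the OpenAPI specification can be called with the Python standard library. This is a minimal sketch: `build_run_request` and `run_actor` are hypothetical helper names, and the timeout value is an arbitrary assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/devlory~amazonpagescraper/run-sync-get-dataset-items"


def build_run_request(token, actor_input):
    """Build the POST request; the token is passed as a query parameter."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def run_actor(token, actor_input, timeout=300):
    """Run the Actor synchronously and return the dataset items."""
    request = build_run_request(token, actor_input)
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real, billable run, so it is left commented out):
# items = run_actor("<YOUR_API_TOKEN>", {"search_terms": ["phone"], "countries": ["IT"]})
```

Long runs may exceed the time limit of the synchronous endpoint; in that case the `/runs` endpoint or the client library is a better fit.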

## CLI example

```bash
echo '{
  "start_urls": [
    {
      "url": "https://www.amazon.it/s?k=phone"
    }
  ],
  "search_terms": [
    "phone",
    "3d printer"
  ],
  "countries": [
    "IT",
    "US",
    "DE"
  ]
}' |
apify call devlory/amazonpagescraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=devlory/amazonpagescraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "AmazonPageScraper",
        "description": "The best amazon page scraper!\nUse this scraper to get all the products on an Amazon page!",
        "version": "0.0",
        "x-build-id": "7WPB3yfM4EvPuFgr9"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/devlory~amazonpagescraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-devlory-amazonpagescraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/devlory~amazonpagescraper/runs": {
            "post": {
                "operationId": "runs-sync-devlory-amazonpagescraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/devlory~amazonpagescraper/run-sync": {
            "post": {
                "operationId": "run-sync-devlory-amazonpagescraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "start_urls": {
                        "title": "Existing Amazon search/result URLs",
                        "type": "array",
                        "description": "Optional Amazon search/result page URLs. If provided, the scraper expands each URL according to number_of_pages (Number of pages to scrape), e.g. page 1 through page N.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "search_terms": {
                        "title": "Search terms",
                        "type": "array",
                        "description": "Keywords to search on Amazon. One page-level JSON item is produced for every keyword/country/page combination.",
                        "default": [
                            "phone"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "countries": {
                        "title": "Countries",
                        "type": "array",
                        "description": "Amazon marketplaces by ISO country code. Supported: US, CA, MX, BR, GB, DE, FR, IT, ES, NL, SE, PL, TR, AE, SA, EG, IN, JP, SG, AU.",
                        "default": [
                            "IT"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "number_of_pages": {
                        "title": "Number of pages to scrape",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "How many Amazon result pages to request for each search term and country.",
                        "default": 1
                    },
                    "sort_by": {
                        "title": "Sort by",
                        "enum": [
                            "relevance",
                            "featured",
                            "price_low_to_high",
                            "price_high_to_low",
                            "average_customer_review",
                            "newest_arrivals",
                            "best_sellers"
                        ],
                        "type": "string",
                        "description": "Amazon search sort option. Raw Amazon sort values can be passed with extra_query if needed.",
                        "default": "relevance"
                    },
                    "department": {
                        "title": "Department",
                        "type": "string",
                        "description": "Optional Amazon department/search index, for example electronics, fashion, computers, beauty, grocery. Availability varies by country.",
                        "default": ""
                    },
                    "min_price": {
                        "title": "Minimum price",
                        "type": "string",
                        "description": "Optional Amazon low-price query value. Use the marketplace's visible price format, e.g. 10 or 10.50.",
                        "default": ""
                    },
                    "max_price": {
                        "title": "Maximum price",
                        "type": "string",
                        "description": "Optional Amazon high-price query value. Use the marketplace's visible price format, e.g. 100 or 100.00.",
                        "default": ""
                    },
                    "rh": {
                        "title": "Raw Amazon filter query (rh)",
                        "type": "string",
                        "description": "Optional raw Amazon rh filter, copied from a filtered Amazon URL. Useful for filters Amazon exposes only as marketplace-specific IDs.",
                        "default": ""
                    },
                    "extra_query": {
                        "title": "Extra query parameters",
                        "type": "object",
                        "description": "Optional extra URL query parameters to add to generated search URLs, e.g. {\"p_72\":\"1318476031\"}. Use for Amazon filters that are possible but marketplace-specific.",
                        "default": {}
                    },
                    "use_proxy": {
                        "title": "Use Apify residential proxy if available",
                        "type": "boolean",
                        "description": "Try to use Apify Residential Proxy for Amazon requests. If the run does not have proxy permission, the scraper continues without proxy unless proxy_required (Proxy required) is enabled.",
                        "default": false
                    },
                    "proxy_required": {
                        "title": "Proxy required",
                        "type": "boolean",
                        "description": "Fail/skip requests when Apify proxy is unavailable. Enable this only when you prefer no data over direct requests.",
                        "default": false
                    },
                    "proxy_country": {
                        "title": "Proxy country override",
                        "type": "string",
                        "description": "Optional ISO country code for the Apify Residential Proxy. Leave empty to auto-select from each Amazon marketplace country.",
                        "default": ""
                    },
                    "max_retries": {
                        "title": "Max retries",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Maximum retries per Amazon page.",
                        "default": 5
                    },
                    "max_concurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "How many Amazon pages to scrape in parallel.",
                        "default": 3
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
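
The specification above can also be used without a client library. The following is a rough sketch, using only the Python standard library, of checking an input against a few of the `inputSchema` constraints and then calling the `run-sync-get-dataset-items` endpoint. The function names are hypothetical, the checks cover only some fields, the page count ignores `start_urls`, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "devlory~amazonpagescraper"
# Marketplaces listed in the "countries" description of inputSchema.
SUPPORTED_COUNTRIES = {
    "US", "CA", "MX", "BR", "GB", "DE", "FR", "IT", "ES", "NL",
    "SE", "PL", "TR", "AE", "SA", "EG", "IN", "JP", "SG", "AU",
}
# (minimum, maximum) bounds copied from inputSchema.
INTEGER_BOUNDS = {
    "number_of_pages": (1, 20),
    "max_retries": (1, 10),
    "max_concurrency": (1, 20),
}


def validate_input(run_input: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    for country in run_input.get("countries", []):
        if country not in SUPPORTED_COUNTRIES:
            problems.append(f"unsupported country: {country}")
    for field, (low, high) in INTEGER_BOUNDS.items():
        value = run_input.get(field)
        if value is None:
            continue
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not (is_int and low <= value <= high):
            problems.append(f"{field} must be an integer between {low} and {high}")
    for entry in run_input.get("start_urls", []):
        if not isinstance(entry, dict) or "url" not in entry:
            problems.append("each start_urls entry needs a 'url' key")
    return problems


def expected_search_pages(run_input: dict[str, Any]) -> int:
    """Count keyword/country/page combinations, falling back to the schema defaults."""
    terms = run_input.get("search_terms", ["phone"])
    countries = run_input.get("countries", ["IT"])
    pages = run_input.get("number_of_pages", 1)
    return len(terms) * len(countries) * pages


def sync_items_url(token: str) -> str:
    """Build the endpoint URL with the token passed as a query parameter."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"


def run_sync(token: str, run_input: dict[str, Any]) -> Any:
    """Validate the input, POST it as JSON, and return the decoded response body."""
    problems = validate_input(run_input)
    if problems:
        raise ValueError("; ".join(problems))
    request = urllib.request.Request(
        sync_items_url(token),
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For the example input used earlier (two search terms, three countries, one page by default), `expected_search_pages` gives 6 combinations. Validating locally before calling the endpoint avoids starting a paid run with an input the Actor is likely to reject.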
