AmazonPageScraper
Pricing
from $25.00 / 1,000 results
Use this scraper to get all the products listed on an Amazon search results page.
Developer
Lorenzo Cerqua
Maintained by Community
Amazon Pages Search Scraper
What does Amazon Pages Search Scraper do?
Amazon Pages Search Scraper extracts structured data from Amazon search result pages across multiple marketplaces, keywords, pages, sort modes, departments, price ranges, and raw Amazon filters. It returns one clean JSON record per Amazon search/result page, including the products shown on that page, sponsored vs. organic counts, pagination details, breadcrumbs, related searches, meta tags, marketplace information, and request diagnostics.
Use it when you need to monitor Amazon SERPs, research competitors, compare product visibility across countries, collect search-page product lists, analyze sponsored placements, or build repeatable Amazon keyword research workflows on Apify.
The Actor uses lightweight HTTP requests instead of browser automation, making it fast and cost-efficient for search page extraction while still returning rich product-level data from each results page.
Why use Amazon Pages Search Scraper?
Amazon search pages change depending on country, language, search term, department, filters, sort order, price range, proxy location, and sponsored inventory. This Actor is built to make those variables explicit in the output, so you can understand what Amazon returned for each requested page.
Use it for:
- Amazon keyword research by scraping search result pages for one or many keywords.
- Marketplace comparison across Amazon domains such as `amazon.it`, `amazon.com`, `amazon.de`, `amazon.fr`, `amazon.co.uk`, `amazon.es`, `amazon.co.jp`, and more.
- SERP monitoring for ranking position, sponsored placement, Prime visibility, badges, coupons, prices, ratings, and review counts.
- Competitor discovery by collecting products shown for commercial search terms.
- Sponsored vs. organic analysis using `sponsored_count`, `organic_count`, and product-level `is_sponsored`.
- Price-filtered research with Amazon's `low-price` and `high-price` query parameters.
- Advanced Amazon filtering by passing raw `rh` filters or marketplace-specific `p_*` query parameters copied from Amazon URLs.
- Data pipelines that need repeatable JSON output through Apify API, schedules, webhooks, integrations, or dataset exports.
How to use Amazon Pages Search Scraper
- Open the Actor in Apify Console.
- Choose whether to scrape by search terms or by existing Amazon search/result URLs.
- Add one or more keywords in `search_terms`, for example `phone`, `3d printer`, or `wireless earbuds`.
- Select one or more Amazon marketplaces in `countries`, for example `IT`, `US`, `DE`, or `GB`.
- Set `number_of_pages` to control how many result pages are requested per keyword and country.
- Optionally choose a sort mode, department, price range, raw `rh` filter, or extra query parameters.
- Enable `use_proxy` if you want the Actor to try Apify Residential Proxy for Amazon requests.
- Start the Actor.
- Download the dataset as JSON, CSV, Excel, HTML, or another Apify-supported format.
You can also provide existing Amazon result URLs through start_urls. This is useful when you already built the exact Amazon filter combination in your browser and want the Actor to scrape it repeatedly.
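For repeatable runs outside the Console, the same input can be sent through the Apify API. The following is a minimal sketch using the `apify-client` Python package. The Actor ID is a placeholder (copy the real one from the Actor's page), `build_run_input` and `run_scraper` are hypothetical helper names, and the API token is assumed to be stored in an `APIFY_TOKEN` environment variable.

```python
import os


def build_run_input(search_terms, countries, number_of_pages=1, **options):
    """Assemble an input object using the Actor's documented input field names."""
    run_input = {
        "search_terms": list(search_terms),
        "countries": list(countries),
        "number_of_pages": number_of_pages,
    }
    run_input.update(options)  # e.g. sort_by="price_low_to_high", use_proxy=True
    return run_input


def run_scraper(actor_id, run_input):
    """Start the Actor, wait for it to finish, and return its page records."""
    # Imported here so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


example_input = build_run_input(["phone"], ["IT", "US"], number_of_pages=2, use_proxy=True)
# Placeholder Actor ID, replace before running:
# pages = run_scraper("<username>/<actor-name>", example_input)
```

Keeping input assembly separate from the API call makes it easy to log or review the exact input before spending any credits.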
Input
Configure the Actor from the Input tab.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `start_urls` | Array of URLs | No | Empty | Existing Amazon search/result page URLs. If provided, these take priority over generated keyword searches. Each URL is expanded using number_of_pages. |
| `search_terms` | Array of strings | No | ["phone"] | Keywords to search on Amazon. One page record is produced for every keyword, country, and page combination. |
| `countries` | Array of strings | No | ["IT"] | Amazon marketplaces by ISO country code. Supported values are listed below. |
| `number_of_pages` | Integer | No | 1 | Number of Amazon result pages to request for each keyword/country combination or each start URL. Minimum 1, maximum 20 in the Apify input schema. |
| `sort_by` | String | No | relevance | Search sort option. Supported values: relevance, featured, price_low_to_high, price_high_to_low, average_customer_review, newest_arrivals, best_sellers. |
| `department` | String | No | Empty | Optional Amazon search index, for example electronics, computers, beauty, fashion, or grocery. Availability varies by marketplace. |
| `min_price` | String | No | Empty | Optional Amazon low-price query value, for example 10 or 10.50. |
| `max_price` | String | No | Empty | Optional Amazon high-price query value, for example 100 or 100.00. |
| `rh` | String | No | Empty | Raw Amazon rh filter copied from a filtered Amazon URL. Useful for marketplace-specific filters. |
| `extra_query` | Object | No | {} | Extra URL query parameters added to generated search URLs, for example {"p_72":"1318476031"}. |
| `use_proxy` | Boolean | No | false | Tries to use Apify Residential Proxy. If proxy permission is unavailable, the Actor continues without proxy unless proxy_required is enabled. |
| `proxy_required` | Boolean | No | false | Skips requests when Apify proxy is unavailable. Enable only when you prefer no data over direct requests. |
| `proxy_country` | String | No | Empty | Optional proxy country override. Leave empty to auto-select from each Amazon marketplace country. |
| `max_retries` | Integer | No | 5 | Maximum retry attempts per Amazon page. |
| `max_concurrency` | Integer | No | 3 | Number of Amazon pages processed in parallel. Lower values are gentler and often more stable for Amazon. |
Supported Amazon marketplaces
| Country code | Marketplace | Currency | Language hint |
|---|---|---|---|
| `US` | amazon.com | USD | English, United States |
| `CA` | amazon.ca | CAD | English/French, Canada |
| `MX` | amazon.com.mx | MXN | Spanish, Mexico |
| `BR` | amazon.com.br | BRL | Portuguese, Brazil |
| `GB` | amazon.co.uk | GBP | English, United Kingdom |
| `DE` | amazon.de | EUR | German |
| `FR` | amazon.fr | EUR | French |
| `IT` | amazon.it | EUR | Italian |
| `ES` | amazon.es | EUR | Spanish, Spain |
| `NL` | amazon.nl | EUR | Dutch |
| `SE` | amazon.se | SEK | Swedish |
| `PL` | amazon.pl | PLN | Polish |
| `TR` | amazon.com.tr | TRY | Turkish |
| `AE` | amazon.ae | AED | English/Arabic, UAE |
| `SA` | amazon.sa | SAR | Arabic/English, Saudi Arabia |
| `EG` | amazon.eg | EGP | Arabic/English, Egypt |
| `IN` | amazon.in | INR | English/Hindi, India |
| `JP` | amazon.co.jp | JPY | Japanese |
| `SG` | amazon.sg | SGD | English, Singapore |
| `AU` | amazon.com.au | AUD | English, Australia |
Example inputs
Keyword search across multiple countries
```json
{
  "search_terms": ["phone", "3d printer"],
  "countries": ["IT", "US", "DE"],
  "number_of_pages": 2,
  "sort_by": "price_low_to_high",
  "min_price": "10",
  "max_price": "300",
  "use_proxy": true,
  "proxy_required": false,
  "max_retries": 5,
  "max_concurrency": 3
}
```
This input requests:
- `phone` on Amazon Italy, United States, and Germany, pages 1-2.
- `3d printer` on Amazon Italy, United States, and Germany, pages 1-2.
- A total of `2 keywords x 3 countries x 2 pages = 12` page records.
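The arithmetic above can be checked by enumerating the combinations directly. This is only an illustration of the counting rule. The iteration order and the `page_requests` name are assumptions, not the Actor's internal scheduling.

```python
from itertools import product

search_terms = ["phone", "3d printer"]
countries = ["IT", "US", "DE"]
number_of_pages = 2

# One request (and one page record) per keyword, country, and page combination.
page_requests = [
    {"search_term": term, "country": country, "page_number": page}
    for term, country, page in product(search_terms, countries, range(1, number_of_pages + 1))
]

print(len(page_requests))  # 12
```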
Existing filtered Amazon URL
```json
{
  "start_urls": [
    {"url": "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031&s=review-rank"}
  ],
  "number_of_pages": 3,
  "use_proxy": true
}
```
When start_urls is provided, the Actor keeps the query parameters from the URL and expands only the page parameter. The example above requests pages 1-3 of the same filtered search.
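A rough sketch of this kind of expansion, using only the standard library, is shown below. The `expand_pages` helper is hypothetical and the Actor's real implementation may differ, for example in whether it writes `page=1` explicitly for the first page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def expand_pages(url, number_of_pages):
    """Return one URL per page, changing only the `page` query parameter."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Start from the page already present in the URL, otherwise from page 1.
    first_page = int(dict(query).get("page", 1))
    urls = []
    for page in range(first_page, first_page + number_of_pages):
        page_query = [(k, v) for k, v in query if k != "page"] + [("page", str(page))]
        urls.append(urlunsplit(parts._replace(query=urlencode(page_query))))
    return urls


for page_url in expand_pages(
    "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031&s=review-rank", 3
):
    print(page_url)
```

All other query parameters, including the encoded `rh` filter and the sort value, are carried over unchanged.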
Generated search with raw Amazon filters
```json
{
  "search_terms": ["wireless earbuds"],
  "countries": ["US"],
  "number_of_pages": 1,
  "sort_by": "average_customer_review",
  "rh": "p_72:1248879011",
  "extra_query": {"p_36": "1253503011"}
}
```
Use rh and extra_query when Amazon exposes useful filters as marketplace-specific IDs. The easiest workflow is to apply filters manually on Amazon, copy the resulting URL, and reuse its rh or p_* parameters.
Output
The Actor stores one JSON object per scraped Amazon search/result page in the default Apify dataset. Each page item contains page-level metadata and a nested products array.
Simplified output example:
```json
{
  "scraped_at": "2026-06-17T10:30:00.000000+00:00",
  "country": "IT",
  "marketplace": "amazon.it",
  "currency": "EUR",
  "source_url": "https://www.amazon.it/s?k=phone",
  "final_url": "https://www.amazon.it/s?k=phone",
  "blocked_or_captcha_detected": false,
  "search_term": "phone",
  "page_number": 1,
  "requested_pages": 2,
  "sort_by": "relevance",
  "filters": {"search_term": "phone", "page": 1, "raw_query": {"k": "phone"}},
  "title": "Amazon.it : phone",
  "result_count_text": "1-48 of over 10,000 results for phone",
  "result_count_estimate": 10000,
  "products_count": 48,
  "sponsored_count": 8,
  "organic_count": 40,
  "products": [
    {
      "position": 1,
      "asin": "B0EXAMPLE1",
      "title": "Example Smartphone 128GB",
      "url": "https://www.amazon.it/dp/B0EXAMPLE1",
      "image": {
        "url": "https://m.media-amazon.com/images/I/example.jpg",
        "alt": "Example Smartphone 128GB"
      },
      "price": {"current": "199,99 EUR", "raw": "199,99 EUR", "symbol": "EUR"},
      "rating": {"text": "4.5 out of 5 stars", "value": 4.5},
      "reviews": {"text": "1,234", "count": 1234},
      "badges": ["Amazon's Choice"],
      "coupon": "Save 10%",
      "delivery": {"messages": ["FREE delivery Tomorrow with Prime"]},
      "is_sponsored": true,
      "is_prime": true,
      "is_amazon_choice": true,
      "is_best_seller": false
    }
  ],
  "pagination": {
    "current_page": 1,
    "known_pages": [1, 2, 3],
    "last_known_page": 3,
    "next_page_url": "https://www.amazon.it/s?k=phone&page=2"
  },
  "breadcrumbs": ["Electronics", "Mobile Phones"],
  "related_searches": ["phone case", "smartphone", "iphone"],
  "meta": {"description": "Amazon search results"},
  "captured_request": {
    "body_length": 523841,
    "status": 200,
    "content_type": "text/html",
    "proxy_used": true
  },
  "requested_proxy_country": "IT"
}
```
Page-level data fields
| Field | Description |
|---|---|
| `scraped_at` | UTC timestamp when the page was scraped. |
| `country` | Amazon marketplace country code inferred from input or URL. |
| `marketplace` | Amazon domain, such as amazon.it, amazon.com, or amazon.de. |
| `currency` | Expected marketplace currency. |
| `source_url` | URL requested by the Actor. |
| `final_url` | Final URL after redirects. |
| `blocked_or_captcha_detected` | Whether the returned HTML appears to be blocked or CAPTCHA-protected. |
| `search_term` | Keyword from input or parsed from the URL query parameter k. |
| `page_number` | Amazon result page number. |
| `requested_pages` | Number of pages requested for the keyword/country or start URL. |
| `sort_by` | Sort option requested in the Actor input when generated from keywords. |
| `filters` | Parsed URL filters such as keyword, department, sort, page, price range, rh, and raw query parameters. |
| `title` | HTML page title returned by Amazon. |
| `result_count_text` | Raw result count text shown by Amazon when available. |
| `result_count_estimate` | Best numeric estimate parsed from result_count_text. |
| `products_count` | Number of product cards parsed from the page. |
| `sponsored_count` | Number of parsed products detected as sponsored. |
| `organic_count` | Number of parsed products not detected as sponsored. |
| `products` | Product cards extracted from the search page. |
| `pagination` | Current page, visible page numbers, last known page, and next page URL when visible. |
| `breadcrumbs` | Category or department breadcrumb text detected on the page. |
| `related_searches` | Related search suggestions detected on the page. |
| `meta` | Page meta tags. |
| `captured_request` | Request diagnostics such as response status, content type, body length, and whether proxy was used. |
| `requested_proxy_country` | Proxy country selected or requested for the page. |
Product-level data fields
Each page record contains a products array. Each product object may include:
| Field | Description |
|---|---|
| `position` | Product position among parsed result cards on that page. |
| `asin` | Amazon Standard Identification Number detected from the result card. |
| `title` | Product title shown in the search result. |
| `url` | Absolute Amazon product URL. |
| `image` | Product image URL, alt text, original srcset, and high-resolution candidates when available. |
| `price.current` | Best visible current price detected in the result card. |
| `price.raw` | Raw visible price text. |
| `price.symbol` | Currency symbol when available. |
| `price.whole` | Whole price component when Amazon renders split price markup. |
| `price.fraction` | Fractional price component when Amazon renders split price markup. |
| `price.list_price` | Crossed-out/list price when visible. |
| `rating.text` | Raw rating text. |
| `rating.value` | Numeric rating value parsed from the rating text. |
| `reviews.text` | Raw review count text. |
| `reviews.count` | Numeric review count. |
| `badges` | Badges such as Amazon's Choice, Best Seller, or deal labels when visible. |
| `coupon` | Coupon text when visible. |
| `delivery.messages` | Delivery, shipping, Prime, and arrival messages detected in the card. |
| `is_sponsored` | Whether the card appears sponsored. |
| `is_prime` | Whether Prime appears on the card. |
| `is_amazon_choice` | Whether Amazon's Choice appears on the card. |
| `is_best_seller` | Whether Best Seller/Bestseller appears on the card. |
| `availability_text` | Limited stock text when visible, such as "Only 3 left". |
| `data_component_type` | Amazon result card component type attribute. |
| `data_uuid` | Amazon card UUID attribute when available. |
| `raw_classes` | Raw HTML classes from the result card, useful for debugging layout variants. |
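Because products are nested inside each page record, spreadsheet-style analysis usually needs one row per product. The sketch below shows one way to flatten a page record. `flatten_products` is a hypothetical helper, the chosen columns are arbitrary, and the second sample product is made up to show how missing fields are handled.

```python
def flatten_products(page):
    """Yield one flat row per product, carrying a few page-level fields along."""
    for product in page.get("products", []):
        # Nested objects may be absent when Amazon omits them from a card.
        price = product.get("price") or {}
        rating = product.get("rating") or {}
        reviews = product.get("reviews") or {}
        yield {
            "country": page.get("country"),
            "search_term": page.get("search_term"),
            "page_number": page.get("page_number"),
            "position": product.get("position"),
            "asin": product.get("asin"),
            "title": product.get("title"),
            "price_current": price.get("current"),
            "rating_value": rating.get("value"),
            "reviews_count": reviews.get("count"),
            "is_sponsored": product.get("is_sponsored"),
        }


page = {
    "country": "IT",
    "search_term": "phone",
    "page_number": 1,
    "products": [
        {"position": 1, "asin": "B0EXAMPLE1", "title": "Example Smartphone 128GB",
         "price": {"current": "199,99 EUR"}, "rating": {"value": 4.5},
         "reviews": {"count": 1234}, "is_sponsored": True},
        {"position": 2, "asin": "B0EXAMPLE2", "title": "Example Phone Case"},
    ],
}
rows = list(flatten_products(page))
```

The resulting rows can be passed to `csv.DictWriter` or a DataFrame constructor without further reshaping.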
Sorting options
| `sort_by` value | Amazon query value | Meaning |
|---|---|---|
| `relevance` | None | Amazon default relevance order. |
| `featured` | relevanceblender | Featured/relevance blend. |
| `price_low_to_high` | price-asc-rank | Sort by price ascending. |
| `price_high_to_low` | price-desc-rank | Sort by price descending. |
| `average_customer_review` | review-rank | Sort by average customer review. |
| `newest_arrivals` | date-desc-rank | Sort by newest arrivals. |
| `best_sellers` | exact-aware-popularity-rank | Sort by popularity/best sellers. |
Working with Amazon filters
Amazon uses a mix of portable query parameters and marketplace-specific filter IDs.
Portable filters generated by the Actor:
- `k`: Search keyword.
- `page`: Result page number.
- `s`: Sort mode.
- `i`: Department/search index.
- `low-price`: Minimum price.
- `high-price`: Maximum price.
- `rh`: Raw filter expression copied from Amazon.
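To make the mapping from input fields to these query parameters concrete, here is a small sketch that composes a search URL with the standard library. `build_search_url` is a hypothetical helper rather than the Actor's actual code. The sort values are copied from the sorting options table, and the parameter order is an arbitrary choice.

```python
from urllib.parse import urlencode

# sort_by input value -> Amazon `s` query value (None means the parameter is omitted).
SORT_VALUES = {
    "relevance": None,
    "featured": "relevanceblender",
    "price_low_to_high": "price-asc-rank",
    "price_high_to_low": "price-desc-rank",
    "average_customer_review": "review-rank",
    "newest_arrivals": "date-desc-rank",
    "best_sellers": "exact-aware-popularity-rank",
}


def build_search_url(domain, keyword, page=1, sort_by="relevance", department=None,
                     min_price=None, max_price=None, rh=None, extra_query=None):
    """Compose a search URL from the portable parameters; unset ones are omitted."""
    params = {
        "k": keyword,
        "page": page,
        "s": SORT_VALUES[sort_by],
        "i": department,
        "low-price": min_price,
        "high-price": max_price,
        "rh": rh,
    }
    params.update(extra_query or {})  # marketplace-specific p_* parameters
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"https://www.{domain}/s?{query}"


print(build_search_url("amazon.it", "3d printer", sort_by="price_low_to_high",
                       min_price="10", max_price="300"))
# https://www.amazon.it/s?k=3d+printer&page=1&s=price-asc-rank&low-price=10&high-price=300
```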
Marketplace-specific filters:
- Amazon often uses parameters like `p_72`, `p_36`, `p_n_feature_browse-bin`, or encoded `rh` values.
- These IDs can differ by marketplace, category, and language.
- The safest approach is to apply the filters manually on Amazon, copy the final URL, and either use it as `start_urls` or pass the relevant values through `rh` and `extra_query`.
Example:
```json
{
  "search_terms": ["filament pla"],
  "countries": ["IT"],
  "rh": "p_72:1318476031",
  "extra_query": {"p_36": "1631630031"}
}
```
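When starting from a URL copied out of the browser, the `rh` value and any `p_*` parameters can be pulled out programmatically instead of by hand. The sketch below uses only the standard library. `filters_from_url` is a hypothetical helper, and the sample URL simply combines the filter values from the example above.

```python
from urllib.parse import parse_qs, urlsplit


def filters_from_url(url):
    """Split a copied Amazon URL into an `rh` value and `p_*` extra parameters."""
    query = parse_qs(urlsplit(url).query)  # values are URL-decoded lists
    rh = query.get("rh", [None])[0]
    extra_query = {key: values[0] for key, values in query.items() if key.startswith("p_")}
    return {"rh": rh, "extra_query": extra_query}


copied = "https://www.amazon.it/s?k=filament+pla&rh=p_72%3A1318476031&p_36=1631630031"
print(filters_from_url(copied))
# {'rh': 'p_72:1318476031', 'extra_query': {'p_36': '1631630031'}}
```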
Proxy behavior
Amazon pages can vary significantly based on visitor country and can occasionally return blocked pages. For better stability, enable use_proxy and use Apify Residential Proxy when available.
When proxy_country is empty, the Actor chooses a proxy country from the Amazon marketplace:
- `amazon.it` uses `IT`
- `amazon.com` uses `US`
- `amazon.co.uk` uses `GB`
- `amazon.de` uses `DE`
- `amazon.fr` uses `FR`
- `amazon.es` uses `ES`
- `amazon.co.jp` uses `JP`
- And similarly for the other supported marketplaces
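The automatic selection can be pictured as a lookup from marketplace domain to country code, with an explicit `proxy_country` taking precedence. The sketch below covers only the marketplaces listed above, and `pick_proxy_country` is a hypothetical name, not a function exposed by the Actor.

```python
from urllib.parse import urlsplit

# Marketplace domain -> proxy country, limited to the pairs listed above.
MARKETPLACE_COUNTRY = {
    "amazon.it": "IT",
    "amazon.com": "US",
    "amazon.co.uk": "GB",
    "amazon.de": "DE",
    "amazon.fr": "FR",
    "amazon.es": "ES",
    "amazon.co.jp": "JP",
}


def pick_proxy_country(url, proxy_country=""):
    """Use the explicit override when set, otherwise infer from the URL's domain."""
    if proxy_country:
        return proxy_country.upper()
    host = urlsplit(url).hostname or ""
    domain = host[4:] if host.startswith("www.") else host
    return MARKETPLACE_COUNTRY.get(domain)  # None for domains not in the table


print(pick_proxy_country("https://www.amazon.co.uk/s?k=phone"))        # GB
print(pick_proxy_country("https://www.amazon.co.uk/s?k=phone", "de"))  # DE
```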
If the Actor logs an Insufficient permissions proxy error, the run likely does not have access to Apify Proxy, often because of token permissions or a limited-permissions run. With use_proxy=true and proxy_required=false, the scraper tries the proxy first and then falls back to direct requests. Set proxy_required=true when direct requests are not acceptable.
Pricing and cost estimation
Costs depend on the number of pages requested, retries, concurrency, proxy usage, page size, and Amazon response stability. This Actor uses HTTP requests rather than browser automation, so it is generally more compute-efficient than Playwright or Puppeteer-based scraping.
Main cost drivers:
- More keywords, countries, and pages increase the total number of requested pages.
- Higher retries can improve completion rate but increase runtime.
- Residential proxies may add proxy usage cost but are recommended for Amazon stability.
- Higher concurrency may finish faster, but overly aggressive concurrency can produce more unstable Amazon responses.
- Large result pages increase dataset size because every page contains a nested `products` array.
For testing, start with one keyword, one country, number_of_pages: 1, max_retries: 5, and max_concurrency: 1 or 2. Once the output looks stable, increase the scope gradually.
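Before scaling up, it can help to estimate how many page records a run will produce. The sketch below also multiplies that count by the listed "from $25.00 / 1,000 results" rate. Treating one page record as one billed result is an assumption, and the figure ignores proxy traffic, retries, and platform usage, so read it as a rough floor rather than a quote. `estimate_run` is a hypothetical helper.

```python
def estimate_run(keywords, countries, number_of_pages, price_per_1000=25.00):
    """Return (page_records, estimated_cost_usd) for a keyword-based run."""
    page_records = keywords * countries * number_of_pages
    # Assumption: one dataset item (page record) counts as one billed result.
    cost = round(page_records * price_per_1000 / 1000, 4)
    return page_records, cost


print(estimate_run(keywords=2, countries=3, number_of_pages=2))  # (12, 0.3)
```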
Local development
The Actor can also be run locally for smoke testing.
Install dependencies:
```bash
pip install -r requirements.txt
```
Run a local keyword scrape:
```bash
python -m src --keyword "phone" --country IT --max-pages 1 --output amazon_pages.json
```
Run multiple keywords and countries:
```bash
python -m src \
  --keyword "phone" \
  --keyword "3d printer" \
  --country IT \
  --country DE \
  --max-pages 2 \
  --sort-by price_low_to_high \
  --min-price 10 \
  --max-price 300 \
  --output amazon_pages.json
```
Run from an existing Amazon search URL:
```bash
python -m src \
  --url "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031" \
  --max-pages 2 \
  --output amazon_pages.json
```
Local mode writes a JSON file to the path passed with --output.
Tips and best practices
- Use `start_urls` when you need exact Amazon filters that are difficult to reproduce with the input fields.
- Keep `max_concurrency` modest for Amazon. Values between `1` and `3` are usually a good starting point.
- Enable `use_proxy` for more marketplace-consistent results.
- Leave `proxy_country` empty when scraping mixed marketplaces so the Actor can pick the country from each Amazon domain.
- Set `proxy_country` manually when you need all requests to come from a single country.
- Check `blocked_or_captcha_detected` before trusting a page with zero products.
- Check `captured_request.proxy_used` and `requested_proxy_country` when results differ from your browser.
- Use `result_count_estimate` as an estimate only. Amazon's displayed result counts are often rounded, localized, or approximate.
- Do not assume `position` equals absolute Amazon ranking across all pages. It is the parsed position among product cards on the current result page.
- Sponsored products, carousels, editorial widgets, and layout variants may affect visible position and product counts.
- For recurring monitoring, use Apify schedules and compare datasets over time.
Troubleshooting
The dataset has no products for a page
Check blocked_or_captcha_detected, captured_request.status, and final_url. Amazon may have returned a CAPTCHA, an empty result page, a redirect, or a layout variant. Try enabling proxy, lowering concurrency, increasing retries, or using a more specific marketplace URL.
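These checks are easy to automate when post-processing a dataset. The sketch below triages a page record using only the documented output fields. `diagnose_empty_page`, the order of the checks, and the wording of the returned labels are all assumptions.

```python
def diagnose_empty_page(page):
    """Return a short reason why a page record has no products, or None if it has some."""
    if page.get("products"):
        return None
    request = page.get("captured_request") or {}
    if page.get("blocked_or_captcha_detected"):
        return "blocked or CAPTCHA"
    if request.get("status") != 200:
        return f"HTTP status {request.get('status')}"
    if page.get("final_url") != page.get("source_url"):
        return "redirected"
    return "empty result page or unrecognized layout"


blocked = {
    "products": [],
    "blocked_or_captcha_detected": True,
    "captured_request": {"status": 200},
}
print(diagnose_empty_page(blocked))  # blocked or CAPTCHA
```

Pages flagged as blocked or returning a non-200 status are the ones most likely to benefit from a proxy, lower concurrency, or more retries.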
Results differ from my browser
Amazon personalizes search pages by country, language, cookies, delivery location, Prime state, availability, and A/B tests. Compare marketplace, requested_proxy_country, filters, and final_url. If you need a specific country view, use the matching marketplace and proxy country.
Proxy fails with insufficient permissions
Your Apify run may not have permission to use Apify Proxy. With proxy_required=false, the Actor falls back to direct requests. With proxy_required=true, requests are skipped when proxy configuration is unavailable.
Amazon filters do not behave as expected
Some filters are marketplace-specific. Build the filter manually on Amazon, copy the resulting URL, and use it in start_urls. This preserves Amazon's own filter query.
Prices are missing for some products
Amazon search result cards do not always show prices. Missing prices can happen when products are unavailable, have multiple offers, require variant selection, are sponsored widgets with reduced markup, or are rendered differently for the request location.
Sponsored detection is not perfect
The Actor detects common sponsored labels in multiple languages and Amazon markup variants, but Amazon changes labels and layouts frequently. Use is_sponsored, sponsored_count, and organic_count as strong practical signals, not legal-grade classification.
FAQ
Does this Actor scrape product detail pages?
No. This Actor scrapes Amazon search/result pages and returns product cards from those pages. For deep product detail data such as full bullet points, descriptions, seller details, variations, and buy box diagnostics, use a product detail page scraper.
Does it scrape reviews?
No. It extracts rating and review count when visible in the search result card, but it does not open review pages or collect individual reviews.
Does it use a browser?
No. It uses HTTP requests through httpx and parses HTML with selectolax. This keeps runs lightweight and fast, but JavaScript-only page states may not be available.
Can I scrape multiple countries in one run?
Yes. Pass multiple country codes in countries. The Actor creates one request for every keyword, country, and page combination.
Can I scrape pages 2, 3, and 4 from an existing URL?
Yes. Put the filtered Amazon URL in start_urls and set number_of_pages to 3. If the URL already contains page=2, the Actor starts from page 2 and expands to pages 2, 3, and 4.
Can I pass raw Amazon sort values?
The input schema exposes supported sort_by options. For advanced or marketplace-specific sort behavior, use start_urls with the exact Amazon URL or pass query parameters through extra_query.
Is it legal to scrape Amazon?
Scraping publicly available web pages may be legal in many contexts, but you are responsible for how you use this Actor. Review Amazon's Terms of Service, applicable laws, privacy rules, and any contractual obligations before scraping or storing data. Do not scrape personal or sensitive data unless you have a lawful basis and permission where required.
Where can I report problems or request changes?
Use the Actor's Issues tab on Apify to report bugs, missing fields, marketplace-specific issues, or feature requests. Include the input, marketplace, URL, and a small sample of the output whenever possible.


