AmazonPageScraper

Pricing

from $25.00 / 1,000 results

The best Amazon page scraper! Use this scraper to get all the products on an Amazon page!

Developer

Lorenzo Cerqua

Maintained by Community

Amazon Pages Search Scraper

What does Amazon Pages Search Scraper do?

Amazon Pages Search Scraper extracts structured data from Amazon search result pages across multiple marketplaces, keywords, pages, sort modes, departments, price ranges, and raw Amazon filters. It returns one clean JSON record per Amazon search/result page, including the products shown on that page, sponsored vs. organic counts, pagination details, breadcrumbs, related searches, meta tags, marketplace information, and request diagnostics.

Use it when you need to monitor Amazon SERPs, research competitors, compare product visibility across countries, collect search-page product lists, analyze sponsored placements, or build repeatable Amazon keyword research workflows on Apify.

The Actor uses lightweight HTTP requests instead of browser automation, making it fast and cost-efficient for search page extraction while still returning rich product-level data from each results page.

Why use Amazon Pages Search Scraper?

Amazon search pages change depending on country, language, search term, department, filters, sort order, price range, proxy location, and sponsored inventory. This Actor is built to make those variables explicit in the output, so you can understand what Amazon returned for each requested page.

Use it for:

  • Amazon keyword research by scraping search result pages for one or many keywords.
  • Marketplace comparison across Amazon domains such as amazon.it, amazon.com, amazon.de, amazon.fr, amazon.co.uk, amazon.es, amazon.co.jp, and more.
  • SERP monitoring for ranking position, sponsored placement, Prime visibility, badges, coupons, prices, ratings, and review counts.
  • Competitor discovery by collecting products shown for commercial search terms.
  • Sponsored vs. organic analysis using sponsored_count, organic_count, and product-level is_sponsored.
  • Price-filtered research with Amazon's low-price and high-price query parameters.
  • Advanced Amazon filtering by passing raw rh filters or marketplace-specific p_* query parameters copied from Amazon URLs.
  • Data pipelines that need repeatable JSON output through Apify API, schedules, webhooks, integrations, or dataset exports.

How to use Amazon Pages Search Scraper

  1. Open the Actor in Apify Console.
  2. Choose whether to scrape by search terms or by existing Amazon search/result URLs.
  3. Add one or more keywords in search_terms, for example phone, 3d printer, or wireless earbuds.
  4. Select one or more Amazon marketplaces in countries, for example IT, US, DE, or GB.
  5. Set number_of_pages to control how many result pages are requested per keyword and country.
  6. Optionally choose a sort mode, department, price range, raw rh filter, or extra query parameters.
  7. Enable use_proxy if you want the Actor to try Apify Residential Proxy for Amazon requests.
  8. Start the Actor.
  9. Download the dataset as JSON, CSV, Excel, HTML, or another Apify-supported format.

You can also provide existing Amazon result URLs through start_urls. This is useful when you already built the exact Amazon filter combination in your browser and want the Actor to scrape it repeatedly.
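
For repeatable runs outside the Console, the same input can be sent through the Apify API. The sketch below uses the official apify-client Python package. The helper names build_run_input and run_actor are hypothetical, and the Actor ID and the APIFY_TOKEN environment variable are placeholders that this README does not specify.

```python
import os


def build_run_input(search_terms, countries, number_of_pages=1, use_proxy=False):
    """Assemble an input object using the field names from the Input section."""
    return {
        "search_terms": list(search_terms),
        "countries": list(countries),
        "number_of_pages": number_of_pages,
        "use_proxy": use_proxy,
    }


def run_actor(actor_id, run_input):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported here so build_run_input() works without the client installed.
    from apify_client import ApifyClient

    # Assumption: the API token is provided through an environment variable.
    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling run_actor("<username>/<actor-name>", build_run_input(["phone"], ["IT", "US"], 2, True)) with the real Actor ID from the Actor page should then return one dictionary per scraped page record.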

Input

Configure the Actor from the Input tab.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| start_urls | Array of URLs | No | Empty | Existing Amazon search/result page URLs. If provided, these take priority over generated keyword searches. Each URL is expanded using number_of_pages. |
| search_terms | Array of strings | No | ["phone"] | Keywords to search on Amazon. One page record is produced for every keyword, country, and page combination. |
| countries | Array of strings | No | ["IT"] | Amazon marketplaces by ISO country code. Supported values are listed below. |
| number_of_pages | Integer | No | 1 | Number of Amazon result pages to request for each keyword/country combination or each start URL. Minimum 1, maximum 20 in the Apify input schema. |
| sort_by | String | No | relevance | Search sort option. Supported values: relevance, featured, price_low_to_high, price_high_to_low, average_customer_review, newest_arrivals, best_sellers. |
| department | String | No | Empty | Optional Amazon search index, for example electronics, computers, beauty, fashion, or grocery. Availability varies by marketplace. |
| min_price | String | No | Empty | Optional Amazon low-price query value, for example 10 or 10.50. |
| max_price | String | No | Empty | Optional Amazon high-price query value, for example 100 or 100.00. |
| rh | String | No | Empty | Raw Amazon rh filter copied from a filtered Amazon URL. Useful for marketplace-specific filters. |
| extra_query | Object | No | {} | Extra URL query parameters added to generated search URLs, for example {"p_72":"1318476031"}. |
| use_proxy | Boolean | No | false | Tries to use Apify Residential Proxy. If proxy permission is unavailable, the Actor continues without proxy unless proxy_required is enabled. |
| proxy_required | Boolean | No | false | Skips requests when Apify proxy is unavailable. Enable only when you prefer no data over direct requests. |
| proxy_country | String | No | Empty | Optional proxy country override. Leave empty to auto-select from each Amazon marketplace country. |
| max_retries | Integer | No | 5 | Maximum retry attempts per Amazon page. |
| max_concurrency | Integer | No | 3 | Number of Amazon pages processed in parallel. Lower values are gentler and often more stable for Amazon. |

Supported Amazon marketplaces

| Country code | Marketplace | Currency | Language hint |
| --- | --- | --- | --- |
| US | amazon.com | USD | English, United States |
| CA | amazon.ca | CAD | English/French, Canada |
| MX | amazon.com.mx | MXN | Spanish, Mexico |
| BR | amazon.com.br | BRL | Portuguese, Brazil |
| GB | amazon.co.uk | GBP | English, United Kingdom |
| DE | amazon.de | EUR | German |
| FR | amazon.fr | EUR | French |
| IT | amazon.it | EUR | Italian |
| ES | amazon.es | EUR | Spanish, Spain |
| NL | amazon.nl | EUR | Dutch |
| SE | amazon.se | SEK | Swedish |
| PL | amazon.pl | PLN | Polish |
| TR | amazon.com.tr | TRY | Turkish |
| AE | amazon.ae | AED | English/Arabic, UAE |
| SA | amazon.sa | SAR | Arabic/English, Saudi Arabia |
| EG | amazon.eg | EGP | Arabic/English, Egypt |
| IN | amazon.in | INR | English/Hindi, India |
| JP | amazon.co.jp | JPY | Japanese |
| SG | amazon.sg | SGD | English, Singapore |
| AU | amazon.com.au | AUD | English, Australia |

Example inputs

Keyword search across multiple countries

{
  "search_terms": ["phone", "3d printer"],
  "countries": ["IT", "US", "DE"],
  "number_of_pages": 2,
  "sort_by": "price_low_to_high",
  "min_price": "10",
  "max_price": "300",
  "use_proxy": true,
  "proxy_required": false,
  "max_retries": 5,
  "max_concurrency": 3
}

This input requests:

  • phone on Amazon Italy, United States, and Germany, pages 1-2.
  • 3d printer on Amazon Italy, United States, and Germany, pages 1-2.
  • A total of 2 keywords x 3 countries x 2 pages = 12 page records.
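
The arithmetic above can be checked before starting a run. This is a small sketch using only the standard library; planned_pages is a hypothetical helper name, not part of the Actor.

```python
from itertools import product


def planned_pages(search_terms, countries, number_of_pages):
    """Return every (keyword, country, page) combination the input would request."""
    pages = range(1, number_of_pages + 1)
    return list(product(search_terms, countries, pages))


combos = planned_pages(["phone", "3d printer"], ["IT", "US", "DE"], 2)
print(len(combos))  # 12 = 2 keywords x 3 countries x 2 pages
print(combos[0])    # ('phone', 'IT', 1)
```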

Existing filtered Amazon URL

{
  "start_urls": [
    {
      "url": "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031&s=review-rank"
    }
  ],
  "number_of_pages": 3,
  "use_proxy": true
}

When start_urls is provided, the Actor keeps the query parameters from the URL and expands only the page parameter. The example above requests pages 1-3 of the same filtered search.
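
One way to picture the page expansion is sketched below with urllib.parse. It is an illustration, not the Actor's actual code: expand_pages is a hypothetical helper, and it assumes expansion starts from the URL's own page value when one is present, as the FAQ describes. Duplicate query keys are collapsed by the dict conversion, which is a simplification.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def expand_pages(url, number_of_pages):
    """Return one URL per requested page, changing only the `page` parameter."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    first_page = int(params.get("page", 1))
    urls = []
    for page in range(first_page, first_page + number_of_pages):
        query = urlencode({**params, "page": page})
        urls.append(urlunsplit(parts._replace(query=query)))
    return urls


for u in expand_pages(
    "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031&s=review-rank", 3
):
    print(u)
```

All three printed URLs keep k, rh, and s unchanged and differ only in page=1, page=2, and page=3.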

Generated search with raw Amazon filters

{
  "search_terms": ["wireless earbuds"],
  "countries": ["US"],
  "number_of_pages": 1,
  "sort_by": "average_customer_review",
  "rh": "p_72:1248879011",
  "extra_query": {
    "p_36": "1253503011"
  }
}

Use rh and extra_query when Amazon exposes useful filters as marketplace-specific IDs. The easiest workflow is to apply filters manually on Amazon, copy the resulting URL, and reuse its rh or p_* parameters.
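
The copy-and-reuse workflow can be partly automated. The sketch below pulls rh and any p_* parameters out of a copied Amazon URL so they can be pasted into the input. filters_from_url is a hypothetical helper, and treating every key that starts with p_ as a filter is an assumption.

```python
from urllib.parse import parse_qsl, urlsplit


def filters_from_url(url):
    """Split a copied Amazon search URL into `rh` and `extra_query` input values."""
    params = dict(parse_qsl(urlsplit(url).query))
    result = {}
    if "rh" in params:
        result["rh"] = params["rh"]  # already percent-decoded by parse_qsl
    extra = {k: v for k, v in params.items() if k.startswith("p_")}
    if extra:
        result["extra_query"] = extra
    return result


print(filters_from_url(
    "https://www.amazon.com/s?k=wireless+earbuds&rh=p_72%3A1248879011&p_36=1253503011"
))
# {'rh': 'p_72:1248879011', 'extra_query': {'p_36': '1253503011'}}
```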

Output

The Actor stores one JSON object per scraped Amazon search/result page in the default Apify dataset. Each page item contains page-level metadata and a nested products array.

Simplified output example:

{
  "scraped_at": "2026-06-17T10:30:00.000000+00:00",
  "country": "IT",
  "marketplace": "amazon.it",
  "currency": "EUR",
  "source_url": "https://www.amazon.it/s?k=phone",
  "final_url": "https://www.amazon.it/s?k=phone",
  "blocked_or_captcha_detected": false,
  "search_term": "phone",
  "page_number": 1,
  "requested_pages": 2,
  "sort_by": "relevance",
  "filters": {
    "search_term": "phone",
    "page": 1,
    "raw_query": {
      "k": "phone"
    }
  },
  "title": "Amazon.it : phone",
  "result_count_text": "1-48 of over 10,000 results for phone",
  "result_count_estimate": 10000,
  "products_count": 48,
  "sponsored_count": 8,
  "organic_count": 40,
  "products": [
    {
      "position": 1,
      "asin": "B0EXAMPLE1",
      "title": "Example Smartphone 128GB",
      "url": "https://www.amazon.it/dp/B0EXAMPLE1",
      "image": {
        "url": "https://m.media-amazon.com/images/I/example.jpg",
        "alt": "Example Smartphone 128GB"
      },
      "price": {
        "current": "199,99 EUR",
        "raw": "199,99 EUR",
        "symbol": "EUR"
      },
      "rating": {
        "text": "4.5 out of 5 stars",
        "value": 4.5
      },
      "reviews": {
        "text": "1,234",
        "count": 1234
      },
      "badges": ["Amazon's Choice"],
      "coupon": "Save 10%",
      "delivery": {
        "messages": ["FREE delivery Tomorrow with Prime"]
      },
      "is_sponsored": true,
      "is_prime": true,
      "is_amazon_choice": true,
      "is_best_seller": false
    }
  ],
  "pagination": {
    "current_page": 1,
    "known_pages": [1, 2, 3],
    "last_known_page": 3,
    "next_page_url": "https://www.amazon.it/s?k=phone&page=2"
  },
  "breadcrumbs": ["Electronics", "Mobile Phones"],
  "related_searches": ["phone case", "smartphone", "iphone"],
  "meta": {
    "description": "Amazon search results"
  },
  "captured_request": {
    "body_length": 523841,
    "status": 200,
    "content_type": "text/html",
    "proxy_used": true
  },
  "requested_proxy_country": "IT"
}

Page-level data fields

| Field | Description |
| --- | --- |
| scraped_at | UTC timestamp when the page was scraped. |
| country | Amazon marketplace country code inferred from input or URL. |
| marketplace | Amazon domain, such as amazon.it, amazon.com, or amazon.de. |
| currency | Expected marketplace currency. |
| source_url | URL requested by the Actor. |
| final_url | Final URL after redirects. |
| blocked_or_captcha_detected | Whether the returned HTML appears to be blocked or CAPTCHA-protected. |
| search_term | Keyword from input or parsed from the URL query parameter k. |
| page_number | Amazon result page number. |
| requested_pages | Number of pages requested for the keyword/country or start URL. |
| sort_by | Sort option requested in the Actor input when generated from keywords. |
| filters | Parsed URL filters such as keyword, department, sort, page, price range, rh, and raw query parameters. |
| title | HTML page title returned by Amazon. |
| result_count_text | Raw result count text shown by Amazon when available. |
| result_count_estimate | Best numeric estimate parsed from result_count_text. |
| products_count | Number of product cards parsed from the page. |
| sponsored_count | Number of parsed products detected as sponsored. |
| organic_count | Number of parsed products not detected as sponsored. |
| products | Product cards extracted from the search page. |
| pagination | Current page, visible page numbers, last known page, and next page URL when visible. |
| breadcrumbs | Category or department breadcrumb text detected on the page. |
| related_searches | Related search suggestions detected on the page. |
| meta | Page meta tags. |
| captured_request | Request diagnostics such as response status, content type, body length, and whether proxy was used. |
| requested_proxy_country | Proxy country selected or requested for the page. |

Product-level data fields

Each page record contains a products array. Each product object may include:

| Field | Description |
| --- | --- |
| position | Product position among parsed result cards on that page. |
| asin | Amazon Standard Identification Number detected from the result card. |
| title | Product title shown in the search result. |
| url | Absolute Amazon product URL. |
| image | Product image URL, alt text, original srcset, and high-resolution candidates when available. |
| price.current | Best visible current price detected in the result card. |
| price.raw | Raw visible price text. |
| price.symbol | Currency symbol when available. |
| price.whole | Whole price component when Amazon renders split price markup. |
| price.fraction | Fractional price component when Amazon renders split price markup. |
| price.list_price | Crossed-out/list price when visible. |
| rating.text | Raw rating text. |
| rating.value | Numeric rating value parsed from the rating text. |
| reviews.text | Raw review count text. |
| reviews.count | Numeric review count. |
| badges | Badges such as Amazon's Choice, Best Seller, or deal labels when visible. |
| coupon | Coupon text when visible. |
| delivery.messages | Delivery, shipping, Prime, and arrival messages detected in the card. |
| is_sponsored | Whether the card appears sponsored. |
| is_prime | Whether Prime appears on the card. |
| is_amazon_choice | Whether Amazon's Choice appears on the card. |
| is_best_seller | Whether Best Seller/Bestseller appears on the card. |
| availability_text | Limited stock text when visible, such as "Only 3 left". |
| data_component_type | Amazon result card component type attribute. |
| data_uuid | Amazon card UUID attribute when available. |
| raw_classes | Raw HTML classes from the result card, useful for debugging layout variants. |
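
Because products are nested inside page records, analysis usually starts by flattening them. The following is a minimal sketch, assuming page records shaped like the simplified output example. flatten_products and sponsored_share are hypothetical helper names, the sample record (including B0EXAMPLE2) is made-up test data, and missing fields are tolerated because product fields are optional.

```python
def flatten_products(page):
    """Yield one flat row per product, carrying over a few page-level fields."""
    for product in page.get("products", []):
        yield {
            "marketplace": page.get("marketplace"),
            "search_term": page.get("search_term"),
            "page_number": page.get("page_number"),
            "position": product.get("position"),
            "asin": product.get("asin"),
            "price": (product.get("price") or {}).get("current"),
            "rating": (product.get("rating") or {}).get("value"),
            "is_sponsored": bool(product.get("is_sponsored")),
        }


def sponsored_share(page):
    """Fraction of parsed products flagged as sponsored, or None for empty pages."""
    total = page.get("products_count") or 0
    return page.get("sponsored_count", 0) / total if total else None


# Made-up sample record, trimmed to the fields used here.
page = {
    "marketplace": "amazon.it",
    "search_term": "phone",
    "page_number": 1,
    "products_count": 2,
    "sponsored_count": 1,
    "products": [
        {"position": 1, "asin": "B0EXAMPLE1", "price": {"current": "199,99 EUR"},
         "rating": {"value": 4.5}, "is_sponsored": True},
        {"position": 2, "asin": "B0EXAMPLE2", "is_sponsored": False},
    ],
}
rows = list(flatten_products(page))
print(rows[1]["price"])       # None: the second card has no price
print(sponsored_share(page))  # 0.5
```

Flat rows like these export cleanly to CSV, which nested product arrays do not.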

Sorting options

| sort_by value | Amazon query value | Meaning |
| --- | --- | --- |
| relevance | None | Amazon default relevance order. |
| featured | relevanceblender | Featured/relevance blend. |
| price_low_to_high | price-asc-rank | Sort by price ascending. |
| price_high_to_low | price-desc-rank | Sort by price descending. |
| average_customer_review | review-rank | Sort by average customer review. |
| newest_arrivals | date-desc-rank | Sort by newest arrivals. |
| best_sellers | exact-aware-popularity-rank | Sort by popularity/best sellers. |

Working with Amazon filters

Amazon uses a mix of portable query parameters and marketplace-specific filter IDs.

Portable filters generated by the Actor:

  • k: Search keyword.
  • page: Result page number.
  • s: Sort mode.
  • i: Department/search index.
  • low-price: Minimum price.
  • high-price: Maximum price.
  • rh: Raw filter expression copied from Amazon.

Marketplace-specific filters:

  • Amazon often uses parameters like p_72, p_36, p_n_feature_browse-bin, or encoded rh values.
  • These IDs can differ by marketplace, category, and language.
  • The safest approach is to apply the filters manually on Amazon, copy the final URL, and either use it as start_urls or pass the relevant values through rh and extra_query.

Example:

{
  "search_terms": ["filament pla"],
  "countries": ["IT"],
  "rh": "p_72:1318476031",
  "extra_query": {
    "p_36": "1631630031"
  }
}
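
To make the mapping from input fields to query parameters concrete, here is a rough sketch of how a generated search URL could be assembled from the portable parameters and the sort values listed under Sorting options. It is an illustration, not the Actor's code: build_search_url, the partial MARKETPLACES map, and the parameter order are assumptions.

```python
from urllib.parse import urlencode

# Subset of the supported marketplaces table.
MARKETPLACES = {"IT": "amazon.it", "US": "amazon.com", "DE": "amazon.de"}

# sort_by input value -> Amazon `s` query value (relevance sends no `s` at all).
SORT_VALUES = {
    "featured": "relevanceblender",
    "price_low_to_high": "price-asc-rank",
    "price_high_to_low": "price-desc-rank",
    "average_customer_review": "review-rank",
    "newest_arrivals": "date-desc-rank",
    "best_sellers": "exact-aware-popularity-rank",
}


def build_search_url(term, country, page=1, sort_by="relevance", department=None,
                     min_price=None, max_price=None, rh=None, extra_query=None):
    """Build an Amazon search URL from Actor-style input values."""
    params = {"k": term, "page": page}
    optional = {
        "s": SORT_VALUES.get(sort_by),
        "i": department,
        "low-price": min_price,
        "high-price": max_price,
        "rh": rh,
    }
    # Only send parameters that were actually set.
    params.update({k: v for k, v in optional.items() if v})
    params.update(extra_query or {})
    return f"https://www.{MARKETPLACES[country]}/s?{urlencode(params)}"


print(build_search_url("filament pla", "IT", rh="p_72:1318476031",
                       extra_query={"p_36": "1631630031"}))
# https://www.amazon.it/s?k=filament+pla&page=1&rh=p_72%3A1318476031&p_36=1631630031
```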

Proxy behavior

Amazon pages can vary significantly based on visitor country and can occasionally return blocked pages. For better stability, enable use_proxy and use Apify Residential Proxy when available.

When proxy_country is empty, the Actor chooses a proxy country from the Amazon marketplace:

  • amazon.it uses IT
  • amazon.com uses US
  • amazon.co.uk uses GB
  • amazon.de uses DE
  • amazon.fr uses FR
  • amazon.es uses ES
  • amazon.co.jp uses JP
  • And similarly for the other supported marketplaces

If the Actor logs an Insufficient permissions proxy error, the run likely does not have access to Apify Proxy, often because of token permissions or a limited-permissions run. With use_proxy=true and proxy_required=false, the scraper tries the proxy first and then falls back to direct requests. Set proxy_required=true when direct requests are not acceptable.
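
The proxy rules above can be summarized as a small decision function. This is a sketch of the described behavior, not the Actor's implementation: pick_proxy_country and request_mode are hypothetical names, the country map is only a subset, and the combination use_proxy=false with proxy_required=true is not described in this README, so the sketch treats it as a skipped request.

```python
# Subset of the marketplace -> proxy country mapping listed above.
DEFAULT_PROXY_COUNTRY = {
    "amazon.it": "IT", "amazon.com": "US", "amazon.co.uk": "GB", "amazon.de": "DE",
    "amazon.fr": "FR", "amazon.es": "ES", "amazon.co.jp": "JP",
}


def pick_proxy_country(marketplace, proxy_country=""):
    """An explicit proxy_country wins; otherwise derive it from the marketplace."""
    return proxy_country or DEFAULT_PROXY_COUNTRY.get(marketplace)


def request_mode(use_proxy, proxy_required, proxy_available):
    """Return 'proxy', 'direct', or 'skip' for a single page request."""
    if use_proxy and proxy_available:
        return "proxy"
    if proxy_required:
        return "skip"    # no data is preferred over a direct request
    return "direct"      # fall back to (or simply use) a direct request


print(pick_proxy_country("amazon.co.uk"))       # GB
print(pick_proxy_country("amazon.co.uk", "DE"))  # DE
print(request_mode(True, False, False))          # direct
print(request_mode(True, True, False))           # skip
```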

Pricing and cost estimation

Costs depend on the number of pages requested, retries, concurrency, proxy usage, page size, and Amazon response stability. This Actor uses HTTP requests rather than browser automation, so it is generally more compute-efficient than Playwright or Puppeteer-based scraping.

Main cost drivers:

  • More keywords, countries, and pages increase the total number of requested pages.
  • Higher retries can improve completion rate but increase runtime.
  • Residential proxies may add proxy usage cost but are recommended for Amazon stability.
  • Higher concurrency may finish faster, but overly aggressive concurrency can produce more unstable Amazon responses.
  • Large result pages increase dataset size because every page contains a nested products array.

For testing, start with one keyword, one country, number_of_pages: 1, max_retries: 5, and max_concurrency: 1 or 2. Once the output looks stable, increase the scope gradually.
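
As a rough planning aid, the listed price of "from $25.00 / 1,000 results" can be turned into an estimate. The sketch assumes that one result corresponds to one page record in the dataset and that the "from" price applies, neither of which this README confirms. Proxy and other platform usage are not included, and estimate_cost_usd is a hypothetical helper.

```python
def estimate_cost_usd(keywords, countries, pages, price_per_1000=25.00):
    """Estimate result-based cost for keywords x countries x pages page records."""
    records = keywords * countries * pages
    return records, round(records * price_per_1000 / 1000, 2)


print(estimate_cost_usd(2, 3, 2))    # (12, 0.3)
print(estimate_cost_usd(10, 2, 5))   # (100, 2.5)
```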

Local development

The Actor can also be run locally for smoke testing.

Install dependencies:

pip install -r requirements.txt

Run a local keyword scrape:

python -m src --keyword "phone" --country IT --max-pages 1 --output amazon_pages.json

Run multiple keywords and countries:

python -m src \
  --keyword "phone" \
  --keyword "3d printer" \
  --country IT \
  --country DE \
  --max-pages 2 \
  --sort-by price_low_to_high \
  --min-price 10 \
  --max-price 300 \
  --output amazon_pages.json

Run from an existing Amazon search URL:

python -m src \
  --url "https://www.amazon.it/s?k=filamento+pla&rh=p_72%3A1318476031" \
  --max-pages 2 \
  --output amazon_pages.json

Local mode writes a JSON file to the path passed with --output.

Tips and best practices

  • Use start_urls when you need exact Amazon filters that are difficult to reproduce with the Actor's input fields.
  • Keep max_concurrency modest for Amazon. Values between 1 and 3 are usually a good starting point.
  • Enable use_proxy for more marketplace-consistent results.
  • Leave proxy_country empty when scraping mixed marketplaces so the Actor can pick the country from each Amazon domain.
  • Set proxy_country manually when you need all requests to come from a single country.
  • Check blocked_or_captcha_detected before trusting a page with zero products.
  • Check captured_request.proxy_used and requested_proxy_country when results differ from your browser.
  • Use result_count_estimate as an estimate only. Amazon's displayed result counts are often rounded, localized, or approximate.
  • Do not assume position equals absolute Amazon ranking across all pages. It is the parsed position among product cards on the current result page.
  • Sponsored products, carousels, editorial widgets, and layout variants may affect visible position and product counts.
  • For recurring monitoring, use Apify schedules and compare datasets over time.
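
Several of these checks are easy to apply to every page record before using the data. The sketch below is one possible way to do it; page_issues is a hypothetical helper, the two sample records are made up, and treating any status other than 200 as suspicious is an assumption.

```python
def page_issues(page):
    """Return a list of reasons to distrust a page record (empty list = looks fine)."""
    issues = []
    if page.get("blocked_or_captcha_detected"):
        issues.append("blocked or CAPTCHA page")
    status = (page.get("captured_request") or {}).get("status")
    if status != 200:
        issues.append(f"unexpected HTTP status: {status}")
    if not page.get("products_count"):
        issues.append("no products parsed")
    return issues


# Made-up sample records, trimmed to the fields used here.
good = {"blocked_or_captcha_detected": False, "products_count": 48,
        "captured_request": {"status": 200}}
blocked = {"blocked_or_captcha_detected": True, "products_count": 0,
           "captured_request": {"status": 503}}
print(page_issues(good))     # []
print(page_issues(blocked))
```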

Troubleshooting

The dataset has no products for a page

Check blocked_or_captcha_detected, captured_request.status, and final_url. Amazon may have returned a CAPTCHA, an empty result page, a redirect, or a layout variant. Try enabling use_proxy, lowering max_concurrency, increasing max_retries, or passing a more specific marketplace URL through start_urls.

Results differ from my browser

Amazon personalizes search pages by country, language, cookies, delivery location, Prime state, availability, and A/B tests. Compare marketplace, requested_proxy_country, filters, and final_url. If you need a specific country view, use the matching marketplace and proxy country.

Proxy fails with insufficient permissions

Your Apify run may not have permission to use Apify Proxy. With proxy_required=false, the Actor falls back to direct requests. With proxy_required=true, requests are skipped when proxy configuration is unavailable.

Amazon filters do not behave as expected

Some filters are marketplace-specific. Build the filter manually on Amazon, copy the resulting URL, and use it in start_urls. This preserves Amazon's own filter query.

Prices are missing for some products

Amazon search result cards do not always show prices. Missing prices can happen when products are unavailable, have multiple offers, require variant selection, are sponsored widgets with reduced markup, or are rendered differently for the request location.

Sponsored detection looks inaccurate

The Actor detects common sponsored labels in multiple languages and Amazon markup variants, but Amazon changes labels and layouts frequently. Use is_sponsored, sponsored_count, and organic_count as strong practical signals, not as legal-grade classification.

FAQ

Does this Actor scrape product detail pages?

No. This Actor scrapes Amazon search/result pages and returns product cards from those pages. For deep product detail data such as full bullet points, descriptions, seller details, variations, and buy box diagnostics, use a product detail page scraper.

Does it scrape reviews?

No. It extracts rating and review count when visible in the search result card, but it does not open review pages or collect individual reviews.

Does it use a browser?

No. It uses HTTP requests through httpx and parses HTML with selectolax. This keeps runs lightweight and fast, but JavaScript-only page states may not be available.

Can I scrape multiple countries in one run?

Yes. Pass multiple country codes in countries. The Actor creates one request for every keyword, country, and page combination.

Can I scrape pages 2, 3, and 4 from an existing URL?

Yes. Put the filtered Amazon URL in start_urls and set number_of_pages to 3. If the URL already contains page=2, the Actor starts from page 2 and expands to pages 2, 3, and 4.

Can I pass raw Amazon sort values?

The input schema exposes supported sort_by options. For advanced or marketplace-specific sort behavior, use start_urls with the exact Amazon URL or pass query parameters through extra_query.

Is it legal to scrape Amazon search pages?

Scraping publicly available web pages may be legal in many contexts, but you are responsible for how you use this Actor. Review Amazon's Terms of Service, applicable laws, privacy rules, and any contractual obligations before scraping or storing data. Do not scrape personal or sensitive data unless you have a lawful basis and permission where required.

Where can I report problems or request changes?

Use the Actor's Issues tab on Apify to report bugs, missing fields, marketplace-specific issues, or feature requests. Include the input, marketplace, URL, and a small sample of the output whenever possible.