Amazon Scraper

Pricing

$19.00/month + usage

🛍️ Amazon Search Scraper — Collect real-time product data from Amazon by just entering keywords 🔎 or product URLs 🔗! Get title, price, ratings ⭐, stock info & images 🖼️ in clean structured format. Perfect for price tracking 💰, market research 📊 & competitor analysis 🚀.

Rating

5.0

(2)

Developer

Neuro Scraper

Maintained by Community

Actor stats

0

Bookmarked

72

Total users

2

Monthly active users

5 months ago

Last modified

🎯 Amazon Search Keywords and Products Scraper

Effortlessly extract structured product data from Amazon search results or directly from product URLs — including pricing, ratings, availability, and product metadata.


📖 Summary

This Apify Actor can extract data from Amazon in two ways:

  1. By providing search keywords (it collects all products listed in search results).
  2. By providing product URLs (it fetches details directly from each page).

Structured data is stored in the default Dataset.


💡 Use cases

  • 🛍️ E-commerce price monitoring and comparison
  • 📊 Market trend and keyword research
  • 🔍 Product catalog enrichment for Amazon listings
  • 🧠 Competitor intelligence automation

⚡ Quick Start (Apify Console)

  1. Open this Actor in the Apify Console.
  2. Click Run → Input tab.
  3. Paste JSON input such as:
{
  "queries": ["wireless earbuds", "gaming mouse"],
  "urls": ["https://www.amazon.com/dp/B0D1234XYZ"],
  "concurrency": 8
}
  4. Optionally configure a proxy (see 🌍 Proxy Configuration below).
  5. Click Run; the data will appear in the default Dataset.

⚡ Quick Start (CLI + API)

CLI

$ apify call <ACTOR_ID> --input-file input.json

Where input.json contains:

{
  "queries": ["laptop stand"],
  "urls": ["https://www.amazon.com/dp/B0D1234XYZ"],
  "concurrency": 5
}

API (Python)

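from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient('<APIFY_TOKEN>')

# Start the Actor and wait for the run to finish.
run = client.actor('username~amazon-search-scraper').call(run_input={
    'queries': ['mechanical keyboard'],
    'urls': ['https://www.amazon.com/dp/B0D5678ABC'],
    'concurrency': 8,
})

# ID of the dataset that holds the scraped items.
print(run['defaultDatasetId'])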
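
Once a run has finished, its items can be read back from the run's default Dataset. The following is a minimal sketch, assuming the `apify-client` package and its `dataset(...).iterate_items()` method. `fetch_items` is a hypothetical helper name, not part of this Actor; the client is passed in as an argument so the function can also be tried against a stub.

```python
from typing import Any, Dict, List


def fetch_items(client: Any, run: Dict[str, Any], limit: int = 100) -> List[Dict[str, Any]]:
    """Return up to `limit` items from the run's default dataset.

    `client` is expected to behave like apify_client.ApifyClient
    (assumption: it exposes dataset(id).iterate_items()).
    """
    dataset_id = run["defaultDatasetId"]
    items: List[Dict[str, Any]] = []
    for item in client.dataset(dataset_id).iterate_items():
        if len(items) >= limit:
            break
        items.append(item)
    return items
```

With the `client` and `run` objects from the example above, `fetch_items(client, run, limit=10)` would return the first ten product records as plain dictionaries.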

📝 Inputs

| 🔑 Name | 📝 Type | ❓ Required | ⚙️ Default | 📌 Example | 📝 Notes |
|---|---|---|---|---|---|
| queries | array / string | ❌ No | null | ["wireless earbuds"] | Amazon search keywords to collect listings |
| urls | array / string | ❌ No | null | ["https://www.amazon.com/dp/B0D1234XYZ"] | Direct product page URLs |
| concurrency | integer | ❌ No | 8 | 5 | Max concurrent product fetch tasks |
| proxyConfig | object | ⚙️ Optional | { "useApifyProxy": true } | { "useApifyProxy": true } | Configure proxy (see below) |

💡 Example: Paste into Console input editor:

{"urls": ["https://www.amazon.com/dp/B0D5678ABC"], "concurrency": 4}
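
Since `queries` and `urls` each accept either a string or an array, it can help to normalize them before starting a run. The sketch below is one way to do that. `build_run_input` is a hypothetical helper, and the rule that at least one of the two fields must be non-empty is an assumption drawn from the run instructions, not a documented validation rule of the Actor.

```python
from typing import Any, Dict, List, Union

StrOrList = Union[str, List[str], None]


def _as_list(value: StrOrList) -> List[str]:
    """Normalize a string, a list of strings, or None into a clean list."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    return [v.strip() for v in value if v and v.strip()]


def build_run_input(queries: StrOrList = None, urls: StrOrList = None,
                    concurrency: int = 8, use_apify_proxy: bool = True) -> Dict[str, Any]:
    """Build an input object using the field names from the Inputs table."""
    q, u = _as_list(queries), _as_list(urls)
    if not q and not u:
        # Assumption: a run without queries or urls has nothing to scrape.
        raise ValueError("Provide at least one search query or product URL")
    if concurrency < 1:
        raise ValueError("concurrency must be a positive integer")
    run_input: Dict[str, Any] = {
        "concurrency": concurrency,
        "proxyConfig": {"useApifyProxy": use_apify_proxy},
    }
    if q:
        run_input["queries"] = q
    if u:
        run_input["urls"] = u
    return run_input
```

The returned dictionary can be passed as `run_input` to the Python client or dumped to `input.json` for the CLI.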

⚙️ Configuration

| 🔑 Name | 📝 Type | ❓ Required | ⚙️ Default | 📌 Example | 📝 Notes |
|---|---|---|---|---|---|
| OUTPUT_FILE | string | ❌ No | amazon.search.result.json | output.json | Internal output file for backup |
| REQUEST_TIMEOUT | integer | ❌ No | 30 | 45 | Timeout in seconds per request |
| APIFY_TOKEN | string | ✅ Yes | (none) | <APIFY_TOKEN> | Required for Apify client/API use |
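
For scripts that wrap this Actor, the settings above can be loaded in one place. This is a sketch only: it assumes the three values are exposed as environment variables with the defaults listed in the table, and `Settings` and `load_settings` are hypothetical names. How the Actor itself reads them is not shown in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    output_file: str
    request_timeout: int
    apify_token: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (defaults to os.environ), applying table defaults."""
    env = os.environ if env is None else env
    token = env.get("APIFY_TOKEN", "")
    if not token:
        raise RuntimeError("APIFY_TOKEN is required")
    timeout = int(env.get("REQUEST_TIMEOUT", "30"))
    if timeout <= 0:
        raise ValueError("REQUEST_TIMEOUT must be a positive number of seconds")
    return Settings(
        output_file=env.get("OUTPUT_FILE", "amazon.search.result.json"),
        request_timeout=timeout,
        apify_token=token,
    )
```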

📤 Outputs

Results are stored in the default Dataset.

Example Output Item

{
"asin": "B0D1234XYZ",
"title": "Wireless Earbuds with Noise Cancellation",
"url": "https://www.amazon.com/dp/B0D1234XYZ",
"price": "$49.99",
"currency": "$",
"brand_name": "SoundMagic",
"availability": "In Stock",
"stars": 4.5,
"number_of_reviews_text": "1,234 ratings",
"categories": "Electronics > Audio > Headphones",
"images": ["https://m.media-amazon.com/images/I/xyz.jpg"]
}
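
Several fields in the example item are display strings (`price`, `number_of_reviews_text`, `categories`), so downstream analysis usually needs them converted to numbers and lists. The sketch below shows one way to do that. The helper names and the added keys (`price_amount`, `review_count`, `category_path`) are hypothetical, and the parsing assumes US-style formatting as in the example above.

```python
import re
from decimal import Decimal
from typing import Any, Dict, Optional


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Extract the numeric amount from a price string such as "$49.99"."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return Decimal(match.group(0).replace(",", ""))


def parse_review_count(text: Optional[str]) -> Optional[int]:
    """Extract the leading integer from text such as "1,234 ratings"."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    return int(match.group(0).replace(",", "")) if match else None


def normalize_item(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the item with numeric and list fields added."""
    out = dict(item)
    out["price_amount"] = parse_price(item.get("price"))
    out["review_count"] = parse_review_count(item.get("number_of_reviews_text"))
    out["category_path"] = [
        part.strip()
        for part in (item.get("categories") or "").split(">")
        if part.strip()
    ]
    return out
```

`Decimal` is used instead of `float` so that prices compare exactly when tracking changes between runs.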

🔑 Environment variables

| Name | Description |
|---|---|
| APIFY_TOKEN | Your Apify API token for running via CLI or client |
| HTTP_PROXY | (Optional) Custom HTTP proxy endpoint |
| HTTPS_PROXY | (Optional) Custom HTTPS proxy endpoint |

▶️ How to Run

Apify Console

  1. Go to Actor → Run.
  2. Paste JSON input containing either queries or urls.
  3. Enable proxy under Proxy tab (recommended).
  4. Click Start and monitor logs.

Apify CLI

$ apify call username~amazon-search-scraper --input-file input.json

Apify Client (Python)

See Quick Start (API) example above.


⏰ Scheduling & Webhooks

  • Use the Schedule tab in the Apify Console to run daily/weekly.
  • Add a Webhook under the Webhooks tab to trigger external automation (e.g., send results to Slack or Google Sheets).
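
On the receiving side of a webhook, the external service typically needs the Dataset ID of the finished run. The sketch below assumes Apify's default webhook payload, in which `eventType` names the event and `resource` holds the run object. If a custom payload template is configured, the field names will differ. `dataset_id_from_webhook` is a hypothetical helper.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: str) -> Optional[str]:
    """Return the default dataset ID for a succeeded run, else None.

    Assumes the default payload shape: {"eventType": ..., "resource": {...}}.
    """
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```

The returned ID can then be used to download the items and forward them to Slack, Google Sheets, or another destination.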

🐞 Logs & Troubleshooting

| Issue | Cause | Fix |
|---|---|---|
| Empty results | Amazon blocked request | Enable Apify Proxy or rotate proxies |
| Timeout errors | Network latency or blocking | Increase REQUEST_TIMEOUT or reduce concurrency |
| Missing product details | Page layout changed | Report issue or rerun after 24h |
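
The "reduce concurrency" fix can be automated from the calling side. The following sketch retries a run with halved concurrency and an exponentially growing pause whenever the run times out or returns nothing. `run_with_fallback` and its `run_once` callback are hypothetical, and the delay values are illustrative, not tuned recommendations.

```python
import time
from typing import Any, Callable, Dict, List


def run_with_fallback(run_once: Callable[[int], List[Dict[str, Any]]],
                      concurrency: int = 8,
                      max_attempts: int = 3,
                      base_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, Any]]:
    """Call run_once(concurrency) until it returns items or attempts run out."""
    for attempt in range(max_attempts):
        try:
            items = run_once(concurrency)
        except TimeoutError:
            items = []  # treat a timeout like an empty (blocked) result
        if items:
            return items
        if attempt < max_attempts - 1:
            # Back off: halve the concurrency and wait longer each time.
            concurrency = max(1, concurrency // 2)
            sleep(base_delay * 2 ** attempt)
    return []
```

Here `run_once` would wrap a call that starts the Actor with the given concurrency and returns the resulting Dataset items.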

🔒 Permissions & Storage

  • Uses the default Dataset for structured data.
  • Temporary files saved in Actor local storage.
  • Secure credentials (tokens, proxies) should be stored as Secrets in the Apify Console.

🆕 Changelog / Versioning

  • v1.1.0 — Added support for scraping from direct product URLs.
  • v1.0.0 — Initial public release.

📌 Notes / TODOs

  • TODO: Confirm supported Amazon domains (currently assumes amazon.com).
  • TODO: Add optional input for country_code or domain selection.

🌍 Proxy Configuration

Because this Actor sends requests to Amazon, proxy use is highly recommended.

  1. Open the Run page → Proxy tab.
  2. Check Use Apify Proxy.
  3. Select a proxy group (e.g., RESIDENTIAL or SHADER).

Custom Proxy Configuration

If you prefer your own proxy, go to Actor → Settings → Environment variables and set:

HTTP_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>
HTTPS_PROXY=http://<PROXY_USER>:<PROXY_PASS>@<HOST>:<PORT>

🔒 Always store proxy credentials securely as Secrets.
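
Proxy passwords often contain characters such as `@` or `:` that break the URL format shown above unless they are percent-encoded. The sketch below builds both variables with encoded credentials. `build_proxy_url` and `proxy_env` are hypothetical helpers, and the host and port in the usage note are placeholders.

```python
from typing import Dict
from urllib.parse import quote


def build_proxy_url(user: str, password: str, host: str, port: int,
                    scheme: str = "http") -> str:
    """Build a proxy URL, percent-encoding the credentials."""
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def proxy_env(user: str, password: str, host: str, port: int) -> Dict[str, str]:
    """Return the two environment variables described above."""
    url = build_proxy_url(user, password, host, port)
    return {"HTTP_PROXY": url, "HTTPS_PROXY": url}
```

For example, `build_proxy_url("user", "p@ss:word", "proxy.example.com", 8000)` yields `http://user:p%40ss%3Aword@proxy.example.com:8000`.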

TODO

Implement proxy rotation per request for improved anti-blocking resilience.


🧐 What I inferred from main.py

  • Actor collects Amazon product listings via search keywords and direct product URLs.
  • Network activity detected — proxy section included.
  • Outputs JSON list of structured product data.
  • Domain is assumed to be amazon.com — marked as TODO for domain parameterization.