🛒 Walmart Data Extractor

Pricing

from $4.99 / 1,000 results

🛒 Walmart Data Extractor pulls product details, pricing, ratings & availability from Walmart for fast market research. 📊 Automate leads, monitor competitors & track trends with reliable data. 🚀 Great for B2B insights & analytics.

Developer

Scraper Engine

Maintained by Community

Extract rich, structured product data from Walmart.com at scale. Feed it category pages, search pages, or product (/ip/) URLs — or just a keyword — and get back prices, images, brand, full specifications, ratings, seller info, and much more. Built for reliability with automatic proxy escalation, anti-bot browser impersonation, retries, and real-time dataset saving.

✨ Why Choose This Actor?

  • 🔗 Bulk URLs — mix category, search and product URLs in a single run.
  • 🛡️ Smart proxy escalation — starts direct, falls back to datacenter, then residential automatically, and sticks with residential once it has to.
  • 🧰 Anti-bot by design — uses impit browser impersonation (real TLS/HTTP fingerprints) instead of heavy headless browsers.
  • 💾 Live results — products stream into the output table as they're scraped, grouped by source section, so a mid-run stop never loses data.
  • Reviews & specs — opt into reviews and get full idml specifications.
  • 🧩 Customizable output — reshape every record with your own Python hooks.

🔑 Key Features

| Feature | Description |
|---|---|
| Category scraping | Auto-paginates browse/category pages |
| Search scraping | Search pages or a raw keyword |
| Product detail | Direct /ip/ URL extraction |
| Reviews | includeReviews / onlyReviews |
| Limits | maxItems (global) and endPage |
| Location | Best-effort zipCode targeting |
| Proxy | direct → datacenter → residential (sticky) |
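
The proxy escalation described above can be pictured with a small sketch. This is not the Actor's internal code: `ProxyEscalator` and the `send` callback are hypothetical names, and the sketch assumes that only the residential tier is sticky, since that is the only stickiness the description states.

```python
from typing import Callable, Optional

# Tiers in the order the feature list gives them.
TIERS = ["direct", "datacenter", "residential"]


class ProxyEscalator:
    """Hypothetical helper illustrating direct -> datacenter -> residential."""

    def __init__(self) -> None:
        # Becomes True the first time a request has to use residential.
        self.sticky_residential = False

    def fetch(self, url: str, send: Callable[[str, str], Optional[str]]) -> Optional[str]:
        """Call send(url, tier) tier by tier until one is not blocked.

        `send` stands in for the real HTTP call. It returns the page body,
        or None when the request was blocked.
        """
        tiers = TIERS[-1:] if self.sticky_residential else TIERS
        for tier in tiers:
            if tier == "residential":
                # Assumption: once residential is needed, later requests skip
                # the cheaper tiers entirely.
                self.sticky_residential = True
            body = send(url, tier)
            if body is not None:
                return body
        return None  # blocked on every tier that was tried
```

Whether the datacenter tier is also remembered between requests is not stated in this listing, so the sketch retries from direct until residential has been reached.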

📥 Input

{
  "startUrls": [
    { "url": "https://www.walmart.com/browse/auto-tires/brake-pads/91083_1074765_9038935_4670095_4582920" }
  ],
  "search": "laptop",
  "maxItems": 10,
  "endPage": null,
  "zipCode": "10001",
  "includeReviews": false,
  "onlyReviews": false,
  "proxy": { "useApifyProxy": false }
}

| Field | Type | Description |
|---|---|---|
| startUrls | array | Walmart category / search / product URLs (bulk). Required. |
| search | string | Keyword → converted to a search URL. |
| maxItems | integer | Cap on total products. Empty = no limit. |
| endPage | integer | Last category/search page to read. |
| zipCode | string | US ZIP for localized pricing/availability. |
| postalCode | integer | ⚠️ Deprecated; use zipCode. |
| includeReviews | boolean | Attach reviews to each product. |
| onlyReviews | boolean | Keep only reviews + identifiers. |
| extendOutputFunction | string | Python def extendOutputFunction(product) → dict merged in. |
| outputFilterFunction | string | Python def outputFilterFunction(product) → reshape/drop. |
| proxy | object | Proxy config. Default: no proxy (auto-escalates on block). |
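
As an illustration of the two Python hooks, here is a minimal sketch of what the source strings passed in extendOutputFunction and outputFilterFunction might contain. The field names come from the sample output on this page. The added keys (`isInStock`, `priceCents`) are made up for the example, and the convention that returning None drops a record is an assumption that this listing does not confirm.

```python
from typing import Any, Dict, Optional


def extendOutputFunction(product: Dict[str, Any]) -> Dict[str, Any]:
    """Return extra fields; per the input table they are merged into the record."""
    price = product.get("price")
    return {
        # Illustrative derived fields, not part of the Actor's own output.
        "isInStock": product.get("availabilityStatus") == "IN_STOCK",
        "priceCents": round(price * 100) if isinstance(price, (int, float)) else None,
    }


def outputFilterFunction(product: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Reshape a record to a few columns, or drop it."""
    if product.get("price") is None:
        # Assumption: None signals "drop this record".
        return None
    keep = ("name", "brand", "price", "usItemId", "productUrl")
    return {key: product.get(key) for key in keep}
```

Both input fields are typed as strings, so each function would be pasted into its field as source text.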

📤 Output

Each product is pushed as one dataset row with the full Walmart product object plus convenience columns for the table view:

{
  "name": "MAX Advanced Brakes - Brake Kit ...",
  "brand": "Max Advanced Brakes",
  "priceString": "$194.99",
  "price": 194.99,
  "availabilityStatus": "IN_STOCK",
  "usItemId": "1902495893",
  "productUrl": "https://www.walmart.com/ip/.../1902495893",
  "imageUrl": "https://i5.walmartimages.com/seo/...jpeg",
  "sourceSection": "browse_auto_tires",
  "sourceUrl": "https://www.walmart.com/browse/...",
  "priceInfo": { "currentPrice": { "price": 194.99, "priceString": "$194.99" } },
  "idml": { "specifications": { }, "longDescription": "..." },
  "reviews": null
}

A structured, per-section summary (mirroring results_by_url) is also written to the key-value store as OUTPUT.
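
If you export the dataset and want a similar per-section view locally, the rows can be grouped by sourceSection. The sketch below is only an approximation: the exact shape of the OUTPUT record is not documented here, so the summary keys (`count`, `minPrice`, `maxPrice`, `usItemIds`) and the function name are assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def summarize_by_section(rows: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Group exported dataset rows by their sourceSection column."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        # Rows without a section are collected under a placeholder key.
        grouped[row.get("sourceSection") or "unknown"].append(row)

    summary: Dict[str, Dict[str, Any]] = {}
    for section, items in grouped.items():
        prices = [r["price"] for r in items if isinstance(r.get("price"), (int, float))]
        summary[section] = {
            "count": len(items),
            "minPrice": min(prices) if prices else None,
            "maxPrice": max(prices) if prices else None,
            "usItemIds": [r.get("usItemId") for r in items],
        }
    return summary
```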

🚀 How to Use (Apify Console)

  1. Log in at https://console.apify.com and go to Actors.
  2. Open Walmart Data Extractor.
  3. Paste your Walmart URLs (or a keyword), set maxItems, and configure proxy.
  4. Click Start.
  5. Watch products stream into the run log and Output tab in real time.
  6. Export to JSON / CSV / XLSX when done.

🤖 Use via API

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://www.walmart.com/search?q=laptop"}],"maxItems":10}'
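
The same request can be made from Python with only the standard library. This is a minimal sketch of the curl call above, with the actor ID and token passed in by the caller. The helper names and the 300-second timeout are illustrative choices, not part of the Actor.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(actor_id: str, token: str, actor_input: Dict[str, Any], timeout: float = 300) -> List[Dict[str, Any]]:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_run_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (needs a real actor ID and token):
# items = run_actor("<ACTOR_ID>", os.environ["APIFY_TOKEN"],
#                   {"startUrls": [{"url": "https://www.walmart.com/search?q=laptop"}],
#                    "maxItems": 10})
```

Building the request separately from sending it makes it possible to inspect the URL and payload without making a network call.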

🎯 Best Use Cases

  • 💰 Price monitoring & repricing
  • 📊 Catalog & assortment analysis
  • 🔎 Competitor & market research
  • 🏷️ Brand / seller tracking

💳 Pricing

This actor uses the pay-per-event model. The primary event is row_result, charged once per product saved to the dataset. Platform startup is covered by the synthetic apify-actor-start event. You only pay for the products you actually receive.
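
For budgeting, a rough estimate follows from the per-row charge. The sketch below assumes the "from $4.99 / 1,000 results" rate shown on this listing and treats the start event fee as an unknown that defaults to zero. Actual prices may differ by plan, so check the pricing shown in Apify Console before relying on the numbers.

```python
from decimal import Decimal, ROUND_HALF_UP


def estimate_cost(
    rows: int,
    price_per_1000: Decimal = Decimal("4.99"),  # assumed from the listing's "from" price
    start_fee: Decimal = Decimal("0"),          # apify-actor-start fee is not stated here
) -> Decimal:
    """Estimate the charge in USD for a run that saves `rows` products."""
    if rows < 0:
        raise ValueError("rows must be non-negative")
    total = price_per_1000 * rows / 1000 + start_fee
    # Round to whole cents.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, a run that saves 1,000 products is estimated at $4.99.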

❓ FAQ

Which URLs are supported? Category/browse pages, search pages, and product (/ip/) pages.
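
To check a URL list before starting a run, the three supported kinds can be told apart by path. The sketch below is a client-side helper, not the Actor's own routing. The path prefixes are inferred from the example URLs on this page (/browse/, /search, /ip/), and the function names are hypothetical.

```python
from urllib.parse import quote_plus, urlparse


def classify_walmart_url(url: str) -> str:
    """Return 'product', 'search', 'category', or 'unsupported'."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("walmart.com", "www.walmart.com"):
        return "unsupported"
    path = parsed.path
    if path.startswith("/ip/"):
        return "product"
    if path == "/search" or path.startswith("/search/"):
        return "search"
    if path.startswith("/browse/"):
        return "category"
    return "unsupported"


def keyword_to_search_url(keyword: str) -> str:
    """Build a search URL in the same form as the API example on this page."""
    return "https://www.walmart.com/search?q=" + quote_plus(keyword.strip())
```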

Do I need a proxy? No. The actor runs direct by default and only escalates to datacenter then residential proxies if Walmart blocks the request.

Can I limit the run? Yes — use maxItems for a global cap and endPage to stop pagination early.
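
How the two limits interact can be sketched as a pagination loop. This is an illustration under assumptions rather than the Actor's code: `fetch_page` is a hypothetical callback that returns the products on a given page, pages are assumed to be numbered from 1, and an empty page is taken to mean the listing has ended.

```python
from typing import Any, Callable, Iterator, List, Optional


def paginate(
    fetch_page: Callable[[int], List[Any]],
    max_items: Optional[int] = None,
    end_page: Optional[int] = None,
) -> Iterator[Any]:
    """Yield products page by page until maxItems or endPage is reached."""
    emitted = 0
    page = 1
    while (end_page is None or page <= end_page) and (
        max_items is None or emitted < max_items
    ):
        products = fetch_page(page)
        if not products:
            break  # no more results
        # Take only as many products as the global cap still allows.
        remaining = None if max_items is None else max_items - emitted
        for product in products[:remaining]:
            yield product
            emitted += 1
        page += 1
```

Whichever limit is hit first ends the run, and no further page is requested once the cap has been reached.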

Why are some fields null? Walmart omits fields per product; reviews are only attached when includeReviews/onlyReviews is enabled.

  • Data is collected only from publicly available Walmart pages.
  • You are responsible for compliance with Walmart's ToS and applicable laws (GDPR, CCPA, etc.). Use reasonable rate limits and scrape responsibly.

🛟 Support & Feedback

Open an issue on the Actor's Issues tab with your run ID and input, and we'll take a look.