# Shopify Scraper Pro (`igview-owner/shopify-scraper`) Actor

Scrape Shopify products fast with pagination, collections, product detail, and ordered lists. Clean JSON output to Apify datasets for instant exports.

- **URL**: https://apify.com/igview-owner/shopify-scraper.md
- **Developed by:** [Sachin Kumar Yadav](https://apify.com/igview-owner) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 19 total users, 2 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $5.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
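
To make the pay-per-event arithmetic concrete, here is a minimal sketch that estimates a run's cost from the listed starting price of $5.00 per 1,000 results. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes one billed event per result with no other event types; the actual charge depends on the Actor's event definitions and your subscription plan.

```python
def estimate_cost_usd(results: int, price_per_1000: float = 5.00) -> float:
    """Estimate the cost of a run from the number of results it produces.

    Assumption: every result is billed at the same per-1,000 price.
    """
    if results < 0:
        raise ValueError("results must be non-negative")
    return round(results * price_per_1000 / 1000, 2)


print(estimate_cost_usd(1000))  # one thousand results at the listed price
print(estimate_cost_usd(250))   # a single full page of 250 products
```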

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Shopify Store Scraper (Apify) 🛍️⚡

Collect clean, structured product data from any public Shopify store URL. This Apify Actor fetches product listings with pagination and supports optional grouping by collection. Built for reliability, batching, and scale.


---

### Table of Contents 📚

- **[Features](#features)**
- **[How it works](#how-it-works)**
- **[Quick start](#quick-start)**
- **[Input configuration](#input-configuration)**
- **[Example inputs](#example-inputs)**
- **[Example outputs](#example-outputs)**
- **[Exports](#exports)**
- **[Best practices](#best-practices)**
- **[FAQ](#faq)**
- **[Changelog](#changelog)**

---

### Features ✨

| Capability             | What you get                                                            | Input switches                                  |
|------------------------|--------------------------------------------------------------------------|-------------------------------------------------|
| Store scrape           | Products from a Shopify store homepage URL                               | `store_enabled`, `shop_url`                     |
| Single page            | Fetch a single page per run                                              | `page`                                          |
| Group by collection    | Optionally group products by collection if supported                     | `group_by_collection`                           |
| Product detail         | Full details for specific product URLs                                   | `product_enabled`, `product_urls`               |
| Collection scrape      | Products from a single collection URL                                    | `collection_enabled`, `collection_url`          |
| Ordered products       | Best-sellers or other sort orders from store/collection                  | `ordered_enabled`, `ordered_scope`, `ordered_sort_type`, `ordered_collection_url` |
| High-throughput output | Buffered writes to Apify dataset for speed and stability                 | —                                               |

Use cases: price monitoring, assortment analysis, inventory snapshots, competitor research, and market intelligence.

---

### How it works 🧠

- **Resilient requests** with lightweight retries and key rotation.
- **Buffered writes** to the Apify default dataset for performance.
- **Simple inputs** to control store URL, pagination, and grouping.

---

### Quick start ⚡

1. Open the Actor and provide input JSON using the schema below.
2. Start the Actor. Results are written to the default dataset.

---

### Input configuration ⚙️

| Key                   | Type           | Default | Description                                                            |
|-----------------------|----------------|---------|------------------------------------------------------------------------|
| `shop_url`            | string         | —       | Shopify store homepage URL, e.g. `https://shop.flipperzero.one`        |
| `page`                | integer        | 1       | Page number to fetch                                                   |
| `group_by_collection` | boolean        | false   | Whether to group products by collection (if supported)                 |

---

### Example inputs 📥

- Store scrape (single page)

```json
{
  "shop_url": "https://shop.flipperzero.one",
  "page": 1,
  "group_by_collection": false
}
````

- Product details

```json
{
  "product_enabled": true,
  "product_urls": [
    "https://www.decathlon.com/products/simond-mt100-hooded-down-puffer-jacket-167571"
  ]
}
```

- Collection scrape

```json
{
  "collection_enabled": true,
  "collection_url": "https://alpineskin.com/collections/skin-care-products"
}
```

- Ordered products (best-selling in a store)

```json
{
  "ordered_enabled": true,
  "ordered_scope": "store",
  "ordered_sort_type": "best-selling",
  "shop_url": "https://lootcrate.com/"
}
```

***

### Example outputs 📤

- **Product item** (one object per product)

```json
{
  "source": "shopify",
  "page": 1,
  "id": 7728481206425,
  "title": "Flipper Zero Transparent",
  "handle": "flipper-zero-transparent",
  "body_html": "<p>...</p>",
  "published_at": "2025-10-10T18:00:49Z",
  "created_at": "2023-09-20T13:36:55Z",
  "updated_at": "2025-10-20T02:32:22Z",
  "vendor": "Flipper Devices",
  "product_type": "Flipper Zero",
  "tags": ["main", "sealed"],
  "variants": [
    {
      "id": 43400767864985,
      "title": "Default Title",
      "price": "199.00",
      "available": true
    }
  ],
  "images": [
    { "src": "https://cdn.shopify.com/.../file.png", "width": 1961, "height": 1960 }
  ],
  "options": [
    { "name": "Title", "values": ["Default Title"] }
  ]
}
```

- **Summary** (final item)

```json
{
  "success": true,
  "pages_fetched": 3,
  "total_pushed": 421,
  "fetched_at": "2025-01-01T12:34:56.000Z",
  "resultsMeta": { "shop_url": "https://shop.flipperzero.one", "pages_processed": [1, 2, 3] }
}
```
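
Because the summary object lands in the same dataset as the product items, downstream code usually needs to tell them apart. The sketch below is one way to do that; the helper name `split_items` is hypothetical, and it assumes the summary can be recognized by its `total_pushed` key, as in the example above.

```python
from typing import Any, Optional


def split_items(
    items: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], Optional[dict[str, Any]]]:
    """Separate product objects from the run summary.

    Assumption: only the summary item carries a `total_pushed` key.
    """
    products = [item for item in items if "total_pushed" not in item]
    summaries = [item for item in items if "total_pushed" in item]
    # The summary is described as the final item, so keep the last match.
    summary = summaries[-1] if summaries else None
    return products, summary


items = [
    {"source": "shopify", "id": 7728481206425, "title": "Flipper Zero Transparent"},
    {"success": True, "pages_fetched": 3, "total_pushed": 421},
]
products, summary = split_items(items)
print(len(products), summary["total_pushed"])
```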

***

### Exports 📦

- **Dataset**: All items are written to the default Apify dataset.
- Export to JSON, CSV, Excel, or XML via the Apify Console or API as needed.
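
For API-based exports, the dataset items endpoint of the Apify REST API accepts a `format` query parameter. The sketch below only builds the download URL; the helper name `dataset_export_url` and the dataset ID `abc123` are placeholders, and a private dataset additionally needs your API token (for example in an `Authorization: Bearer` header) when the URL is fetched.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Formats accepted by the dataset items endpoint, to the best of our knowledge.
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "html", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "csv") -> str:
    """Build the download URL for a dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode({'format': fmt})}"


# "abc123" is a placeholder; use the run's defaultDatasetId in practice.
print(dataset_export_url("abc123", "csv"))
```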

***

### Best practices 🧩

- **Start small**: test with a single page (`start_page = end_page = 1`).
- **Respect limits**: keep page ranges reasonable to avoid long runs.
- **Post-process**: map `variants` and `images` into your analytics schema after export.
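
As an illustration of the post-processing step, the following sketch flattens each product's `variants` into one row per variant, a shape that loads easily into a spreadsheet or SQL table. The function name `flatten_variants` and the output column names are hypothetical; the input fields are those shown in the example product item.

```python
from decimal import Decimal
from typing import Any


def flatten_variants(product: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat row per variant of a product item."""
    rows = []
    for variant in product.get("variants", []):
        price = variant.get("price")
        rows.append({
            "product_id": product.get("id"),
            "handle": product.get("handle"),
            "product_title": product.get("title"),
            "variant_id": variant.get("id"),
            "variant_title": variant.get("title"),
            # Prices arrive as strings; Decimal avoids float rounding.
            "price": Decimal(price) if price is not None else None,
            "available": variant.get("available"),
        })
    return rows


product = {
    "id": 7728481206425,
    "title": "Flipper Zero Transparent",
    "handle": "flipper-zero-transparent",
    "variants": [
        {"id": 43400767864985, "title": "Default Title", "price": "199.00", "available": True}
    ],
}
for row in flatten_variants(product):
    print(row)
```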

***

### FAQ ❓

- **Do I need a Shopify account?**
  - No. The Actor works on public store data.
- **Where do results go?**
  - The default Apify dataset for the run.
- **Can I filter by collection?**
  - Use `group_by_collection: true` to organize by collection if available, then filter after export.
- **Which regions are supported?**
  - Works globally for publicly accessible Shopify stores (subject to store availability and access rules).

***

### Changelog 🗓️

- v1.0.0: Initial release — store products, pagination, optional grouping by collection.

***

# Actor input Schema

## `shop_url` (type: `string`):

🔗 Shopify store homepage URL. Example: https://alpineskin.com

## `start_page` (type: `integer`):

🟢 First page number to fetch. Use 1 for the first page.

## `end_page` (type: `integer`):

🔚 Last page number to fetch (inclusive). Leave empty to scrape only the start page.

## `limit` (type: `integer`):

📦 Number of products to request per page (max 250 on most Shopify stores).

## `max_items` (type: `integer`):

🚦 Hard cap for total *products* across all pages. The Actor stops once this many products are saved. Leave empty for no global product limit.
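
The interaction between `start_page`, `end_page`, `limit`, and `max_items` can be sketched as a simple upper bound on the number of products a run may save. This is an illustration of the field descriptions above, not the Actor's actual code; the function name `max_products` is hypothetical, and stores with fewer products will return less.

```python
from typing import Optional


def max_products(
    start_page: int,
    limit: int,
    end_page: Optional[int] = None,
    max_items: Optional[int] = None,
) -> int:
    """Upper bound on products saved for a page range."""
    # An empty end_page means only the start page is scraped.
    last_page = start_page if end_page is None else end_page
    if last_page < start_page:
        raise ValueError("end_page must not be smaller than start_page")
    bound = (last_page - start_page + 1) * limit
    return bound if max_items is None else min(bound, max_items)


print(max_products(start_page=1, limit=250, end_page=3))  # 3 pages * 250 = 750
print(max_products(start_page=1, limit=250, end_page=3, max_items=421))  # capped at 421
```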

## `products_enabled` (type: `boolean`):

✅ When enabled, load products from the paginated products list. Turn this off to run only single product URLs (and optionally collections).

## `product_urls` (type: `array`):

🔗 Full product URLs, e.g. https://alpineskin.com/products/truecider-creamy-conditioner or collection product URLs. One URL per line.

## `collections_enabled` (type: `boolean`):

🗂️ When enabled, also save the store's collections.

## `collections_limit` (type: `integer`):

🔢 Limit how many collections will be stored. Leave empty to keep all collections.

## Actor input object example

```json
{
  "shop_url": "https://alpineskin.com",
  "start_page": 1,
  "limit": 50,
  "products_enabled": true
}
```
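
Before starting a paid run, it can help to check the input locally against the constraints in the schema (`shop_url` required, page and count fields at least 1, `limit` at most 250). The following is a minimal sketch; `validate_input` is a hypothetical helper, and the http(s) prefix check and the `end_page >= start_page` check are assumptions that the schema itself does not state.

```python
from typing import Any


def _is_positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 1


def validate_input(run_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors: list[str] = []

    shop_url = run_input.get("shop_url")
    # Assumption: a usable store URL starts with http:// or https://.
    if not isinstance(shop_url, str) or not shop_url.startswith(("http://", "https://")):
        errors.append("shop_url is required and must be an http(s) URL")

    for key in ("start_page", "end_page", "limit", "max_items", "collections_limit"):
        value = run_input.get(key)
        if value is not None and not _is_positive_int(value):
            errors.append(f"{key} must be an integer >= 1")

    limit = run_input.get("limit")
    if _is_positive_int(limit) and limit > 250:
        errors.append("limit must be <= 250")

    start, end = run_input.get("start_page"), run_input.get("end_page")
    # Assumption: an end page before the start page is a mistake.
    if _is_positive_int(start) and _is_positive_int(end) and end < start:
        errors.append("end_page must be >= start_page")

    return errors


print(validate_input({"shop_url": "https://alpineskin.com", "start_page": 1, "limit": 50}))
print(validate_input({"limit": 500, "start_page": 0}))
```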

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "shop_url": "https://alpineskin.com",
    "start_page": 1,
    "limit": 50,
    "products_enabled": true,
    "collections_enabled": false
};

// Run the Actor and wait for it to finish
const run = await client.actor("igview-owner/shopify-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "shop_url": "https://alpineskin.com",
    "start_page": 1,
    "limit": 50,
    "products_enabled": True,
    "collections_enabled": False,
}

# Run the Actor and wait for it to finish
run = client.actor("igview-owner/shopify-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "shop_url": "https://alpineskin.com",
  "start_page": 1,
  "limit": 50,
  "products_enabled": true,
  "collections_enabled": false
}' |
apify call igview-owner/shopify-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=igview-owner/shopify-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Shopify Scraper Pro",
        "description": "Scrape Shopify products fast with pagination, collections, product detail, and ordered lists. Clean JSON output to Apify datasets for instant exports.",
        "version": "1.0",
        "x-build-id": "ydjOamW7AvtBN03jf"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/igview-owner~shopify-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-igview-owner-shopify-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/igview-owner~shopify-scraper/runs": {
            "post": {
                "operationId": "runs-sync-igview-owner-shopify-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/igview-owner~shopify-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-igview-owner-shopify-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "shop_url"
                ],
                "properties": {
                    "shop_url": {
                        "title": "Shop URL",
                        "type": "string",
                        "description": "🔗 Shopify store homepage Example: https://alpineskin.com"
                    },
                    "start_page": {
                        "title": "Start page",
                        "minimum": 1,
                        "type": "integer",
                        "description": "🟢 First page number to fetch. Use 1 for the first page."
                    },
                    "end_page": {
                        "title": "End page (optional)",
                        "minimum": 1,
                        "type": "integer",
                        "description": "🔚 Last page number to fetch (inclusive). Leave empty to scrape only the start page."
                    },
                    "limit": {
                        "title": "Products per page (limit)",
                        "minimum": 1,
                        "maximum": 250,
                        "type": "integer",
                        "description": "📦 Number of products to request per page (max 250 on most Shopify stores)."
                    },
                    "max_items": {
                        "title": "Maximum products to fetch (optional)",
                        "minimum": 1,
                        "type": "integer",
                        "description": "🚦 Hard cap for total *products* across all pages. The actor stops once this many products are saved. Leave empty for no global product limit."
                    },
                    "products_enabled": {
                        "title": "Fetch products list",
                        "type": "boolean",
                        "description": "✅ When enabled, load products from the paginated products list. Turn this off to run only single product URLs (and optionally collections)."
                    },
                    "product_urls": {
                        "title": "Single product URLs",
                        "type": "array",
                        "description": "🔗 Full product URLs, e.g. https://alpineskin.com/products/truecider-creamy-conditioner or collection product URLs. One URL per line.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "collections_enabled": {
                        "title": "Also fetch collections",
                        "type": "boolean",
                        "description": "🗂️ When enabled, save all collections"
                    },
                    "collections_limit": {
                        "title": "Max collections to save (optional)",
                        "minimum": 1,
                        "type": "integer",
                        "description": "🔢 Limit how many collections will be stored. Leave empty to keep all collections."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
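
For languages without an official client, the `run-sync-get-dataset-items` path in the specification above can be called directly over HTTP. The sketch below uses only the Python standard library; the function names and the 300-second timeout are illustrative choices, and it assumes the endpoint returns the dataset items as a JSON array, which the specification does not spell out.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import urlencode

ENDPOINT = (
    "https://api.apify.com/v2/acts/igview-owner~shopify-scraper/"
    "run-sync-get-dataset-items"
)


def build_sync_run_request(token: str, run_input: dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string per the spec."""
    url = f"{ENDPOINT}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: dict[str, Any], timeout: float = 300) -> Any:
    """Send the request and decode the JSON response (performs a real, billable run)."""
    request = build_sync_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the API; only run_sync() does.
request = build_sync_run_request(
    "<YOUR_API_TOKEN>",
    {"shop_url": "https://alpineskin.com", "start_page": 1, "limit": 50},
)
print(request.get_method(), request.full_url)
```

Calling `run_sync` with a real token starts the Actor and blocks until it finishes, so long page ranges may need a larger timeout or the asynchronous `/runs` path instead.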
