# Di Scraper — Belgian Beauty, Makeup & Cosmetics Products (`studio-amba/di-scraper`) Actor

Scrape beauty products, prices, and brands from Di.be — Belgium's #1 makeup retailer. Get makeup, skincare, perfumes, and hair care products with full details.

- **URL**: https://apify.com/studio-amba/di-scraper.md
- **Developed by:** [Studio Amba](https://apify.com/studio-amba) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $5.00 / 1,000 results scraped

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
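
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates what a run might cost at the quoted starting price. It assumes the $5.00 per 1,000 results rate applies unchanged to your account; `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Starting price quoted above; assumed here to apply as-is to your plan.
PRICE_PER_1000_RESULTS_USD = Decimal("5.00")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Estimate the charge for a number of scraped results (hypothetical helper)."""
    if result_count < 0:
        raise ValueError("result_count must not be negative")
    cost = PRICE_PER_1000_RESULTS_USD * result_count / 1000
    # Keep three decimals so that a single result (half a cent) is still visible.
    return cost.quantize(Decimal("0.001"))


if __name__ == "__main__":
    for count in (100, 1000, 10000):
        print(count, "results: about $", estimate_cost_usd(count))
```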

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Di Scraper

Pulls beauty product data from Di.be (formerly ICI PARIS XL's online presence) -- Belgium's largest makeup and skincare retailer. Uses Di's Algolia search index directly for fast, structured results.

### How it works

Rather than scraping HTML pages, this Actor queries Di.be's public Algolia search API (app ID `4VDIKKUIQ0`). The API returns clean, structured product data, so no browser is needed and there are no anti-bot concerns.

You can filter by brand, category, or search text. The Algolia index supports both Dutch (`di_be_nl`) and French (`di_be_fr`) catalogs.
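
To make the idea concrete, here is a minimal sketch of how such an Algolia search request could be assembled. It is not the Actor's actual code. The app ID and index names come from this README; the endpoint shape follows Algolia's public REST search API; the facet attribute names `brand` and `category` are assumptions, and the search API key (sent in the `X-Algolia-API-Key` header) is not documented here.

```python
import json

ALGOLIA_APP_ID = "4VDIKKUIQ0"  # app ID named in this README
INDEX_BY_LANGUAGE = {"nl": "di_be_nl", "fr": "di_be_fr"}  # index names from this README


def build_algolia_request(search_query="", brand=None, category=None,
                          language="fr", hits_per_page=100, page=0):
    """Return (url, json_body) for an Algolia search. Nothing is sent."""
    index = INDEX_BY_LANGUAGE[language]
    url = f"https://{ALGOLIA_APP_ID}-dsn.algolia.net/1/indexes/{index}/query"

    # A flat list of facet filters is combined with AND by Algolia.
    facet_filters = []
    if brand:
        facet_filters.append(f"brand:{brand}")        # assumed facet name
    if category:
        facet_filters.append(f"category:{category}")  # assumed facet name

    body = {"query": search_query, "hitsPerPage": hits_per_page, "page": page}
    if facet_filters:
        body["facetFilters"] = facet_filters
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_algolia_request("mascara", brand="MAC", language="nl")
    print(url)
    print(body)
```

Paging through results would then mean repeating the request with an increasing `page` value until enough hits have been collected.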

### Input

| Field | Type | Required | Description |
|-------|------|----------|-------------|
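| `searchQuery` | String | No | Keyword search, e.g. `"mascara"`, `"Chanel"`. Leave empty to get all products |
| `category` | String | No | Category name filter (case-sensitive) |
| `brand` | String | No | Brand name filter (case-sensitive, e.g. `"MAC"`, `"Dior"`) |
| `maxResults` | Integer | No | Maximum number of products to return, 1 to 10,000 (default: 100) |
| `language` | String | No | `nl` or `fr` (default: `fr`) |
| `proxyConfiguration` | Object | No | Proxy settings. Usually not needed for Di.be |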
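
The fields above can be checked before a run is started. The following sketch builds an input object and rejects values outside the documented limits (`maxResults` between 1 and 10,000, `language` either `nl` or `fr`). `build_run_input` is a hypothetical helper name, not part of the Apify client.

```python
ALLOWED_LANGUAGES = ("nl", "fr")
MAX_RESULTS_LIMIT = 10000  # upper bound from the Actor's input schema


def build_run_input(search_query=None, category=None, brand=None,
                    max_results=100, language="fr"):
    """Build a validated Actor input dict (hypothetical helper)."""
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("maxResults must be an integer")
    if not 1 <= max_results <= MAX_RESULTS_LIMIT:
        raise ValueError(f"maxResults must be between 1 and {MAX_RESULTS_LIMIT}")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError(f"language must be one of {ALLOWED_LANGUAGES}")

    run_input = {"maxResults": max_results, "language": language}
    # Optional filters are only included when a non-empty value is given.
    optional = (("searchQuery", search_query), ("category", category), ("brand", brand))
    for key, value in optional:
        if value:
            run_input[key] = value
    return run_input


if __name__ == "__main__":
    print(build_run_input(search_query="mascara", brand="MAC", max_results=5))
```

The resulting dict can be passed as `run_input` to the Python client call shown in the API section.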

### Output

| Field | Type | Example |
|-------|------|---------|
| `name` | String | `"Lash Sensational Sky High Mascara"` |
| `brand` | String | `"Maybelline"` |
| `price` | Number | `13.99` |
| `currency` | String | `"EUR"` |
| `productId` | String | `"MAY-SKY-001"` |
| `inStock` | Boolean | `true` |
| `imageUrl` | String | Product image |
| `category` | String | `"Ogen > Mascara"` |
| `categories` | Array | `["Ogen", "Mascara"]` |
| `color` | String | `"Black"` |
| `colors` | Array | `["Black", "Brown"]` |
| `productType` | String | `"mascara"` |
| `promoFlags` | Array | `["Nieuw"]` |
| `language` | String | `"fr"` |
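
For downstream processing, the output fields can be described with a typed record. The sketch below declares such a type from the table above and, as one possible use, picks the cheapest in-stock product per brand. `DiProduct` and `cheapest_in_stock_by_brand` are illustrative names, and all fields are treated as optional because the README does not state which ones are always present.

```python
from typing import Dict, Iterable, List, TypedDict


class DiProduct(TypedDict, total=False):
    """One dataset item, following the output table (all fields optional here)."""
    name: str
    brand: str
    price: float
    currency: str
    productId: str
    inStock: bool
    imageUrl: str
    category: str
    categories: List[str]
    color: str
    colors: List[str]
    productType: str
    promoFlags: List[str]
    language: str


def cheapest_in_stock_by_brand(items: Iterable[DiProduct]) -> Dict[str, DiProduct]:
    """Return the lowest-priced in-stock item for each brand."""
    cheapest: Dict[str, DiProduct] = {}
    for item in items:
        if not item.get("inStock") or item.get("price") is None:
            continue  # skip unavailable items and items without a price
        brand = item.get("brand", "")
        if brand not in cheapest or item["price"] < cheapest[brand]["price"]:
            cheapest[brand] = item
    return cheapest


if __name__ == "__main__":
    sample = [
        {"name": "A", "brand": "Maybelline", "price": 13.99, "inStock": True},
        {"name": "B", "brand": "Maybelline", "price": 9.99, "inStock": True},
    ]
    print(cheapest_in_stock_by_brand(sample))
```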

### Cost

The Actor makes only API calls, so compute usage is low: about **0.05 CU per 1,000 products**.

### Limitations

- No product descriptions or detailed specs (Algolia index only has listing-level data)
- The `category` and `brand` filters must match exact values used in Di's facets
- Belgian market only (di.be)

### How to scrape Di data

1. Go to this Actor's page on the [Apify Store](https://apify.com/store).
2. Click **Try for free** to open it in Apify Console.
3. Configure your search query and filters, set the maximum number of results, and adjust proxy settings only if needed.
4. Click **Start** and wait for the run to finish.
5. Download your data in JSON, CSV, Excel, or connect it to your workflow via API.

You can also schedule regular runs, set up webhooks for real-time notifications, or integrate the results directly into your application using the [Apify API](https://docs.apify.com/api).

### Tips and tricks

- **Start small**: test with `maxResults: 5` before running large scrapes.
- **Proxies are optional**: the Actor calls a public search API, so proxies are usually not needed for Di.be. Enable them only if requests start failing.
- **Schedule runs**: set up recurring runs to keep your data fresh automatically.
- **Integrate via API**: use the [Apify API](https://docs.apify.com/api) or [client libraries](https://docs.apify.com/api/client) to fetch results programmatically.
- **No login required**: this Actor scrapes publicly available data without needing an account.

### Features

- **No login required**: scrapes publicly available data from Di without needing credentials or cookies.
- **Structured output**: results are returned as clean JSON objects, ready for processing.
- **Pagination handling**: automatically follows multiple pages of results.
- **Proxy support**: optional proxy configuration, usually not needed for Di.be.
- **Flexible input**: search by keyword and filter by brand, category, or language.
- **Scheduled runs**: run on a schedule to keep your dataset up to date automatically.
- **API access**: integrate results into your workflow using the Apify API or webhooks.

### FAQ

**Is it legal to scrape Di?**
Web scraping of publicly available data is generally permitted. This Actor only accesses information that is publicly visible to any website visitor. Always review the website's terms of service before scraping.

**How often should I run this scraper?**
For price monitoring or competitive intelligence, daily or weekly runs are common. Set up a [schedule](https://docs.apify.com/schedules) in Apify Console to automate this.

**What if the scraper stops working?**
Websites change their structure occasionally. If you notice issues, please open an issue on the Actor's page. We actively maintain this scraper and fix issues promptly.

**Can I export the data to Excel or Google Sheets?**

Yes. Every run's dataset can be downloaded from the Apify Console as CSV, Excel, JSON, or XML, or pulled automatically via the Apify API and integrations (Google Sheets, Zapier, Make).
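
If you prefer to build the CSV yourself from items fetched through the API, a minimal sketch using only the standard library could look like this. The `items_to_csv` helper and the choice to join array fields with `|` are assumptions made for illustration, not behavior of the Apify export.

```python
import csv
import io


def _cell(value):
    """Flatten a field value into something that fits one CSV cell."""
    if isinstance(value, list):
        return "|".join(str(part) for part in value)  # arrays such as categories
    return "" if value is None else value


def items_to_csv(items, columns):
    """Render dataset items as CSV text, keeping only the given columns."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({column: _cell(item.get(column)) for column in columns})
    return buffer.getvalue()


if __name__ == "__main__":
    items = [{"name": "Lash Sensational Sky High Mascara", "price": 13.99,
              "categories": ["Ogen", "Mascara"]}]
    print(items_to_csv(items, ["name", "price", "categories", "color"]))
```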

**How fresh is the data?**

Each run queries Di.be's live Algolia index, so prices and availability reflect the site at the moment the run executes. Schedule the Actor to keep your dataset up to date.
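
With scheduled runs in place, two snapshots can be compared to spot price changes. The sketch below matches items on `productId` and reports products whose `price` differs between an earlier and a later run. `price_changes` is a hypothetical helper, and it assumes `productId` stays stable across runs, which this README does not guarantee.

```python
def price_changes(previous, current):
    """List products whose price differs between two runs (matched on productId)."""
    old_prices = {
        item["productId"]: item["price"]
        for item in previous
        if "productId" in item and item.get("price") is not None
    }
    changes = []
    for item in current:
        product_id = item.get("productId")
        new_price = item.get("price")
        if product_id not in old_prices or new_price is None:
            continue  # new product or missing price: nothing to compare
        if new_price != old_prices[product_id]:
            changes.append({
                "productId": product_id,
                "name": item.get("name"),
                "oldPrice": old_prices[product_id],
                "newPrice": new_price,
            })
    return changes


if __name__ == "__main__":
    yesterday = [{"productId": "MAY-SKY-001", "price": 13.99}]
    today = [{"productId": "MAY-SKY-001", "name": "Sky High Mascara", "price": 11.99}]
    print(price_changes(yesterday, today))
```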

### Need help?

If you have questions about this Actor, need a custom modification, or want automated data delivery, reach out at [studioamba.dev](https://studioamba.dev) or open an issue on this Actor's page.

We maintain 300+ web scrapers across Europe and offer managed data services for businesses that need structured data on a recurring basis.

# Actor input Schema

## `searchQuery` (type: `string`):

Search for products by keyword (e.g., 'mascara', 'parfum', 'shampoo'). Leave empty to get all products.

## `category` (type: `string`):

Filter by category name (e.g., 'Maquillage', 'Parfum', 'Soin des cheveux'). Case-sensitive.

## `brand` (type: `string`):

Filter by brand name (e.g., 'MAYBELLINE NEW YORK', 'L'ORÉAL PARIS'). Case-sensitive.

## `maxResults` (type: `integer`):

Maximum number of products to return.

## `language` (type: `string`):

Language for product data.

## `proxyConfiguration` (type: `object`):

Proxy settings. Usually not needed for Di.be.

## Actor input object example

```json
{
  "searchQuery": "mascara",
  "maxResults": 100,
  "language": "fr",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "BE"
  }
}
````

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQuery": "mascara",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ],
        "apifyProxyCountry": "BE"
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("studio-amba/di-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQuery": "mascara",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "BE",
    },
}

# Run the Actor and wait for it to finish
run = client.actor("studio-amba/di-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQuery": "mascara",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "BE"
  }
}' |
apify call studio-amba/di-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=studio-amba/di-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Di Scraper — Belgian Beauty, Makeup & Cosmetics Products",
        "description": "Scrape beauty products, prices, and brands from Di.be — Belgium's #1 makeup retailer. Get makeup, skincare, perfumes, and hair care products with full details.",
        "version": "0.1",
        "x-build-id": "r1lzWsxzk8ONB5hfq"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/studio-amba~di-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-studio-amba-di-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/studio-amba~di-scraper/runs": {
            "post": {
                "operationId": "runs-sync-studio-amba-di-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/studio-amba~di-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-studio-amba-di-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search for products by keyword (e.g., 'mascara', 'parfum', 'shampoo'). Leave empty to get all products."
                    },
                    "category": {
                        "title": "Category Filter",
                        "type": "string",
                        "description": "Filter by category name (e.g., 'Maquillage', 'Parfum', 'Soin des cheveux'). Case-sensitive."
                    },
                    "brand": {
                        "title": "Brand Filter",
                        "type": "string",
                        "description": "Filter by brand name (e.g., 'MAYBELLINE NEW YORK', 'L'ORÉAL PARIS'). Case-sensitive."
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of products to return.",
                        "default": 100
                    },
                    "language": {
                        "title": "Language",
                        "enum": [
                            "nl",
                            "fr"
                        ],
                        "type": "string",
                        "description": "Language for product data.",
                        "default": "fr"
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings. Usually not needed for Di.be."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
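
For projects that call the REST API without a client library, the `run-sync-get-dataset-items` path from the specification above can be used with the Python standard library. The sketch below only builds the request object; `build_sync_request` is a hypothetical helper, and the token is passed as a query parameter as the specification describes.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"   # server URL from the specification
ACTOR_PATH = "studio-amba~di-scraper"   # Actor ID as used in the paths above


def build_sync_request(token, run_input):
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_sync_request("<YOUR_API_TOKEN>", {"searchQuery": "mascara", "maxResults": 5})
    print(request.get_method(), request.full_url)
    # To actually run the Actor (requires a valid token and network access):
    # with urllib.request.urlopen(request) as response:
    #     items = json.load(response)
```

Because this endpoint waits for the run to finish, it suits small runs; for larger jobs the `/runs` path followed by a separate dataset read is likely the safer choice.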
