# JYSK Scraper — European Furniture & Home Products (`studio-amba/jysk-scraper`) Actor

Scrape products, prices, ratings, and reviews from JYSK.de, the German storefront of the European furniture and home products retailer JYSK. Supports search, category browsing, and full-catalog crawling via sitemap.

- **URL**: https://apify.com/studio-amba/jysk-scraper.md
- **Developed by:** [Studio Amba](https://apify.com/studio-amba) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 0 monthly users, 76.7% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $4.00 / 1,000 scraped results

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
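
As a rough illustration of the pay-per-event arithmetic, the sketch below multiplies a result count by the listed starting rate. The function name and the flat-rate assumption are illustrative only: the listing says "from $4.00", so the rate actually billed may differ.

```python
def estimate_cost_usd(result_count: int, usd_per_1000: float = 4.00) -> float:
    """Estimate the pay-per-event charge for a run.

    Assumes a flat rate per scraped result. The default of 4.00 is the
    listed starting price per 1,000 results, not a guaranteed rate.
    """
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return result_count * usd_per_1000 / 1000


print(estimate_cost_usd(2500))
```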

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## JYSK Scraper

Scrapes furniture, mattresses, and home goods from jysk.de. Combines JYSK's search API for discovery with page-level JSON-LD for detailed product data.

### Input

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `searchQuery` | String | No | Search term (e.g., "matratze", "schreibtisch") |
| `categoryUrl` | String | No | JYSK category URL |
| `maxResults` | Integer | No | Max products (default: 100) |
| `proxyConfiguration` | Object | No | Proxy settings |

Without input, the scraper fetches `sitemap.xml` and crawls product pages from there.
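
The fallback described above can be pictured as a small decision function. This is only a sketch: `choose_strategy` is a hypothetical helper, and the precedence of `searchQuery` over `categoryUrl` when both are set is an assumption, since the README documents only the sitemap fallback.

```python
from typing import Any, Mapping


def choose_strategy(actor_input: Mapping[str, Any]) -> str:
    """Return which discovery strategy a given input would trigger.

    Assumption: a non-empty searchQuery wins over categoryUrl when both
    are set. Only the sitemap fallback is documented.
    """
    if str(actor_input.get("searchQuery") or "").strip():
        return "search"
    if str(actor_input.get("categoryUrl") or "").strip():
        return "category"
    return "sitemap"


print(choose_strategy({"searchQuery": "matratze"}))
print(choose_strategy({}))
```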

### Output

| Field | Type | Example |
|-------|------|---------|
| `name` | String | `"Matratze GOLD S70 90x200"` |
| `brand` | String | `"JYSK"` |
| `price` | Number | `199.00` |
| `originalPrice` | Number | `279.00` |
| `currency` | String | `"EUR"` |
| `ean` | String | `"5709132891234"` |
| `sku` | String | `"12345678"` |
| `inStock` | Boolean | `true` |
| `rating` | Number | `4.3` |
| `reviewCount` | Number | `156` |
| `imageUrl` | String | Product photo |
| `imageUrls` | Array | Gallery images |
| `description` | String | Product description |
| `category` | String | `"Matratzen"` |
| `categories` | Array | `["Schlafzimmer", "Matratzen"]` |
| `language` | String | `"de"` |
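
To show how the `price` and `originalPrice` fields might be used together, here is a minimal sketch that derives a discount percentage from one dataset item. It assumes `originalPrice` can be missing for items that are not on sale, which the table above does not state; `discount_percent` is a hypothetical helper, not part of the Actor.

```python
from typing import Any, Mapping, Optional


def discount_percent(item: Mapping[str, Any]) -> Optional[float]:
    """Return the markdown from originalPrice to price as a percentage.

    Returns None when either price is missing or the item is not discounted.
    """
    price = item.get("price")
    original = item.get("originalPrice")
    if price is None or original is None or original <= 0 or price >= original:
        return None
    return round((original - price) / original * 100, 1)


example = {"name": "Matratze GOLD S70 90x200", "price": 199.00, "originalPrice": 279.00}
print(discount_percent(example))
```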

### Three discovery strategies

1. **Search API**: Hits `/api/search` with faceted parameters. Returns product URLs which are then scraped individually for full data.
2. **Category crawl**: Follows product links on category pages (URLs with 3+ path segments).
3. **Sitemap** (default): Parses `sitemap.xml`, filters for product URLs, and scrapes them.
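
The sitemap strategy can be approximated with the standard library. The sketch below is not the Actor's code: it assumes a plain `urlset` sitemap (not a sitemap index) and treats any URL with three or more path segments as a product page. The sample URLs are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def product_urls_from_sitemap(xml_text: str, min_segments: int = 3) -> List[str]:
    """Extract <loc> URLs whose path has at least `min_segments` segments."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        segments = [s for s in urlparse(url).path.split("/") if s]
        if len(segments) >= min_segments:
            urls.append(url)
    return urls


# Hypothetical sitemap content for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.jysk.de/schlafzimmer/matratzen</loc></url>
  <url><loc>https://www.jysk.de/schlafzimmer/matratzen/example-product</loc></url>
</urlset>"""

print(product_urls_from_sitemap(SAMPLE))
```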

Product pages are enriched beyond JSON-LD: breadcrumb categories, sale prices from HTML, and online stock indicators are all merged in.
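
For readers unfamiliar with page-level JSON-LD, the following sketch shows one way to pull a `Product` object out of a page using only the standard library. The class and function names are hypothetical, the sample HTML is invented, and the Actor's real extraction (including the HTML enrichment described above) may work differently.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than failing the page


def extract_product(html: str):
    """Return the first JSON-LD object with @type "Product", or None."""
    parser = JsonLdParser()
    parser.feed(html)
    for doc in parser.documents:
        candidates = doc if isinstance(doc, list) else [doc]
        for node in candidates:
            if isinstance(node, dict) and node.get("@type") == "Product":
                return node
    return None


# Invented page fragment for illustration.
SAMPLE_HTML = """<html><head><script type="application/ld+json">
{"@type": "Product", "name": "Matratze GOLD S70 90x200", "sku": "12345678",
 "offers": {"@type": "Offer", "price": "199.00", "priceCurrency": "EUR"}}
</script></head><body></body></html>"""

product = extract_product(SAMPLE_HTML)
print(product["name"], product["offers"]["price"])
```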

### Cost

About **$0.25 per 1,000 products**.

### Notes

- Targets the German JYSK site (jysk.de); all content is in German.
- JYSK product URLs have at least 3 path segments: `/{category}/{subcategory}/{product-slug}`
- The search API returns paginated results with `totalPages` and `totalCount` fields.
- Breadcrumb labels like "Startseite" and "Suchergebnisse" are automatically filtered out.
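
Two of the notes above lend themselves to a short sketch: dropping navigation-only breadcrumb labels, and working out how many search pages are needed for a given `maxResults`. Both helpers are hypothetical. The label set contains only the two labels named above, and `page_size` is an assumption because the README does not say how many products a search page returns.

```python
import math
from typing import Iterable, List

# Labels named in the notes above; the real filter list may be longer.
NAVIGATION_LABELS = {"startseite", "suchergebnisse"}


def clean_breadcrumbs(labels: Iterable[str]) -> List[str]:
    """Drop navigation-only breadcrumb labels, keeping category names in order."""
    return [
        label.strip()
        for label in labels
        if label.strip() and label.strip().lower() not in NAVIGATION_LABELS
    ]


def pages_needed(total_pages: int, max_results: int, page_size: int) -> int:
    """How many search pages to request to collect up to max_results items.

    page_size is an assumption: the README does not state how many
    products one search page returns.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return min(total_pages, math.ceil(max_results / page_size))


print(clean_breadcrumbs(["Startseite", "Schlafzimmer", "Matratzen"]))
print(pages_needed(total_pages=12, max_results=100, page_size=24))
```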

### Why use JYSK Scraper

- **Price monitoring** — Track prices, stock, and promotions across JYSK at scale
- **Competitive intelligence** — Compare your catalog against JYSK pricing and assortment
- **Market research** — Analyze category trends, new arrivals, and rating distributions
- **Lead generation** — Build product datasets for affiliate sites, comparison tools, or feeds
- **No login or cookies required** — Authenticated access not needed; works out of the box

### How to use JYSK Scraper

1. Open the **Input** tab and provide a search query or a category URL, or leave both empty to crawl the full sitemap
2. Adjust optional filters such as `maxResults` or proxy settings
3. Click **Start** and wait for the run to complete
4. Download results from the **Output** tab in JSON, CSV, Excel, XML, or HTML
5. Schedule recurring runs from the **Schedule** tab if you need ongoing data

### How to scrape JYSK data

This Actor automates the process of extracting structured product data from JYSK.
You can run it directly from the Apify console, the Apify API, or any of the
official SDKs (JavaScript, Python). The scraper handles pagination, retries, and
rate limiting so you can focus on the data, not the plumbing.

Typical workflows:

- **One-off export**: paste a category URL or keyword, set `maxResults`, and run
- **Scheduled monitoring**: set a daily cron in the Schedule tab to track prices over time
- **Programmatic integration**: trigger runs from your backend via the Apify API and
  pull the dataset when finished
- **Webhook automation**: receive a callback the moment a run completes and pipe
  the results into Zapier, Make, n8n, BigQuery, or Google Sheets
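
For the scheduled-monitoring workflow, a price tracker mostly comes down to comparing two dataset snapshots. The sketch below assumes items are matched on `sku` and that both snapshots are already loaded as lists of dicts; `price_changes` is a hypothetical helper, not something the Actor provides.

```python
from typing import Any, Dict, Iterable, List, Mapping


def price_changes(
    previous: Iterable[Mapping[str, Any]],
    current: Iterable[Mapping[str, Any]],
) -> List[Dict[str, Any]]:
    """Compare two dataset snapshots and list products whose price moved.

    Items are matched on `sku`; products present in only one snapshot
    are ignored in this sketch.
    """
    old_prices = {item["sku"]: item.get("price") for item in previous if item.get("sku")}
    changes = []
    for item in current:
        sku = item.get("sku")
        old = old_prices.get(sku)
        new = item.get("price")
        if old is not None and new is not None and old != new:
            changes.append({"sku": sku, "name": item.get("name"), "old": old, "new": new})
    return changes


yesterday = [{"sku": "12345678", "name": "Matratze GOLD S70 90x200", "price": 279.00}]
today = [{"sku": "12345678", "name": "Matratze GOLD S70 90x200", "price": 199.00}]
print(price_changes(yesterday, today))
```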

### Tips for best results

- **Start small**: run with `maxResults: 10` before launching large jobs
- **Use proxies**: residential proxies reduce the chance of being blocked
- **Throttle on big jobs**: the input schema exposes no concurrency setting, so cap load with `maxResults` and split large crawls into several runs
- **Schedule runs**: daily runs are usually enough for price monitoring
- **Inspect the dataset schema**: the Storage tab shows the full output structure

### FAQ and support

**Is it legal to scrape JYSK?** This Actor extracts publicly available data.
Always review the website's Terms of Service before scraping at scale, and
respect rate limits.

**Why am I getting fewer results than expected?** Some categories have hidden
pagination or load more on scroll. Increase `maxResults` and verify your filters.

**Can I extract data for a single product?** The input schema has no `startUrls`
or product-URL field. The closest option is a specific `searchQuery` (for example,
the product name) combined with a low `maxResults`.

**The site blocks me — what should I do?** Enable Apify residential proxies in
the input. Datacenter IPs are blocked by many e-commerce sites.

For issues, feature requests, or bug reports, open a ticket in the Issues tab on
the Actor page or contact support@apify.com. We monitor every Actor and ship
fixes quickly when sites change.

# Actor input Schema

## `searchQuery` (type: `string`):

Search for products by keyword (e.g., 'matratze', 'schreibtisch', 'sofa'). Uses JYSK's internal search API.
## `categoryUrl` (type: `string`):

A JYSK category page URL to scrape. Example: https://www.jysk.de/schlafzimmer/matratzen. If empty and no search query, scrapes from sitemap.
## `maxResults` (type: `integer`):

Maximum number of products to return.
## `proxyConfiguration` (type: `object`):

Proxy settings for better reliability.

## Actor input object example

```json
{
  "searchQuery": "matratze",
  "maxResults": 100,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "DK"
  }
}
````
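
Before sending an input object like the one above, it can help to validate it client-side. The sketch below takes the `maxResults` bounds (1 to 50000) and default (100) from the OpenAPI specification later in this document. Rejecting unknown fields is a choice made for this example, not documented platform behavior, and `normalize_input` is a hypothetical helper.

```python
from typing import Any, Dict, Mapping

# Bounds and default as listed in the Actor's OpenAPI input schema.
MAX_RESULTS_MIN = 1
MAX_RESULTS_MAX = 50000
MAX_RESULTS_DEFAULT = 100


def normalize_input(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Validate an input object client-side and fill in the maxResults default."""
    allowed = {"searchQuery", "categoryUrl", "maxResults", "proxyConfiguration"}
    unknown = set(raw) - allowed
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")

    max_results = raw.get("maxResults", MAX_RESULTS_DEFAULT)
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("maxResults must be an integer")
    if not MAX_RESULTS_MIN <= max_results <= MAX_RESULTS_MAX:
        raise ValueError(f"maxResults must be between {MAX_RESULTS_MIN} and {MAX_RESULTS_MAX}")

    return {**raw, "maxResults": max_results}


print(normalize_input({"searchQuery": "matratze"}))
```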

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQuery": "matratze",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ],
        "apifyProxyCountry": "DK"
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("studio-amba/jysk-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQuery": "matratze",
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
        "apifyProxyCountry": "DK",
    },
}

# Run the Actor and wait for it to finish
run = client.actor("studio-amba/jysk-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQuery": "matratze",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "DK"
  }
}' |
apify call studio-amba/jysk-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=studio-amba/jysk-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "JYSK Scraper — European Furniture & Home Products",
        "description": "Scrape products, prices, ratings, and reviews from JYSK.de. European furniture and home products retailer. Supports search, category browsing, and full catalog via sitemap.",
        "version": "0.1",
        "x-build-id": "7CUs5iqSRgi6fVbSh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/studio-amba~jysk-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-studio-amba-jysk-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/studio-amba~jysk-scraper/runs": {
            "post": {
                "operationId": "runs-sync-studio-amba-jysk-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/studio-amba~jysk-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-studio-amba-jysk-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchQuery": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search for products by keyword (e.g., 'matratze', 'schreibtisch', 'sofa'). Uses JYSK's internal search API."
                    },
                    "categoryUrl": {
                        "title": "Category URL",
                        "type": "string",
                        "description": "A JYSK category page URL to scrape. Example: https://www.jysk.de/schlafzimmer/matratzen. If empty and no search query, scrapes from sitemap."
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 1,
                        "maximum": 50000,
                        "type": "integer",
                        "description": "Maximum number of products to return.",
                        "default": 100
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings for better reliability."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
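
For projects that do not use the client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following sketch only builds the request with the standard library. The helper name is hypothetical, the token is a placeholder, and the call that actually sends the request is left commented out because it would start a real, billable run.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "studio-amba~jysk-scraper"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"searchQuery": "matratze", "maxResults": 10})
print(request.get_method(), request.full_url.split("?")[0])

# Sending the request performs a real run, so it is left commented out:
# with urllib.request.urlopen(request, timeout=300) as response:
#     items = json.load(response)
```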
