# Airbnb Property Scraper (`corent1robert/airbnb-property-scraper`) Actor

High-performance Airbnb property data scraper with optimized pagination and memory management. Extracts detailed property information (URLs, titles, prices, travel dates, images) from Airbnb search results with 2.5x faster performance using 8GB memory configuration.

- **URL**: https://apify.com/corent1robert/airbnb-property-scraper.md
- **Developed by:** [Corentin Robert](https://apify.com/corent1robert) (community)
- **Categories:** Travel, Developer tools, Automation
- **Stats:** 28 total users, 2 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $4.99 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
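
As a rough illustration of the per-result part of the bill, the sketch below multiplies a result count by the listed starting price of $4.99 per 1,000 results. The function name `estimate_result_charge` is hypothetical, and the figure it returns excludes Apify platform usage, which is charged on top and depends on the run.

```python
def estimate_result_charge(result_count: int, usd_per_1000: float = 4.99) -> float:
    """Estimate the per-result event charge in USD.

    Assumption: every result is billed at the listed "from" price.
    Platform usage (compute, storage, proxy) is NOT included.
    """
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return round(result_count * usd_per_1000 / 1000, 2)


if __name__ == "__main__":
    # Print the estimated event charge for a few run sizes
    for count in (30, 1000, 2000):
        print(count, "results ->", estimate_result_charge(count), "USD")
```

Treat the output as a lower bound for budgeting, not as an invoice.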

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Airbnb Lodging URLs Scraper

An Apify scraper to extract all Airbnb lodging URLs from a city or region.

### 🎯 Features

- **Complete extraction**: Retrieves all lodging URLs from an Airbnb search
- **Detailed property data**: Extracts title, price, host type, rating, reviews, travel dates, and images
- **Smart image selection**: Gets property images (not host profile photos)
- **Price intelligence**: Handles discounts and price variations
- **Date extraction**: Accurate travel date detection
- **Automatic navigation**: Browses through all result pages
- **Smart scrolling**: Loads all available listings
- **Clean URLs**: Removes unnecessary search parameters
- **Robustness**: Error handling and multiple fallbacks

### 📊 Extracted Data

The scraper extracts detailed property information:

```json
{
  "url": "https://www.airbnb.fr/rooms/1169471441730393904",
  "title": "Appartement \"Flore\"",
  "price": "626 €",
  "hostType": "Hôte particulier",
  "rating": "4,97",
  "reviewCount": "30",
  "travelDates": "7–12 nov.",
  "imageUrl": "https://a0.muscache.com/im/pictures/hosting/Hosting-1169471441730393904/original/e25c6d0f-cfbf-44c8-b68d-e0febce169ae.jpeg?im_w=720",
  "extractedAt": "2025-10-25T12:43:09.521Z"
}
````

#### Detailed Fields

- **url**: Property URL
- **title**: Property title or name
- **price**: Current price per night (with discount if applicable)
- **hostType**: Type of host, as shown on the page ("Hôte particulier", i.e. individual host, or "Hôte pro", i.e. professional host)
- **rating**: Property rating as a string with a decimal comma (e.g., "4,97")
- **reviewCount**: Number of reviews as a string (e.g., "30")
- **travelDates**: Travel dates for the booking (e.g., "7–12 nov.")
- **imageUrl**: Property main image URL
- **extractedAt**: Extraction timestamp (ISO 8601, UTC)
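
Because `price`, `rating`, and `reviewCount` arrive as localized strings, downstream code usually needs to convert them before sorting or aggregating. The following is a minimal sketch of such a conversion, not part of the Actor itself. The helper names and the added keys (`priceValue`, `ratingValue`, `reviewCountValue`) are hypothetical, and it assumes prices carry no thousands separator other than spaces.

```python
import re
from typing import Any, Optional


def parse_decimal(text: Optional[str]) -> Optional[float]:
    """Parse strings such as "626 €" or "4,97" into a float, or return None."""
    # Drop currency symbols and spaces, keep digits and decimal separators
    cleaned = re.sub(r"[^\d,.]", "", text or "")
    try:
        # The sample output uses a decimal comma, so convert it to a point
        return float(cleaned.replace(",", "."))
    except ValueError:
        return None


def normalize_listing(item: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a dataset item with numeric fields added (hypothetical keys)."""
    digits = re.sub(r"\D", "", item.get("reviewCount") or "")
    return {
        **item,
        "priceValue": parse_decimal(item.get("price")),
        "ratingValue": parse_decimal(item.get("rating")),
        "reviewCountValue": int(digits) if digits else None,
    }


if __name__ == "__main__":
    sample = {"price": "626 €", "rating": "4,97", "reviewCount": "30"}
    print(normalize_listing(sample))
```

The original string fields are left untouched, so the raw values remain available if the parsing assumptions do not hold for a given locale.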

### 🚀 Usage

#### Input Format

The scraper accepts two input formats:

##### 1. City name only

```json
{
  "input": "Verneuil-sur-Seine"
}
```

##### 2. Full Airbnb search URL (with /homes at the end)

```json
{
  "input": "https://www.airbnb.fr/s/verneuil-sur-seine/homes"
}
```

The scraper automatically detects the format and constructs the appropriate search URL.
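
The Actor's actual detection logic is not shown in this README. As a sketch of how such auto-detection could work, the function below treats anything starting with `http://` or `https://` as a ready-made search URL and otherwise builds one following the `/s/<city>/homes` pattern from the examples above. The function name `build_search_url`, the lowercasing, and the percent-encoding of special characters are assumptions.

```python
from urllib.parse import quote


def build_search_url(user_input: str, base: str = "https://www.airbnb.fr") -> str:
    """Return a search URL for either a city name or a full Airbnb URL."""
    value = user_input.strip()
    if not value:
        raise ValueError("input must not be empty")
    # Full URLs are passed through unchanged
    if value.lower().startswith(("http://", "https://")):
        return value
    # Assumption: city names are lowercased and percent-encoded into the path
    slug = quote(value.lower(), safe="")
    return f"{base}/s/{slug}/homes"


if __name__ == "__main__":
    print(build_search_url("Verneuil-sur-Seine"))
    print(build_search_url("https://www.airbnb.fr/s/lyon/homes"))
```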

#### Examples

##### City name (Recommended)

```json
{
  "input": "Lyon"
}
```

##### Full URL with /homes

```json
{
  "input": "https://www.airbnb.fr/s/lyon/homes"
}
```

##### City name with hyphens

```json
{
  "input": "Verneuil-sur-Seine"
}
```

##### Full URL for complex cities

```json
{
  "input": "https://www.airbnb.fr/s/verneuil-sur-seine/homes"
}
```

### ⚡ Performance

- **Automatic navigation**: Browses through all available pages
- **Smart detection**: Automatically stops when there are no more pages
- **Clean URLs**: Removes search parameters
- **Deduplication**: Avoids duplicate URLs

### 🛠️ Technical Architecture

#### Technologies Used

- **Apify SDK**: Automation framework
- **PuppeteerCrawler**: Navigation and extraction
- **Puppeteer**: Browser control
- **Node.js**: JavaScript runtime

#### Extraction Process

1. **Navigation**: Access to the Airbnb search page
2. **Cookie management**: Automatic cookie acceptance
3. **Property extraction**: Retrieval of lodging data (URL, title, price, host type, rating, reviews, dates, images)
4. **Smart image selection**: Property images (not host photos)
5. **Price processing**: Handles discounts and price variations
6. **Date extraction**: Accurate travel date detection
7. **Automatic navigation**: Moving to next pages
8. **Cleaning**: Removal of unnecessary parameters
9. **Deduplication**: Elimination of duplicates
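
The cleaning and deduplication steps (8 and 9) can be illustrated in a few lines. This is a sketch under the assumption that "cleaning" means dropping the query string and fragment from each listing URL. The Actor itself runs on Node.js, so the Python below only mirrors the idea, and both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def clean_listing_url(url: str) -> str:
    """Strip the query string and fragment, keeping scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def dedupe_urls(urls: list[str]) -> list[str]:
    """Clean every URL and keep the first occurrence of each, in order."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        cleaned = clean_listing_url(url)
        if cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique


if __name__ == "__main__":
    raw = [
        "https://www.airbnb.fr/rooms/123?check_in=2025-11-07",
        "https://www.airbnb.fr/rooms/123?adults=2",
        "https://www.airbnb.fr/rooms/456",
    ]
    print(dedupe_urls(raw))
```

Cleaning before comparing matters: the same listing often appears on several result pages with different search parameters attached.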

### 📈 Typical Results

- **250+ properties** extracted per city
- **Complete data**: URL, title, price, host type, rating, reviews, dates, and images
- **15-20 pages** browsed on average
- **Execution time**: ~1-2 minutes per city (optimized)
- **Accuracy**: Extraction of all available properties with detailed information

### ⚙️ Recommended Configuration

#### **Optimal Settings for Apify:**

```
MEMORY: 8 GB (recommended)
TIMEOUT: 7200s (2 hours)
RESTART ON ERROR: ON
```

#### **Why 8GB Memory?**

- **Puppeteer + Chrome**: ~4-5GB base consumption
- **Airbnb pages**: ~1-2GB for complex layouts
- **Navigation buffer**: ~1GB for smooth transitions
- **Total recommended**: 8GB for optimal performance

#### **Performance Comparison:**

| Memory | Speed | Stability | Recommendation |
|--------|-------|-----------|----------------|
| 4GB | ❌ Slow | ❌ Crashes | ❌ Not recommended |
| 6GB | ⚠️ Medium | ⚠️ Unstable | ⚠️ Minimum |
| **8GB** | ✅ **Fast** | ✅ **Stable** | ✅ **Recommended** |
| 16GB | ✅ Very fast | ✅ Very stable | ✅ Optimal |

#### **Alternative (Budget Option):**

```
MEMORY: 6 GB (minimum)
TIMEOUT: 5400s (1.5 hours)
RESTART ON ERROR: ON
```

#### **⚠️ Important: Manual Configuration**

If the default settings don't apply automatically, you can manually configure them in the Apify console:

1. **Go to your Actor** in the Apify console
2. **Click on "Settings"** tab
3. **Set the following values:**
   - **Memory**: 8192 MB (8 GB)
   - **Timeout**: 7200 seconds (2 hours)
   - **Restart on error**: ON

This ensures optimal performance and prevents memory-related crashes.
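
The same memory and timeout values can also be passed per run through the Python client instead of the console. The sketch below uses the `memory_mbytes` and `timeout_secs` keyword arguments of the client's `call` method. Reading the token from an `APIFY_TOKEN` environment variable is an assumption, the function name is hypothetical, and "Restart on error" is not set here.

```python
import os

ACTOR_ID = "corent1robert/airbnb-property-scraper"

# Recommended values from this README: 8 GB of memory and a 2-hour timeout
RUN_OPTIONS = {"memory_mbytes": 8192, "timeout_secs": 7200}


def run_with_recommended_settings(city: str, max_results: int = 30):
    """Start the Actor with the recommended memory and timeout, then wait for it."""
    # Imported lazily so this module loads even without the client installed
    from apify_client import ApifyClient  # pip install apify-client

    # Assumption: the API token is stored in the APIFY_TOKEN environment variable
    client = ApifyClient(os.environ["APIFY_TOKEN"])
    return client.actor(ACTOR_ID).call(
        run_input={"input": city, "maxResults": max_results},
        **RUN_OPTIONS,
    )


if __name__ == "__main__":
    run = run_with_recommended_settings("Lyon")
    print(run["defaultDatasetId"])
```

For the budget option, swap in `{"memory_mbytes": 6144, "timeout_secs": 5400}`.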

### 🔍 Selectors Used

The scraper uses multiple selectors to maximize extraction:

- `[data-testid="card-container"] a[href*="/rooms/"]`
- `a[href*="/rooms/"]`
- `[data-testid="listing-card-link"]`
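
To check offline whether a saved search page still matches, the broadest of these selectors, `a[href*="/rooms/"]`, can be approximated with Python's standard library. This is a stand-in for the Puppeteer selectors, not the Actor's code. The class and function names are hypothetical, and it only inspects static HTML, not content loaded by scrolling.

```python
from html.parser import HTMLParser


class RoomLinkParser(HTMLParser):
    """Collect href values of <a> tags whose href contains "/rooms/"."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Equivalent in spirit to the CSS attribute selector [href*="/rooms/"]
        if "/rooms/" in href:
            self.links.append(href)


def extract_room_links(html: str) -> list[str]:
    """Return all listing links found in an HTML snippet, in document order."""
    parser = RoomLinkParser()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    snippet = (
        '<div data-testid="card-container">'
        '<a href="/rooms/123?adults=2">Listing</a></div>'
        '<a href="/help">Help</a>'
    )
    print(extract_room_links(snippet))
```

If this returns nothing for a page that visibly contains listings, the markup has likely changed and the selectors need adapting.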

### 🚀 Deployment

The scraper is ready for deployment on Apify Cloud:

- **Complete configuration**: package.json, Dockerfile, actor.json
- **Optimized code**: Based on Airbnb HTML analysis
- **Documentation**: Complete README with examples

### 📝 Important Notes

- **Search URLs**: Use Airbnb search URLs (e.g., `/s/Paris--France`)
- **Pagination**: The scraper automatically detects next pages
- **Deduplication**: Duplicate URLs are automatically removed
- **Cleaning**: Search parameters are removed from URLs

### 📞 Support

For any questions or issues:

- Check execution logs
- Verify Airbnb HTML structure
- Adapt selectors if necessary
- Verify that the search URL is valid

# Actor input Schema

## `input` (type: `string`):

City name (e.g. Lyon, Bordeaux) or a full Airbnb URL (https://www.airbnb.fr/s/lyon/homes). The scraper auto-detects the format.

## `maxResults` (type: `integer`):

Maximum number of listings to extract. 0 or empty = no cap (up to 1 000 requests). Set 20–50 for a quick test.

## Actor input object example

```json
{
  "input": "Lyon",
  "maxResults": 30
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "input": "Lyon",
    "maxResults": 30
};

// Run the Actor and wait for it to finish
const run = await client.actor("corent1robert/airbnb-property-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "input": "Lyon",
    "maxResults": 30,
}

# Run the Actor and wait for it to finish
run = client.actor("corent1robert/airbnb-property-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "input": "Lyon",
  "maxResults": 30
}' |
apify call corent1robert/airbnb-property-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=corent1robert/airbnb-property-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Airbnb Property Scraper",
        "description": "High-performance Airbnb property data scraper with optimized pagination and memory management. Extracts detailed property information (URLs, titles, prices, travel dates, images) from Airbnb search results with 2.5x faster performance using 8GB memory configuration.",
        "version": "1.0",
        "x-build-id": "KdAtsBRyi774gC9aO"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/corent1robert~airbnb-property-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-corent1robert-airbnb-property-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/corent1robert~airbnb-property-scraper/runs": {
            "post": {
                "operationId": "runs-sync-corent1robert-airbnb-property-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/corent1robert~airbnb-property-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-corent1robert-airbnb-property-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "input"
                ],
                "properties": {
                    "input": {
                        "title": "Airbnb search URL or City name",
                        "type": "string",
                        "description": "City name (e.g. Lyon, Bordeaux) or a full Airbnb URL (https://www.airbnb.fr/s/lyon/homes). The scraper auto-detects the format.",
                        "default": "Lyon"
                    },
                    "maxResults": {
                        "title": "Max results (optional)",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of listings to extract. 0 or empty = no cap (up to 1 000 requests). Set 20–50 for a quick test.",
                        "default": 0
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
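
For projects that call the REST API directly, the `run-sync-get-dataset-items` endpoint described in the specification can be reached with the standard library alone. The sketch below only builds the request: the path, the `token` query parameter, and the input fields come from the specification above, while the function name `build_run_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/corent1robert~airbnb-property-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, city: str, max_results: int = 30) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The specification passes the API token as a query parameter
    query = urllib.parse.urlencode({"token": token})
    # The request body follows the Actor input schema
    body = json.dumps({"input": city, "maxResults": max_results}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request("<YOUR_API_TOKEN>", "Lyon")
    # Sending is a blocking call that lasts as long as the Actor run
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
```

Because this endpoint waits for the run to finish, a small `maxResults` is a sensible choice when trying it out.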
