# Centris Broker Scraper (`ocrad/centris-broker-scraper`) Actor

Extract broker details from Centris.ca, including broker name, phone number, listed properties, and more. Useful for real estate analysis, investment research, and market insights.

- **URL**: https://apify.com/ocrad/centris-broker-scraper.md
- **Developed by:** [Ocrad](https://apify.com/ocrad) (community)
- **Categories:** Real estate, Lead generation, Automation
- **Stats:** 30 total users, 0 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

$20.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
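
As a rough illustration of the price above, the charge scales linearly with the number of results. The sketch below assumes every result is billed at the listed rate and that no other events are charged; the helper name `estimateCostUsd` is hypothetical.

```javascript
// Listed price: $20.00 per 1,000 results (from the Pricing section).
const PRICE_PER_1000_RESULTS_USD = 20;

// Hypothetical helper: estimates the charge for a given number of results,
// assuming each result is billed at the listed rate and nothing else is charged.
function estimateCostUsd(resultCount) {
    if (!Number.isInteger(resultCount) || resultCount < 0) {
        throw new RangeError('resultCount must be a non-negative integer');
    }
    return (resultCount * PRICE_PER_1000_RESULTS_USD) / 1000;
}

console.log(estimateCostUsd(100)); // 2
```

Capping `maxItems` in the Actor input is the simplest way to bound this estimate before a run.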

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Centris Broker Scraper

A production-ready scraper for extracting comprehensive real estate broker information from Centris.ca using PlaywrightCrawler. This scraper extracts detailed broker profiles, social media links, and optionally fetches their property listings with photos and metadata.

### ✨ Features

#### 🏘️ **Broker Information Extraction**
- **Personal Details**: Name, professional title, professional photo
- **Contact Information**: Phone, email, contact URLs
- **Identifiers**: Broker ID, profile ID for unique identification
- **Agency Information**: Agency name, logo, and type

#### 📱 **Social Media Integration**
- **LinkedIn**: Professional networking profiles
- **Facebook**: Business pages and personal profiles
- **Twitter/X**: Social media handles
- **Instagram**: Visual content accounts
- **YouTube**: Video content channels
- **TikTok**: Short-form video accounts

#### 🏠 **Property Listings (Optional)**
When `addBrokerProperties` is enabled, the scraper extracts:
- **Property URLs**: Direct links to property detail pages
- **Property IDs**: Unique identifiers extracted from URLs
- **Property Photos**: Image galleries with full URLs
- **Property Details**: Titles, addresses, prices, and types
- **Thumbnail Links**: Uses `.property-thumbnail-summary-link` selectors

#### 🔧 **Custom Data Extraction**
- **extendedOutputFunction**: Add custom data extraction logic
- **Page Access**: Full access to Playwright page object for advanced scraping
- **Flexible Integration**: Merge custom data with standard broker information
- **Error Isolation**: Custom function errors don't break the main scraping process

### 📋 Input Parameters

```json
{
  "startUrls": [{ "url": "https://www.centris.ca/en/real-estate-brokers" }],
  "maxItems": 100,
  "endPage": 5,
  "addBrokerProperties": false,
  "proxy": true,
  "extendedOutputFunction": "return { customField: 'value', timestamp: new Date().toISOString() };"
}
````

#### Parameter Details

- **startUrls** (array): Starting URLs for broker discovery
- **maxItems** (number): Maximum number of brokers to scrape (default: 20 according to the input schema below)
- **endPage** (number): Maximum pages to process (default: 5)
- **addBrokerProperties** (boolean): Include property listings (default: false)
- **proxy** (boolean): Use Apify proxy for requests (default: true)
- **extendedOutputFunction** (string): Custom JavaScript function for additional data extraction

````

### 🔧 Custom Data Extraction

The `extendedOutputFunction` allows you to add custom data extraction logic. The function receives three parameters:
- `broker`: The extracted broker object
- `page`: Playwright page object for additional scraping
- `log`: Logger object for debugging

#### Basic Example
```javascript
"extendedOutputFunction": "return { scrapedAt: new Date().toISOString(), source: 'centris' };"
````

#### Advanced Example - Extract Additional Contact Info

```javascript
"extendedOutputFunction": `
  const additionalData = {};
  
  // Extract additional phone numbers
  const phones = await page.$$eval('a[href^="tel:"]', links => 
    links.map(link => link.href.replace('tel:', ''))
  );
  if (phones.length > 0) {
    additionalData.additionalPhones = phones;
  }
  
  // Extract office hours
  try {
    const hours = await page.$eval('.office-hours', el => el.textContent.trim());
    additionalData.officeHours = hours;
  } catch (e) {
    // Office hours not found
  }
  
  // Add custom metadata
  additionalData.extractedAt = new Date().toISOString();
  additionalData.brokerScore = Math.floor(Math.random() * 100); // Placeholder value; replace with real scoring logic
  
  return additionalData;
`
```

#### Example - Extract Broker Reviews/Ratings

```javascript
"extendedOutputFunction": `
  const customData = {};
  
  try {
    // Extract star rating
    const rating = await page.$eval('.rating-stars', el => el.getAttribute('data-rating'));
    customData.rating = parseFloat(rating);
    
    // Extract review count
    const reviewCount = await page.$eval('.review-count', el => parseInt(el.textContent.match(/\\d+/)[0], 10));
    customData.reviewCount = reviewCount;
    
    // Extract recent reviews
    const reviews = await page.$$eval('.review-item', items => 
      items.slice(0, 3).map(item => ({
        text: item.querySelector('.review-text')?.textContent.trim(),
        date: item.querySelector('.review-date')?.textContent.trim(),
        author: item.querySelector('.review-author')?.textContent.trim()
      }))
    );
    customData.recentReviews = reviews;
    
  } catch (error) {
    log.info('Reviews not found for broker: ' + broker.name);
  }
  
  return customData;
`
```

#### Example - Market Area Analysis

```javascript
"extendedOutputFunction": `
  // Extract served areas and specializations
  const areaData = {};
  
  try {
    // Extract service areas
    const serviceAreas = await page.$$eval('.service-area', areas => 
      areas.map(area => area.textContent.trim())
    );
    areaData.serviceAreas = serviceAreas;
    
    // Extract specializations
    const specializations = await page.$$eval('.specialization', specs => 
      specs.map(spec => spec.textContent.trim())
    );
    areaData.specializations = specializations;
    
    // Extract languages spoken
    const languages = await page.$$eval('.language', langs => 
      langs.map(lang => lang.textContent.trim())
    );
    areaData.languages = languages;
    
  } catch (error) {
    log.info('Extended area data not found for: ' + broker.name);
  }
  
  // Add processing timestamp
  areaData.processedAt = new Date().toISOString();
  
  return areaData;
`
```
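
The examples above supply a function body as a string that receives `broker`, `page`, and `log`. How the Actor evaluates that string is not shown in this README, so the following is only a sketch of one way such a body could be run and merged with error isolation. The helper name `applyExtendedOutput` and the `log.warning` call on the passed-in logger are assumptions made for illustration.

```javascript
// AsyncFunction is not a global, so take the constructor from an async function.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Hypothetical helper, not the Actor's actual code.
// Compiles the string body into an async function of (broker, page, log),
// runs it, and merges a returned plain object into the broker record.
async function applyExtendedOutput(functionBody, broker, page, log) {
    if (!functionBody) return broker;
    try {
        const fn = new AsyncFunction('broker', 'page', 'log', functionBody);
        const extra = await fn(broker, page, log);
        // Merge only plain objects; ignore any other return value.
        if (extra && typeof extra === 'object' && !Array.isArray(extra)) {
            return { ...broker, ...extra };
        }
        return broker;
    } catch (error) {
        // Error isolation: a failing custom function leaves the record untouched.
        log.warning(`extendedOutputFunction failed: ${error.message}`);
        return broker;
    }
}
```

Because the merge uses object spread, a custom field with the same key as a standard field would overwrite it in this sketch, so distinct key names are the safer choice.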

### 📊 Output Structure

#### Standard Broker Record

```json
{
  "url": "https://www.centris.ca/en/real-estate-broker~john-doe~agency-name/b1234",
  "name": "John Doe",
  "title": "Residential Real Estate Broker",
  "image": "https://mspublic.centris.ca/media.ashx?id=...",
  "brokerId": "123456",
  "profileId": "abc123...",
  "phone": "514-123-4567",
  "contactUrl": "https://www.centris.ca/en/contact-broker/b1234",
  "website": "https://johndoe-realestate.com",
  "socialMedia": {
    "linkedin": "https://linkedin.com/in/john-doe-realtor",
    "facebook": "https://facebook.com/johndoe.realestate",
    "twitter": "https://twitter.com/johndoe_realtor",
    "instagram": "https://instagram.com/johndoe_properties",
    "youtube": "https://youtube.com/c/johndoe-realestate",
    "tiktok": "https://tiktok.com/@johndoe_homes"
  },
  "agency": {
    "name": "Premium Real Estate Inc.",
    "image": "https://mspublic.centris.ca/media.ashx?id=...",
    "type": "Real Estate Agency"
  },
  "properties": [
    {
      "url": "https://www.centris.ca/en/property/12345-beautiful-home-montreal",
      "id": "12345",
      "title": "Beautiful 3-bedroom house in Montreal",
      "price": "$450,000",
      "type": "House",
      "photos": [
        "https://mspublic.centris.ca/media.ashx?id=...",
        "https://mspublic.centris.ca/media.ashx?id=..."
      ]
    }
  ]
}
```
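
For downstream use, a record like the one above often needs to be reduced to a flat row. The sketch below is one possible shape, assuming that `socialMedia`, `agency`, and `properties` may be absent from a record; the helper name `summarizeBroker` and the output field names are hypothetical.

```javascript
// Hypothetical helper: reduces a broker record to a flat summary row.
// Assumes optional fields (socialMedia, agency, properties) may be missing.
function summarizeBroker(record) {
    const social = record.socialMedia ?? {};
    return {
        name: record.name ?? null,
        phone: record.phone ?? null,
        agencyName: record.agency?.name ?? null,
        // Names of the networks that have a non-empty link.
        socialNetworks: Object.keys(social).filter((key) => Boolean(social[key])),
        // `properties` is only expected when addBrokerProperties is enabled.
        propertyCount: Array.isArray(record.properties) ? record.properties.length : 0,
    };
}
```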

#### Extended Broker Record (with custom function)

```json
{
  "url": "https://www.centris.ca/en/real-estate-broker~john-doe~agency-name/b1234",
  "name": "John Doe",
  "title": "Residential Real Estate Broker",
  "brokerId": "123456",
  "phone": "514-123-4567",
  "socialMedia": { "..." },
  "agency": { "..." },
  "properties": [ "..." ],
  
  "rating": 4.8,
  "reviewCount": 127,
  "recentReviews": [
    {
      "text": "Excellent service, very professional",
      "date": "2024-01-15",
      "author": "Sarah M."
    }
  ],
  "serviceAreas": ["Montreal", "Laval", "Longueuil"],
  "specializations": ["Condos", "First-time buyers", "Investment properties"],
  "languages": ["English", "French", "Spanish"],
  "extractedAt": "2024-01-20T10:30:00.000Z",
  "processedAt": "2024-01-20T10:30:00.000Z"
}
```

### ⚖️ Compliance & Best Practices

- **Rate Limiting**: Built-in delays and concurrency limits
- **Respectful Scraping**: Follows robots.txt and site guidelines
- **Data Privacy**: No storage of sensitive personal information
- **Error Handling**: Robust fallbacks prevent service disruption
- **Custom Code Safety**: Extended functions are isolated and error-handled

### 📄 License

This project is for educational and research purposes. Ensure compliance with Centris.ca's terms of service when using this tool.

***

**Note**: This scraper extracts publicly available information from Centris.ca broker profiles and listings. Always verify data accuracy and comply with applicable data usage regulations.

# Actor input Schema

## `startUrls` (type: `array`):

URLs to start with.

## `maxItems` (type: `integer`):

Limits the total number of brokers scraped to control data volume and costs.

## `addBrokerProperties` (type: `boolean`):

Toggle fetching of broker properties.

## `proxy` (type: `boolean`):

Lets you specify proxy usage (recommended for large or repeated scrapes to avoid IP bans).

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.centris.ca/en/real-estate-brokers"
    }
  ],
  "maxItems": 20,
  "addBrokerProperties": false,
  "proxy": true
}
```
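
A small builder can help keep inputs in the shape shown above. The sketch below wraps plain URL strings into `{ url }` objects and applies the defaults listed in the OpenAPI specification further down (`maxItems` 20, `addBrokerProperties` false, `proxy` true). The helper name `buildInput` is hypothetical, and requiring at least one start URL is an assumption of this sketch rather than a rule stated by the schema.

```javascript
// Hypothetical helper: builds an input object in the shape of the example above.
function buildInput({ urls, maxItems = 20, addBrokerProperties = false, proxy = true }) {
    // Assumption of this sketch: at least one start URL is wanted.
    if (!Array.isArray(urls) || urls.length === 0) {
        throw new TypeError('urls must be a non-empty array');
    }
    if (!Number.isInteger(maxItems) || maxItems < 1) {
        throw new RangeError('maxItems must be a positive integer');
    }
    const startUrls = urls.map((entry) => {
        // Accept either a plain string or an object that already has a `url` key.
        const url = typeof entry === 'string' ? entry : entry?.url;
        new URL(url); // Throws a TypeError if this is not a valid absolute URL.
        return { url };
    });
    return {
        startUrls,
        maxItems,
        addBrokerProperties: Boolean(addBrokerProperties),
        proxy: Boolean(proxy),
    };
}
```

The returned object can be passed as the input to any of the client examples in the API section.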

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.centris.ca/en/real-estate-brokers"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("ocrad/centris-broker-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```
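
For larger runs, it may be preferable to read the dataset page by page rather than in a single `listItems()` call. The sketch below relies on the `offset` and `limit` options of `listItems`; the helper name `fetchAllItems` and the default page size are assumptions. It takes the dataset client as an argument so it can be exercised without network access.

```javascript
// Hypothetical helper: reads a dataset page by page.
// `datasetClient` is expected to behave like `client.dataset(id)` from apify-client,
// whose listItems({ offset, limit }) resolves to an object with an `items` array.
async function fetchAllItems(datasetClient, pageSize = 1000) {
    const all = [];
    let offset = 0;
    while (true) {
        const { items } = await datasetClient.listItems({ offset, limit: pageSize });
        all.push(...items);
        // A short (or empty) page means the end of the dataset was reached.
        if (items.length < pageSize) break;
        offset += items.length;
    }
    return all;
}
```

With the client from the example above, a call would look like `await fetchAllItems(client.dataset(run.defaultDatasetId))`.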

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://www.centris.ca/en/real-estate-brokers" }] }

# Run the Actor and wait for it to finish
run = client.actor("ocrad/centris-broker-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.centris.ca/en/real-estate-brokers"
    }
  ]
}' |
apify call ocrad/centris-broker-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ocrad/centris-broker-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Centris Broker Scraper",
        "description": "Extract broker details from Centris.ca, easily scrape data including broker name, phone number, properties listed and more. Perfect for real estate analysis, investment research and market insights.",
        "version": "1.0",
        "x-build-id": "nJNRbpOW5KHcmPwIk"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ocrad~centris-broker-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ocrad-centris-broker-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ocrad~centris-broker-scraper/runs": {
            "post": {
                "operationId": "runs-sync-ocrad-centris-broker-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ocrad~centris-broker-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-ocrad-centris-broker-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "URLs to start with.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "type": "integer",
                        "description": "Limits the total number of brokers scraped to control data volume and costs.",
                        "default": 20
                    },
                    "addBrokerProperties": {
                        "title": "Add Broker Properties",
                        "type": "boolean",
                        "description": "Toggle fetching of broker properties.",
                        "default": false
                    },
                    "proxy": {
                        "title": "Proxy",
                        "type": "boolean",
                        "description": "Lets you specify proxy usage (recommended for large or repeated scrapes to avoid IP bans).",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
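
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be turned into a request as sketched below. The helper name `buildRunSyncRequest` is hypothetical, and the usage comment assumes the token is kept in an environment variable named `APIFY_TOKEN`. The helper only builds the request, so nothing is sent until `fetch` is called.

```javascript
// Server URL and path taken from the OpenAPI specification above.
const API_BASE = 'https://api.apify.com/v2';
const RUN_SYNC_PATH = '/acts/ocrad~centris-broker-scraper/run-sync-get-dataset-items';

// Hypothetical helper: builds the URL and fetch options for a synchronous run.
function buildRunSyncRequest(token, input) {
    const url = new URL(API_BASE + RUN_SYNC_PATH);
    // The specification declares `token` as a required query parameter.
    url.searchParams.set('token', token);
    return {
        url: url.toString(),
        options: {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(input),
        },
    };
}

// Usage (starts a real, billable run):
// const { url, options } = buildRunSyncRequest(process.env.APIFY_TOKEN, {
//     startUrls: [{ url: 'https://www.centris.ca/en/real-estate-brokers' }],
//     maxItems: 20,
// });
// const response = await fetch(url, options);
// const items = await response.json();
```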
