# Work Scraper (`trudax/work-scraper`) Actor

Extract data from the top freelancing websites such as Upwork. Search by URL or search terms, filter by categories, English level, and hourly rate. Get info about freelancers and agencies without login. Download your data as an HTML table, JSON, CSV, Excel, or XML.

- **URL**: https://apify.com/trudax/work-scraper.md
- **Developed by:** [Trudax](https://apify.com/trudax) (community)
- **Categories:** Jobs, Automation, Lead generation
- **Stats:** 772 total users, 1 monthly users, 0.0% runs succeeded, 16 bookmarks
- **User rating**: No ratings yet

## Pricing

$40.00/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### What does Work Scraper do?

Work Scraper enables you to extract data from the top freelancing websites. It allows you to extract info about freelancers, jobs, and agencies without login.
Here is a list of supported websites:
- Upwork

### Why scrape Work?

If you're a freelancer, you might use it to keep track of the competition or identify new opportunities. You can also automatically apply to jobs by integrating with [this other Actor](https://apify.com/big-brain.io/upwork-application) and passing it the `applyUrl`. Check the [integration documentation](https://docs.apify.com/integrations) for more details.

If you're an employer, you could gather data on potential freelancers, find your competitors on the platform, or make sure that your projects are targeted so as to attract the best talent.

### How much will it cost to scrape?

Work Scraper is efficient and cheap to run. It will only cost approx. $0.25 per 100 requests on top of the monthly rental fee.
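As a rough illustration of these figures, the sketch below estimates a monthly bill from the $40.00 rental fee and the approximate $0.25 per 100 requests. The helper name `estimate_monthly_cost` and the linear pricing model are assumptions made for illustration; real platform usage depends on your Apify plan and on how the run behaves.

```python
# Rough cost estimate based on the figures quoted in this README.
# Assumption: usage cost scales linearly with the number of requests.
MONTHLY_RENTAL_USD = 40.00          # rental fee from the Pricing section
USAGE_USD_PER_100_REQUESTS = 0.25   # approximate usage cost quoted above


def estimate_monthly_cost(requests_per_month: int) -> float:
    """Return an estimated monthly cost in USD (hypothetical helper)."""
    if requests_per_month < 0:
        raise ValueError("requests_per_month must not be negative")
    usage = requests_per_month / 100 * USAGE_USD_PER_100_REQUESTS
    return round(MONTHLY_RENTAL_USD + usage, 2)


if __name__ == "__main__":
    # Estimate for 1,000 requests in a month.
    print(estimate_monthly_cost(1_000))
```

Treat the result as an order-of-magnitude guide rather than a quote.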

### How to scrape

You can fill in the search inputs in the Actor or, for a more complex search, copy the URL generated by your search on the original website and use it as a start URL (`startUrls`) in the Actor input.
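The two approaches can be sketched as a small input builder. The field names (`startUrls`, `useBuiltInSearch`, `searchFor`, `search`, `maxItems`, `proxy`) come from the input schema in this document; the helper `build_input` and its defaults are hypothetical and only show one way to keep the two modes apart.

```python
from typing import Any, Dict, Optional


def build_input(
    search: Optional[str] = None,
    start_url: Optional[str] = None,
    search_for: str = "talent",
    max_items: int = 10,
) -> Dict[str, Any]:
    """Build an Actor input from search terms or from a copied search URL."""
    if (search is None) == (start_url is None):
        raise ValueError("Provide exactly one of `search` or `start_url`")

    actor_input: Dict[str, Any] = {
        "maxItems": max_items,
        "proxy": {"useApifyProxy": True},
    }
    if start_url is not None:
        # Complex search: reuse the URL copied from the website.
        actor_input["startUrls"] = [start_url]
    else:
        # Simple search: let the Actor build the query from its own fields.
        actor_input["useBuiltInSearch"] = True
        actor_input["searchFor"] = search_for
        actor_input["search"] = search
    return actor_input


if __name__ == "__main__":
    print(build_input(search="python developer"))
    print(build_input(start_url="https://www.upwork.com/search/profiles/"))
```

Keeping the modes mutually exclusive avoids sending both a start URL and built-in search fields, since the schema describes the built-in search as ignoring `startUrls`.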

### Example results

The output of Work Scraper is stored in a dataset and will look something like this:

Freelancer profile:

```json
{
    "name": "John Dow",
    "location": "St. John's - Canada",
    "locality": "St. John's",
    "country": "Canada",
    "title": "Blockchain Developer",
    "description": "I believe highly in perfection in my work.  I have written short articles, reviews, as well as blog posts for different companies using WordPress and have done website testing as well. I am a gifted technical writer and article spinner.  I have also been a ghostwriter for multiple clients on a variety of both fiction and non-fiction writing.  I also do data entry on a daily basis into Excel books and am responsible for payroll at my full time job.  I have excellent communication skills and work as an administrative assistant on a full time basis.  I understand the need for quality work and communication to get the job done right!",
    "jobSuccess": "100%",
    "hourlyRate": "100.00",
    "totalHours": "835",
    "totalJobs": "20",
    "stats": [
        "20\nTotal Jobs",
        "835\nTotal Hours",
        "20 Total Jobs",
        "835 Total Hours",
        "20 Total Jobs",
        "835 Total Hours"
    ],
    "profileUrl": "https://www.website.com/freelancers/XXXXXX"
}
````

Job:

```json
{
    "title": "Job title",
    "description": "Job description",
    "jobType": "Hourly: $45.00",
    "contractorTier": "Intermediate",
    "skills": "Database\nDatabase Maintenance\nWeb Service\nJava\nGit\nSQLite\nCSS\nWeb Development\nHTML",
    "createdAt": "15 minutes ago",
    "scrapedAt": "2022-08-16T13:50:20.995Z",
    "url": "https://www.website.com/freelance-jobs/apply/XXXX/",
    "applyUrl": "https://www.website.com/ab/proposals/job/XXXX/apply/#/"
}
```
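Most values in these records are strings, including numbers such as `hourlyRate` and newline-separated lists such as `skills`. The following sketch shows one possible way to normalize them after download. The helper names are hypothetical, and the parsing rules are assumptions based only on the sample records above.

```python
import re
from typing import Any, Dict, List, Optional


def parse_number(text: Optional[str]) -> Optional[float]:
    """Extract the first number from strings such as "100%" or "Hourly: $45.00"."""
    if not text:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def split_skills(item: Dict[str, Any]) -> List[str]:
    """Turn the newline-separated `skills` string of a job into a list."""
    skills = item.get("skills") or ""
    return [skill.strip() for skill in skills.split("\n") if skill.strip()]


def normalize_freelancer(item: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a freelancer record with numeric fields as floats."""
    normalized = dict(item)
    for field in ("jobSuccess", "hourlyRate", "totalHours", "totalJobs"):
        normalized[field] = parse_number(item.get(field))
    return normalized


if __name__ == "__main__":
    job = {"jobType": "Hourly: $45.00", "skills": "Database\nJava\nGit"}
    print(parse_number(job["jobType"]), split_skills(job))
```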

### Extend output function

You can use this function to update the result output of this Actor. You can choose what data from the page you want to scrape. The output of this function is merged with the result output.

The return value of this function has to be an object!

You can return fields to achieve three different things:

- Add a new field - Return an object with a field that is not in the result output
- Change a field - Return an existing field with a new value
- Remove a field - Return an existing field with the value `undefined`

```js
async () => {
    return {
        pageTitle: document.querySelector("title").innerText,
    };
};
```

This example will add the title of the page to the final object:

```json
{
    "name": "John Dow",
    "location": "St. John's - Canada",
    "locality": "St. John's",
    "country": "Canada",
    "title": "Blockchain Developer",
    "description": "I believe highly in perfection in my work.  I have written short articles, reviews, as well as blog posts for different companies using WordPress and have done website testing as well. I am a gifted technical writer and article spinner.  I have also been a ghostwriter for multiple clients on a variety of both fiction and non-fiction writing.  I also do data entry on a daily basis into Excel books and am responsible for payroll at my full time job.  I have excellent communication skills and work as an administrative assistant on a full time basis.  I understand the need for quality work and communication to get the job done right!",
    "jobSuccess": "100%",
    "hourlyRate": "100.00",
    "totalHours": "835",
    "totalJobs": "20",
    "stats": [
        "20\nTotal Jobs",
        "835\nTotal Hours",
        "20 Total Jobs",
        "835 Total Hours",
        "20 Total Jobs",
        "835 Total Hours"
    ],
    "profileUrl": "https://www.website.com/freelancers/XXXXXX",
    "pageTitle": "John Doe - Fast, Friendly, Reliable!"
}
```
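The add, change, and remove rules described in this section can be modeled with a short merge routine. This is only an illustration of the documented behavior, not the Actor's actual implementation: the function `merge_output` is hypothetical, and because Python has no `undefined`, a sentinel named `UNDEFINED` stands in for it.

```python
from typing import Any, Dict

# Sentinel standing in for JavaScript's `undefined`, which Python lacks.
UNDEFINED = object()


def merge_output(result: Dict[str, Any], extension: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the extend-output object into a scraped result."""
    merged = dict(result)
    for key, value in extension.items():
        if value is UNDEFINED:
            merged.pop(key, None)   # remove a field
        else:
            merged[key] = value     # add a new field or change an existing one
    return merged


if __name__ == "__main__":
    scraped = {"name": "John Dow", "title": "Blockchain Developer", "stats": []}
    extension = {
        "pageTitle": "John Doe - Fast, Friendly, Reliable!",  # add
        "title": "Senior Blockchain Developer",                # change
        "stats": UNDEFINED,                                    # remove
    }
    print(merge_output(scraped, extension))
```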

### Personal data

You should be aware that your results might contain personal data. Personal data is protected by GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers. You can also read our blog post on the [legality of web scraping](https://blog.apify.com/is-web-scraping-legal/).

# Actor input Schema

## `startUrls` (type: `array`):

URLs to start with.

## `maxItems` (type: `integer`):

How many search results should be processed

## `useLogin` (type: `boolean`):

Enable or disable login

## `upworkLogin` (type: `string`):

Your email to access your Upwork account

## `upworkPassword` (type: `string`):

Your Upwork account's password

## `securityQuestion` (type: `string`):

Upwork may ask your security question to make sure it is you accessing your account

## `useBuiltInSearch` (type: `boolean`):

Use the fields below to perform a search and scrape the result

## `searchFor` (type: `string`):

Select what type of results you want. For more advanced searches, create your search on the Upwork website and copy the URL to use it in the scraper.

## `search` (type: `string`):

Words to be searched

## `category` (type: `string`):

Select a category to filter

## `englishLevel` (type: `string`):

Select the English level required

## `hourlyRate` (type: `string`):

Select an hourly rate

## `jobSuccess` (type: `string`):

Select a job success rate

## `earnedAmount` (type: `string`):

Select a minimum earned amount

## `billedAmount` (type: `string`):

Select a minimum number of billed hours

## `rhrs` (type: `boolean`):

Billed in the last 6 months only

## `extendOutputFunction` (type: `string`):

Here you can write your custom JavaScript code to extract custom data from the page.

## `proxy` (type: `object`):

Select proxies to be used by your crawler.

## `debugMode` (type: `boolean`):

Display detailed logs and error messages

## Actor input object example

```json
{
  "startUrls": [
    "https://www.upwork.com/search/profiles/"
  ],
  "maxItems": 1,
  "useLogin": false,
  "searchFor": "talent",
  "extendOutputFunction": "async () => {\n  return { timestamp: Date.now() }\n\n}",
  "proxy": {
    "useApifyProxy": true
  }
}
```
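Several of the string fields accept only a fixed set of values, which are listed as `enum` entries in the OpenAPI specification in the API section. A minimal pre-flight check, assuming those enumerations are current, might look like the sketch below. The helper `validate_input` is hypothetical and covers only the enumerated fields and `maxItems`.

```python
from typing import Any, Dict, List

# Allowed values copied from the `enum` lists in the OpenAPI specification.
ALLOWED_VALUES = {
    "searchFor": {"talent", "freelancer", "agency", "job", "project"},
    "englishLevel": {"0", "1", "2", "3", "4"},
    "hourlyRate": {"", "0-10", "10-30", "30-60", "60"},
    "jobSuccess": {"", "80", "90"},
    "earnedAmount": {"", "1", "100", "1000", "10000", "0"},
    "billedAmount": {"", "1", "100", "1000"},
}


def validate_input(actor_input: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    for field, allowed in ALLOWED_VALUES.items():
        if field not in actor_input:
            continue
        value = actor_input[field]
        # The schema declares these fields as strings, so 60 is not the same as "60".
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    max_items = actor_input.get("maxItems")
    if max_items is not None and (isinstance(max_items, bool) or not isinstance(max_items, int)):
        problems.append("maxItems: expected an integer")
    return problems


if __name__ == "__main__":
    print(validate_input({"searchFor": "talent", "maxItems": 1}))
    print(validate_input({"searchFor": "gigs", "hourlyRate": 60}))
```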

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        "https://www.upwork.com/search/profiles/"
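    ],
    "maxItems": 1,
    "searchFor": "talent",
    // Pass the function as a string: the input schema expects a string,
    // and a real function would be dropped during JSON serialization
    "extendOutputFunction": `async () => {
  return { timestamp: Date.now() }
}`,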
    "proxy": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("trudax/work-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": ["https://www.upwork.com/search/profiles/"],
    "maxItems": 1,
    "searchFor": "talent",
    "extendOutputFunction": """async () => {
  return { timestamp: Date.now() }

}""",
    "proxy": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("trudax/work-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    "https://www.upwork.com/search/profiles/"
  ],
  "maxItems": 1,
  "searchFor": "talent",
  "extendOutputFunction": "async () => {\\n  return { timestamp: Date.now() }\\n\\n}",
  "proxy": {
    "useApifyProxy": true
  }
}' |
apify call trudax/work-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=trudax/work-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Work Scraper",
        "description": "Extract data from the top freelancing websites such as Upwork. Search by URL or search terms, filter by categories, English level, and hourly rate. Get info about freelancers and agencies without login. Download your data as an HTML table, JSON, CSV, Excel, or XML.",
        "version": "1.10",
        "x-build-id": "7PrwB9K2rYOlc7bmk"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/trudax~work-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-trudax-work-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/trudax~work-scraper/runs": {
            "post": {
                "operationId": "runs-sync-trudax-work-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/trudax~work-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-trudax-work-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "URLs to start with.",
                        "default": [
                            "https://www.upwork.com/search/profiles/"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Search results limit",
                        "type": "integer",
                        "description": "How many search results should be processed",
                        "default": 1
                    },
                    "useLogin": {
                        "title": "Perform logged run",
                        "type": "boolean",
                        "description": "Enable or disable login",
                        "default": false
                    },
                    "upworkLogin": {
                        "title": "Login",
                        "type": "string",
                        "description": "Your email to access your Upwork account"
                    },
                    "upworkPassword": {
                        "title": "Password",
                        "type": "string",
                        "description": "Your Upwork account's password"
                    },
                    "securityQuestion": {
                        "title": "Answer for the security question",
                        "type": "string",
                        "description": "Upwork may ask your security question to make sure it is you accessing your account"
                    },
                    "useBuiltInSearch": {
                        "title": "Use built-in search and ignore startUrls",
                        "type": "boolean",
                        "description": "Use the fields below to perform a search and scrape the result"
                    },
                    "searchFor": {
                        "title": "Search for",
                        "enum": [
                            "talent",
                            "freelancer",
                            "agency",
                            "job",
                            "project"
                        ],
                        "type": "string",
                        "description": "Select what type of results you want. For more advanced searches, create your search on the Upwork website and copy the URL to use it in the scraper.",
                        "default": "talent"
                    },
                    "search": {
                        "title": "Search",
                        "type": "string",
                        "description": "Words to be searched"
                    },
                    "category": {
                        "title": "Category",
                        "enum": [
                            "",
                            "531770282584862721",
                            "531770282580668416",
                            "531770282580668417",
                            "531770282580668420",
                            "531770282580668421",
                            "531770282584862722",
                            "531770282580668419",
                            "531770282584862723",
                            "531770282580668422",
                            "531770282584862720",
                            "531770282580668418",
                            "531770282580668423"
                        ],
                        "type": "string",
                        "description": "Select a category to filter"
                    },
                    "englishLevel": {
                        "title": "English Level",
                        "enum": [
                            "0",
                            "1",
                            "2",
                            "3",
                            "4"
                        ],
                        "type": "string",
                        "description": "Select the English level required"
                    },
                    "hourlyRate": {
                        "title": "Hourly Rate",
                        "enum": [
                            "",
                            "0-10",
                            "10-30",
                            "30-60",
                            "60"
                        ],
                        "type": "string",
                        "description": "Select an hourly rate"
                    },
                    "jobSuccess": {
                        "title": "Job Success",
                        "enum": [
                            "",
                            "80",
                            "90"
                        ],
                        "type": "string",
                        "description": "Select a job success rate"
                    },
                    "earnedAmount": {
                        "title": "Earned Amount",
                        "enum": [
                            "",
                            "1",
                            "100",
                            "1000",
                            "10000",
                            "0"
                        ],
                        "type": "string",
                        "description": "Select a minimum earned amount"
                    },
                    "billedAmount": {
                        "title": "Hours Billed",
                        "enum": [
                            "",
                            "1",
                            "100",
                            "1000"
                        ],
                        "type": "string",
                        "description": "Select a minimum number of billed hours"
                    },
                    "rhrs": {
                        "title": "Recent billed hours",
                        "type": "boolean",
                        "description": "Billed in the last 6 months only"
                    },
                    "extendOutputFunction": {
                        "title": "Extended Output Function",
                        "type": "string",
                        "description": "Here you can write your custom JavaScript code to extract custom data from the page."
                    },
                    "proxy": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Select proxies to be used by your crawler.",
                        "default": {
                            "useApifyProxy": true
                        }
                    },
                    "debugMode": {
                        "title": "Enable debug mode",
                        "type": "boolean",
                        "description": "Display detailed logs and error messages"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
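For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be reached with the Python standard library alone. The sketch below builds the request from the server URL, path, and `token` query parameter given in the specification. The helper names are hypothetical, and decoding the response as JSON is an assumption, since the specification does not describe the response body.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/trudax~work-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, actor_input: Dict[str, Any]) -> Any:
    """Send the request and decode the response (performs a network call).

    Assumption: the response body is JSON.
    """
    with urllib.request.urlopen(build_run_request(token, actor_input)) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_run_request("<YOUR_API_TOKEN>", {"maxItems": 1, "searchFor": "talent"})
    print(request.get_method(), request.full_url)
```

Separating request construction from sending makes the URL and payload easy to inspect or log before any usage is billed.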
