# Lead Formatter Tool for Cold Email (`parseforge/lead-formatter`) Actor

Clean, format, and enhance lead data using AI for cold email campaigns and lead management. Automatically standardizes names, companies, and job titles using advanced language models. Perfect for sales teams, marketers, and businesses that need properly formatted lead data for professional outreach.

- **URL**: https://apify.com/parseforge/lead-formatter.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Lead generation, AI, Automation
- **Stats:** 10 total users, 0 monthly users, 100.0% runs succeeded, 2 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 📋 Lead Formatter

> 🚀 **Clean and standardize your lead lists in seconds.** Upload a CSV or JSON file and get deduplicated, formatted leads with blank removal, job title standardization, and consistent formatting. No coding, no CRM setup required.

> 🕒 **Last updated:** 2026-04-23 · **📋 Auto-deduplication** · **🔄 Blank removal** · **🏷️ Title formatting** · **🚫 No coding** needed

The **Lead Formatter** cleans messy lead data by removing blank rows, deduplicating entries, and standardizing job titles. Upload a CSV or JSON file and get back a clean, formatted version ready for CRM import or outreach campaigns.

Built for sales teams, marketing ops, recruiters, and anyone who needs to clean up lead lists before importing to a CRM.

| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Sales teams, marketing ops, recruiters, BD teams, data ops, CRM admins | Lead cleaning, deduplication, CRM import prep, list standardization, outreach list hygiene |

---

### 📋 What the Lead Formatter does

Four data-cleaning operations:

- 🗑️ **Remove blanks.** Strip leads that have no data in any field.
- 🔄 **Remove duplicates.** Deduplicate by email address, keeping the first occurrence.
- 🏷️ **Format job titles.** Standardize title casing and abbreviations.
- 📋 **Consistent output.** Clean, structured CSV or JSON ready for CRM import.

> 💡 **Why it matters:** importing messy lead data into your CRM creates duplicates, wastes outreach budget, and clutters pipelines. This Actor cleans and standardizes your list in seconds.

---

### 🎬 Full Demo

_🚧 Coming soon: a 3-minute walkthrough showing how to clean a lead list._

---

### ⚙️ Input

<table>
<thead>
<tr><th>Input</th><th>Type</th><th>Default</th><th>Behavior</th></tr>
</thead>
<tbody>
<tr><td>leadsFile</td><td>string</td><td>""</td><td>URL of your leads file (CSV, JSON, or XLSX). Required.</td></tr>
<tr><td>removeBlanks</td><td>boolean</td><td>false</td><td>Remove leads that have no data in any field.</td></tr>
<tr><td>removeDuplicates</td><td>boolean</td><td>false</td><td>Remove duplicate leads by email address, keeping the first occurrence.</td></tr>
<tr><td>formatJobTitle</td><td>boolean</td><td>true</td><td>Standardize job title formatting.</td></tr>
</tbody>
</table>

**Example: clean a lead list with all options.**

```json
{
    "leadsFile": "https://example.com/leads.csv",
    "removeBlanks": true,
    "removeDuplicates": true,
    "formatJobTitle": true
}
````

***

### 📊 Output

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 👤 name | string | `"John Smith"` |
| 📧 email | string | `"john@example.com"` |
| 🏷️ jobTitle | string | `"Vice President of Sales"` |
| 🏢 company | string | `"Acme Corp"` |
| 📞 phone | string | `"(555) 123-4567"` |
| ✅ isDuplicate | boolean | `false` |
| 🕒 processedAt | ISO 8601 | `"2026-04-16T00:00:00.000Z"` |

#### 📦 Sample records

<details>
<summary><strong>✅ Clean lead after formatting</strong></summary>

```json
{
    "name": "John Smith",
    "email": "john@example.com",
    "jobTitle": "Vice President of Sales",
    "company": "Acme Corp",
    "phone": "(555) 123-4567",
    "isDuplicate": false,
    "processedAt": "2026-04-16T00:00:00.000Z"
}
```

</details>

<details>
<summary><strong>🔄 Duplicate removed</strong></summary>

```json
{
    "name": "Jane Doe",
    "email": "jane@example.com",
    "jobTitle": "Director of Marketing",
    "company": "Beta Inc",
    "phone": null,
    "isDuplicate": true,
    "processedAt": "2026-04-16T00:00:05.000Z"
}
```

</details>

<details>
<summary><strong>🏷️ Job title standardized</strong></summary>

```json
{
    "name": "Alex Chen",
    "email": "alex@startup.io",
    "jobTitle": "Chief Technology Officer",
    "company": "Startup.io",
    "phone": "(555) 987-6543",
    "isDuplicate": false,
    "processedAt": "2026-04-16T00:00:10.000Z"
}
```

</details>

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 🗑️ | **Blank removal.** Auto-strip leads that have no data in any field. |
| 🔄 | **Deduplication.** Remove duplicate entries automatically. |
| 🏷️ | **Title formatting.** Standardize job title casing and abbreviations. |
| 📋 | **CRM-ready output.** Clean CSV or JSON ready for import. |
| ⚡ | **Fast.** Process thousands of leads in seconds. |
| 🚫 | **No coding.** Upload and configure with checkboxes. |

***

### 📈 How it compares to alternatives

| Approach | Cost | Dedup | Title format | Speed | Setup |
|---|---|---|---|---|---|
| **⭐ Lead Formatter** *(this Actor)* | $5 free credit, then pay-per-use | Yes | Yes | Seconds | ⚡ 2 min |
| Excel manual cleanup | Free | Manual | Manual | Hours | N/A |
| Paid CRM enrichment tools | $50-500/month | Yes | Some | Minutes | ⏳ Hours |
| Custom scripts | Free | As coded | As coded | Varies | 🐢 Hours |

Pick this Actor when you want one-click lead list cleaning without Excel formulas or custom scripts.

***

### 🚀 How to use

1. 📝 **Sign up.** [Create a free account with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp) (takes 2 minutes).
2. 🌐 **Open the Actor.** Go to the Lead Formatter page on the Apify Store.
3. 🎯 **Set input.** Upload your leads file and toggle cleaning options.
4. 🚀 **Run it.** Click **Start**.
5. 📥 **Download.** Grab your cleaned leads from the **Dataset** tab.

> ⏱️ Total time: **2-3 minutes.** No coding required.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%" valign="top">

#### 📈 Sales & BD

- Clean prospect lists before CRM import
- Deduplicate across multiple lead sources
- Standardize titles for segmentation
- Prepare outreach lists for campaigns

</td>
<td width="50%" valign="top">

#### 📊 Marketing & Ops

- Clean event attendee lists
- Standardize webinar registrations
- Prepare newsletter imports
- Merge and deduplicate lead sources

</td>
</tr>
<tr>
<td width="50%" valign="top">

#### 🏢 Recruiting & HR

- Clean candidate databases
- Standardize job titles across sources
- Deduplicate applicant lists
- Prepare talent pool imports

</td>
<td width="50%" valign="top">

#### 🛠️ Data Operations

- Automate lead hygiene workflows
- Build ETL pipelines with clean data
- Standardize data from multiple scrapers
- Maintain CRM data quality

</td>
</tr>
</table>

***

### 🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Empirical datasets for papers, thesis work, and coursework
- Longitudinal studies tracking changes across snapshots
- Reproducible research with cited, versioned data pulls
- Classroom exercises on data analysis and ethical scraping

</td>
<td width="50%">

#### 🎨 Personal and creative

- Side projects, portfolio demos, and indie app launches
- Data visualizations, dashboards, and infographics
- Content research for bloggers, YouTubers, and podcasters
- Hobbyist collections and personal trackers

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Transparency reporting and accountability projects
- Advocacy campaigns backed by public-interest data
- Community-run databases for local issues
- Investigative journalism on public records

</td>
<td width="50%">

#### 🧪 Experimentation

- Prototype AI and machine-learning pipelines with real data
- Validate product-market hypotheses before engineering spend
- Train small domain-specific models on niche corpora
- Test dashboard concepts with live input

</td>
</tr>
</table>

### 🤖 Ask an AI assistant about this Actor

Open a ready-to-send prompt about this ParseForge Actor in the AI assistant of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Lead%20Formatter%20Tool%20for%20Cold%20Email%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Lead%20Formatter%20Tool%20for%20Cold%20Email%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Lead%20Formatter%20Tool%20for%20Cold%20Email%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Lead%20Formatter%20Tool%20for%20Cold%20Email%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

### ❓ Frequently Asked Questions

<details>
<summary><b>💳 Do I need a paid Apify plan to run this Actor?</b></summary>

No. You can start right now on the free Apify plan, which includes **$5 in free monthly credit**. That is enough to run this Actor several times and explore the output before committing to anything. Paid plans unlock higher limits, more concurrent runs, and larger datasets. [Create a free Apify account here](https://console.apify.com/sign-up?fpr=vmoqkp) to get started.

</details>

<details>
<summary><b>🚨 What happens if my run fails or returns no results?</b></summary>

Failed runs are not charged. If the source site changes, proxies get rate-limited, or a specific input matches nothing, re-run the Actor or open our [contact form](https://tally.so/r/BzdKgA) and we will investigate. You can also check the run log in the Apify Console to see why the run stopped.

</details>

<details>
<summary><b>📏 How many items can I scrape per run?</b></summary>

Free users are limited to **10 items per run** so you can preview the output and confirm the Actor works for your use case. Paid users can raise `maxItems` up to **1,000,000** per run. [Upgrade here](https://console.apify.com/sign-up?fpr=vmoqkp) if you need full scale.

</details>

<details>
<summary><b>🕒 How fresh is the data?</b></summary>

Every run fetches live data at the moment of execution. There is no cache or delay: the records you get reflect what the source returned at that moment. Schedule the Actor to maintain a rolling snapshot of the data you need.

</details>

<details>
<summary><b>🧑‍💻 Can I call this Actor from my own code?</b></summary>

Yes. Apify exposes every Actor as a REST endpoint and ships official API clients for [JavaScript/TypeScript](https://docs.apify.com/api/client/js) and [Python](https://docs.apify.com/api/client/python). You can start a run, read the dataset, and handle webhooks from your own app in a few lines. All you need is your Apify API token.

</details>

<details>
<summary><b>📤 How do I export the data?</b></summary>

Every Apify dataset can be downloaded in one click from the console as CSV, JSON, JSONL, Excel, HTML, XML, or RSS. You can also pull results programmatically via the [Apify API](https://docs.apify.com/api/v2) or stream them into BigQuery, S3, and other destinations through built-in integrations.

</details>

<details>
<summary><b>📅 Can I schedule the Actor to run automatically?</b></summary>

Yes. Use the Apify scheduler to run the Actor on any cadence, from hourly to monthly. Results are saved to your dataset and can be delivered to webhooks, email, Slack, cloud storage, or automation tools such as Zapier and Make.

</details>

***

### 🔌 Automating Lead Formatter

- 🟢 **Node.js.** Install the apify-client NPM package.
- 🐍 **Python.** Use the apify-client PyPI package.
- 📚 See the [Apify API documentation](https://docs.apify.com/api/v2) for full details.

### 🔌 Integrate with any app

- [**Make**](https://docs.apify.com/platform/integrations/make) - Automate workflows
- [**Zapier**](https://docs.apify.com/platform/integrations/zapier) - Connect 5,000+ apps
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - Get notifications
- [**Airbyte**](https://docs.apify.com/platform/integrations/airbyte) - Data pipelines
- [**GitHub**](https://docs.apify.com/platform/integrations/github) - Trigger from commits
- [**Google Drive**](https://docs.apify.com/platform/integrations/drive) - Export to Sheets

***

### 🔗 Recommended Actors

- [**💼 LinkedIn Jobs Scraper**](https://apify.com/parseforge/linkedin-jobs-scraper) - Job listings
- [**📄 CV Optimizer**](https://apify.com/parseforge/cv-optimizer) - Resume optimization
- [**📢 Facebook Ads Library Scraper**](https://apify.com/parseforge/facebook-ads-library-scraper) - Ad intelligence
- [**💼 HubSpot Marketplace Scraper**](https://apify.com/parseforge/hubspot-marketplace-scraper) - App data
- [**📝 HTML to JSON Smart Parser**](https://apify.com/parseforge/html-to-json-smart-parser) - Data extraction

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more data processing tools.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) to request a new tool, propose a custom project, or report an issue.

***

> **⚠️ Disclaimer:** this Actor is an independent data processing tool. All trademarks mentioned are the property of their respective owners.

# Actor input Schema

## `leadsFile` (type: `string`):

Upload your leads file containing contact information to format and enrich. Supported formats: JSON (array of objects), XLSX/Excel (first sheet used), CSV. The tool automatically detects column names (case-insensitive) and supports variations like 'First Name', 'first\_name', 'firstName', etc. Required fields: email. Optional fields: firstName, lastName, companyName, jobTitle, country, website, personalPhone, homePhone, domain.
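The column-name detection described above can be pictured with a small sketch. This is not the Actor's implementation; it only assumes that headers are matched case-insensitively after spaces, underscores, and other punctuation are dropped. The helper names `squash` and `normalize_headers` are hypothetical.

```python
import re

# Canonical field names taken from the leadsFile description above.
CANONICAL_FIELDS = [
    "email", "firstName", "lastName", "companyName", "jobTitle",
    "country", "website", "personalPhone", "homePhone", "domain",
]


def squash(name: str) -> str:
    """Lowercase a header and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Lookup from squashed header to canonical field name.
LOOKUP = {squash(field): field for field in CANONICAL_FIELDS}


def normalize_headers(row: dict) -> dict:
    """Rename recognized columns to canonical names; keep unknown columns as-is."""
    return {LOOKUP.get(squash(col), col): value for col, value in row.items()}


row = {"First Name": "Ann", "EMAIL": "ann@example.com", "Notes": "met at expo"}
print(normalize_headers(row))
```

Under this assumption, 'First Name', 'first_name', and 'firstName' all collapse to the same key, which is why the three variants are treated as one column.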

## `removeBlanks` (type: `boolean`):

Automatically remove leads that have no data in any field. This helps clean your dataset by filtering out completely empty records. Useful when processing large lists with incomplete data. Default: false (keeps all leads).
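A minimal sketch of what "no data in any field" could mean in practice, assuming a lead is a dictionary and that `None` and whitespace-only strings both count as empty. The function names are hypothetical and the Actor's own rules may differ.

```python
def is_blank(lead: dict) -> bool:
    """True when every value is None or contains only whitespace."""
    return all(value is None or str(value).strip() == "" for value in lead.values())


def remove_blanks(leads: list) -> list:
    """Keep only leads that have data in at least one field."""
    return [lead for lead in leads if not is_blank(lead)]


leads = [
    {"email": "", "firstName": None},          # completely empty, dropped
    {"email": "ann@example.com", "firstName": ""},  # partially filled, kept
]
print(remove_blanks(leads))
```

Note that a partially filled lead survives this filter; only records with nothing in them are removed.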

## `removeDuplicates` (type: `boolean`):

Remove duplicate leads based on email address. If multiple leads have the same email, only the first occurrence is kept. This ensures each contact appears only once in your final dataset. Default: false (keeps all leads including duplicates).
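The "first occurrence wins" rule can be sketched as follows. Two details are assumptions of this sketch, not documented behavior: emails are compared after trimming and lowercasing, and leads without an email are kept untouched.

```python
def remove_duplicates(leads: list) -> list:
    """Keep the first lead for each email address, preserving input order."""
    seen = set()
    unique = []
    for lead in leads:
        # Assumption: compare emails case-insensitively and ignore padding.
        email = (lead.get("email") or "").strip().lower()
        if email and email in seen:
            continue  # a lead with this email was already kept
        if email:
            seen.add(email)
        unique.append(lead)
    return unique


leads = [
    {"email": "Ann@Example.com", "source": "webinar"},
    {"email": "ann@example.com ", "source": "expo"},
    {"email": "bo@example.com", "source": "expo"},
]
print(remove_duplicates(leads))
```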

## `formatJobTitle` (type: `boolean`):

Format job titles with proper capitalization and professional standards. Examples: 'ceo' → 'CEO', 'vice president of sales' → 'Vice President of Sales', 'cto' → 'CTO'. Keeps abbreviations uppercase and applies title case formatting. Default: true (recommended for professional appearance).
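The examples above suggest a rule of the form "uppercase known abbreviations, title-case everything else". The sketch below reproduces those three examples under that assumption; the abbreviation and small-word lists are illustrative, not the Actor's actual lists.

```python
# Illustrative lists; the Actor's real lists are not documented.
ABBREVIATIONS = {"ceo", "cto", "cfo", "coo", "vp", "hr"}
SMALL_WORDS = {"of", "and", "the", "for", "in", "at", "to"}


def format_job_title(title: str) -> str:
    """Title-case a job title, keeping abbreviations uppercase."""
    formatted = []
    for index, word in enumerate(title.strip().split()):
        lower = word.lower()
        if lower in ABBREVIATIONS:
            formatted.append(lower.upper())
        elif index > 0 and lower in SMALL_WORDS:
            formatted.append(lower)  # small words stay lowercase unless first
        else:
            formatted.append(lower.capitalize())
    return " ".join(formatted)


for raw in ["ceo", "vice president of sales", "cto"]:
    print(raw, "->", format_job_title(raw))
```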

## `formatCompanyName` (type: `boolean`):

Format company names with proper capitalization and remove business suffixes (Inc., LLC, Corp., Ltd.) for cold email personalization. Examples: 'apple inc' → 'Apple', 'microsoft corporation' → 'Microsoft'. This makes company names more personal and natural for email outreach. Default: true (recommended for cold emails).
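As a rough illustration of suffix removal, the sketch below drops trailing suffix words and capitalizes what remains. The suffix set is an assumption built from the suffixes and examples listed above; words that already contain an uppercase letter (such as "IBM") are left as typed, which is a design choice of this sketch.

```python
# Assumed suffix set, based on the suffixes and examples named above.
SUFFIXES = {"inc", "llc", "corp", "corporation", "ltd"}


def format_company_name(name: str) -> str:
    """Strip trailing business suffixes and capitalize the remaining words."""
    words = name.strip().split()
    # Drop suffix words from the end, ignoring case and trailing punctuation.
    while len(words) > 1 and words[-1].lower().strip(".,") in SUFFIXES:
        words.pop()
    cleaned = []
    for word in words:
        word = word.strip(",")
        # Leave mixed-case or all-caps words alone; capitalize lowercase ones.
        cleaned.append(word if any(c.isupper() for c in word) else word.capitalize())
    return " ".join(cleaned)


for raw in ["apple inc", "microsoft corporation", "Acme Widgets, LLC."]:
    print(raw, "->", format_company_name(raw))
```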

## `formatPersonalPhone` (type: `boolean`):

Standardize personal phone numbers to a consistent format. Converts various formats (e.g., '123-456-7890', '(123) 456-7890', '1234567890') to E.164 international format (e.g., '+11234567890') or local format. Ensures phone numbers are properly formatted for CRM systems and international dialing. Default: true.
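The E.164 conversion in the example can be sketched with plain string handling. This version assumes a default country code of `1` for ten-digit numbers, which matches the '+11234567890' example but is otherwise an assumption; the function name `to_e164` is hypothetical, and a production pipeline would more likely rely on a dedicated phone-number library.

```python
import re
from typing import Optional


def to_e164(raw: str, default_country_code: str = "1") -> Optional[str]:
    """Return the number in E.164 form, or None when it cannot be normalized."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if raw.strip().startswith("+"):
        candidate = digits  # caller already supplied a country code
    elif len(digits) == 10:
        candidate = default_country_code + digits
    elif len(digits) == 11 and digits.startswith(default_country_code):
        candidate = digits
    else:
        return None
    # E.164 allows at most 15 digits after the plus sign.
    return "+" + candidate if 0 < len(candidate) <= 15 else None


for raw in ["123-456-7890", "(123) 456-7890", "1234567890"]:
    print(raw, "->", to_e164(raw))
```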

## `formatHomePhone` (type: `boolean`):

Standardize home phone numbers to a consistent format. Converts various formats to E.164 international format or local format. Ensures phone numbers are properly formatted for CRM systems. Default: true.

## `generateCompanyDescription` (type: `boolean`):

Generates a brief AI-written company description from the available company information (name, website, domain). Useful for enriching your lead database with company context. The description helps you understand what each company does before reaching out. Default: false. Enabling this option turns on AI content generation and slightly increases cost.

## `generatePersonalizedLine` (type: `boolean`):

Generates an AI-written personalized opening line for cold emails from lead information (name, company, job title). These opening lines are tailored to each lead and can significantly improve email reply rates. Example: 'Hi John, I noticed you're the CEO at Apple - I'd love to discuss...'. Default: false. Enabling this option turns on AI content generation and slightly increases cost.

## `pullCaseStudy` (type: `boolean`):

Generates a brief AI-written case study or success story relevant to the company's industry. Useful for personalizing outreach with industry-specific examples and social proof. Helps establish credibility in your cold emails. Default: false. Enabling this option turns on AI content generation and slightly increases cost.

## `generateIndustry` (type: `boolean`):

Automatically identify the primary industry for each company. Returns 1-3 words (e.g., 'Technology', 'Healthcare', 'Financial Services'). Useful for segmentation, targeting, and personalization. Helps organize leads by industry for targeted campaigns. Default: false. Enabling this option turns on industry identification and slightly increases cost.

## `segmentBy` (type: `array`):

Group leads into segments based on specified fields. Each segment is saved to a separate dataset for easy download and targeted campaigns. Example: \['country', 'jobTitle', 'industry'] creates segments like 'country:USA | jobTitle:CEO | industry:Technology'. Each segment gets its own dataset that you can download separately. Useful for organizing leads by location, role, or industry for targeted outreach.
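The segment label format from the example can be reproduced with a short grouping sketch. It groups leads in memory only; saving each segment to its own dataset is left to the Actor. The `'unknown'` placeholder for missing values is an assumption of this sketch.

```python
from collections import defaultdict


def segment_key(lead: dict, fields: list) -> str:
    """Build a label such as 'country:USA | jobTitle:CEO'."""
    return " | ".join(f"{field}:{lead.get(field) or 'unknown'}" for field in fields)


def segment_leads(leads: list, fields: list) -> dict:
    """Group leads by their segment label, preserving input order per group."""
    segments = defaultdict(list)
    for lead in leads:
        segments[segment_key(lead, fields)].append(lead)
    return dict(segments)


leads = [
    {"email": "a@x.com", "country": "USA", "jobTitle": "CEO", "industry": "Technology"},
    {"email": "b@x.com", "country": "USA", "jobTitle": "CEO", "industry": "Technology"},
    {"email": "c@x.com", "country": "UK", "jobTitle": "CTO", "industry": "Healthcare"},
]
for label, members in segment_leads(leads, ["country", "jobTitle", "industry"]).items():
    print(label, len(members))
```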

## `sortBy` (type: `string`):

Sort all leads by a specific field before output. Enter the field name (e.g., 'companyName', 'country', 'jobTitle', 'industry', 'email'). Leads will be ordered alphabetically or numerically based on this field. Use with 'Sort Order' to control ascending/descending. Useful for organizing output in a specific order.

## `sortOrder` (type: `string`):

Control the sort direction when using 'Sort By'. 'asc' = ascending (A-Z, 0-9), 'desc' = descending (Z-A, 9-0). Default: 'asc' (alphabetical order).
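How `sortBy` and `sortOrder` combine can be sketched as below. This version compares values as case-insensitive strings only, so it does not cover the numeric ordering mentioned above, and its handling of missing values (last in ascending order, first in descending order) is an assumption.

```python
def sort_leads(leads: list, sort_by: str, sort_order: str = "asc") -> list:
    """Return a new list of leads ordered by one field."""
    if sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")

    def key(lead: dict):
        value = lead.get(sort_by)
        # Tuple key: missing values group together, text compares case-insensitively.
        return (value is None, str(value or "").lower())

    return sorted(leads, key=key, reverse=(sort_order == "desc"))


leads = [{"companyName": "beta"}, {"companyName": "Acme"}, {"companyName": None}]
print(sort_leads(leads, "companyName"))
print(sort_leads(leads, "companyName", "desc"))
```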

## `fieldConfigurations` (type: `object`):

Advanced: Customize how specific fields are processed and named in output. Each field can have: 'enabled' (boolean - enable/disable field), 'customPrompt' (string - max 100 chars - custom AI instruction), 'outputColumn' (string - custom column name in output). Example: {"companyDescription": {"customPrompt": "Based on company information identify industry make it less than 3 word, do not include dashes, do not include parenthesis", "outputColumn": "Company Description"}}. Useful for customizing AI behavior and mapping fields to your CRM column names.
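Because `fieldConfigurations` is a free-form object, it can help to check it locally before starting a run. The sketch below validates only the three options and the 100-character limit named above; the function name is hypothetical and the Actor may enforce different or additional rules.

```python
ALLOWED_OPTIONS = {"enabled", "customPrompt", "outputColumn"}
MAX_PROMPT_CHARS = 100  # limit stated in the fieldConfigurations description


def validate_field_configurations(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for field, options in config.items():
        for key in options:
            if key not in ALLOWED_OPTIONS:
                problems.append(f"{field}: unknown option '{key}'")
        if "enabled" in options and not isinstance(options["enabled"], bool):
            problems.append(f"{field}: 'enabled' must be a boolean")
        prompt = options.get("customPrompt")
        if prompt is not None and len(prompt) > MAX_PROMPT_CHARS:
            problems.append(
                f"{field}: 'customPrompt' is {len(prompt)} chars (max {MAX_PROMPT_CHARS})"
            )
    return problems


config = {"industry": {"customPrompt": "Return one word", "outputColumn": "Industry"}}
print(validate_field_configurations(config))
```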

## `temperature` (type: `string`):

Control AI creativity and consistency (0-2). Lower values (0.1-0.3) = more consistent, deterministic results. Higher values (0.7-1.0) = more creative, varied results. Recommended: 0.3 for consistent formatting. Use higher values only if you want more creative AI-generated content. Enter as decimal (e.g., 0.3). Default: 0.3.
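Since `temperature` is passed as a string rather than a number, a small guard on the caller's side can catch typos before a run starts. This is a sketch under the 0-2 range stated above; `parse_temperature` is a hypothetical helper, not part of the Actor or the Apify client.

```python
def parse_temperature(value: str = "0.3") -> float:
    """Convert the string input to a float and check the documented 0-2 range."""
    temperature = float(value)  # raises ValueError for non-numeric text
    if not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature must be between 0 and 2, got {value!r}")
    return temperature


run_input = {"temperature": "0.3"}
print(parse_temperature(run_input["temperature"]))
```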

## `enableWebSearch` (type: `boolean`):

Enable web search to enrich lead data with real-time information from the internet. When enabled, the AI will automatically search the web for company information, industry details, case studies, and other relevant data to enhance your leads. Useful for getting up-to-date company descriptions, industry classifications, and personalized insights. Default: false. Enabling web search increases processing time and cost.

## `millionVerifierApiKey` (type: `string`):

Optional: Your Million Verifier API key for email verification. If provided, each email address is verified before output. Verification status is added as 'verifiedEmail' field with values: 'valid' (deliverable), 'invalid' (undeliverable), 'unknown' (cannot verify), 'disposable' (disposable email), 'catch\_all' (catch-all domain). Helps improve email deliverability by identifying invalid emails. Get your API key from https://www.millionverifier.com/. Leave empty to skip email verification.
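Once the `verifiedEmail` field is present in the output, a campaign list can be filtered on it. The sketch below splits leads into those to send to and those to hold back; treating only `'valid'` as sendable is an assumption of this example, and some teams may also choose to accept `'catch_all'`.

```python
def split_by_verification(leads: list, keep_statuses=frozenset({"valid"})):
    """Split leads into (sendable, held_back) based on the verifiedEmail field."""
    sendable, held_back = [], []
    for lead in leads:
        if lead.get("verifiedEmail") in keep_statuses:
            sendable.append(lead)
        else:
            held_back.append(lead)  # includes leads with no verification status
    return sendable, held_back


leads = [
    {"email": "ann@example.com", "verifiedEmail": "valid"},
    {"email": "bo@example.com", "verifiedEmail": "catch_all"},
    {"email": "cy@example.com", "verifiedEmail": "invalid"},
]
sendable, held_back = split_by_verification(leads)
print(len(sendable), len(held_back))
```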

## Actor input object example

```json
{
  "leadsFile": "https://api.apify.com/v2/key-value-stores/rFlGez0FeHA0VgeU0/records/test-incomplete-leads.csv",
  "removeBlanks": false,
  "removeDuplicates": false,
  "formatJobTitle": true,
  "formatCompanyName": true,
  "formatPersonalPhone": true,
  "formatHomePhone": true,
  "generateCompanyDescription": false,
  "generatePersonalizedLine": false,
  "pullCaseStudy": false,
  "generateIndustry": false,
  "segmentBy": [],
  "sortOrder": "asc",
  "fieldConfigurations": {},
  "temperature": "0.3",
  "enableWebSearch": false
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples for JavaScript, Python, and the CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "leadsFile": "https://api.apify.com/v2/key-value-stores/rFlGez0FeHA0VgeU0/records/test-incomplete-leads.csv",
    "segmentBy": [],
    "fieldConfigurations": {},
    "temperature": "0.3",
    "millionVerifierApiKey": ""
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/lead-formatter").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "leadsFile": "https://api.apify.com/v2/key-value-stores/rFlGez0FeHA0VgeU0/records/test-incomplete-leads.csv",
    "segmentBy": [],
    "fieldConfigurations": {},
    "temperature": "0.3",
    "millionVerifierApiKey": "",
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/lead-formatter").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "leadsFile": "https://api.apify.com/v2/key-value-stores/rFlGez0FeHA0VgeU0/records/test-incomplete-leads.csv",
  "segmentBy": [],
  "fieldConfigurations": {},
  "temperature": "0.3",
  "millionVerifierApiKey": ""
}' |
apify call parseforge/lead-formatter --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/lead-formatter",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Lead Formatter Tool for Cold Email",
        "description": "Clean, format, and enhance lead data using AI for cold email campaigns and lead management. Automatically standardizes names, companies, and job titles using advanced language models. Perfect for sales teams, marketers, and businesses that need properly formatted lead data for professional outreach.",
        "version": "1.0",
        "x-build-id": "18CRnHLTOUPcLUA7L"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~lead-formatter/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-lead-formatter",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~lead-formatter/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-lead-formatter",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~lead-formatter/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-lead-formatter",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "leadsFile"
                ],
                "properties": {
                    "leadsFile": {
                        "title": "Leads File",
                        "type": "string",
                        "description": "Upload your leads file containing contact information to format and enrich. Supported formats: JSON (array of objects), XLSX/Excel (first sheet used), CSV. The tool automatically detects column names (case-insensitive) and supports variations like 'First Name', 'first_name', 'firstName', etc. Required fields: email. Optional fields: firstName, lastName, companyName, jobTitle, country, website, personalPhone, homePhone, domain."
                    },
                    "removeBlanks": {
                        "title": "Remove Blank Leads",
                        "type": "boolean",
                        "description": "Automatically remove leads that have no data in any field. This helps clean your dataset by filtering out completely empty records. Useful when processing large lists with incomplete data. Default: false (keeps all leads).",
                        "default": false
                    },
                    "removeDuplicates": {
                        "title": "Remove Duplicates",
                        "type": "boolean",
                        "description": "Remove duplicate leads based on email address. If multiple leads have the same email, only the first occurrence is kept. This ensures each contact appears only once in your final dataset. Default: false (keeps all leads including duplicates).",
                        "default": false
                    },
                    "formatJobTitle": {
                        "title": "Format Job Title",
                        "type": "boolean",
                        "description": "Format job titles with proper capitalization and professional standards. Examples: 'ceo' → 'CEO', 'vice president of sales' → 'Vice President of Sales', 'cto' → 'CTO'. Keeps abbreviations uppercase and applies title case formatting. Default: true (recommended for professional appearance).",
                        "default": true
                    },
                    "formatCompanyName": {
                        "title": "Format Company Name",
                        "type": "boolean",
                        "description": "Format company names with proper capitalization and remove business suffixes (Inc., LLC, Corp., Ltd.) for cold email personalization. Examples: 'apple inc' → 'Apple', 'microsoft corporation' → 'Microsoft'. This makes company names more personal and natural for email outreach. Default: true (recommended for cold emails).",
                        "default": true
                    },
                    "formatPersonalPhone": {
                        "title": "Format Personal Phone",
                        "type": "boolean",
                        "description": "Standardize personal phone numbers to a consistent format. Converts various formats (e.g., '123-456-7890', '(123) 456-7890', '1234567890') to E.164 international format (e.g., '+11234567890') or local format. Ensures phone numbers are properly formatted for CRM systems and international dialing. Default: true.",
                        "default": true
                    },
                    "formatHomePhone": {
                        "title": "Format Home Phone",
                        "type": "boolean",
                        "description": "Standardize home phone numbers to a consistent format. Converts various formats to E.164 international format or local format. Ensures phone numbers are properly formatted for CRM systems. Default: true.",
                        "default": true
                    },
                    "generateCompanyDescription": {
                        "title": "Generate Company Description",
                        "type": "boolean",
                        "description": "Generate a brief AI-written company description from the available company information (name, website, domain). Useful for enriching your lead database with company context, so you understand what each company does before reaching out. Default: false. Enabling this option turns on AI content generation and slightly increases cost.",
                        "default": false
                    },
                    "generatePersonalizedLine": {
                        "title": "Generate Personalized Line",
                        "type": "boolean",
                        "description": "Generate an AI-written personalized opening line for cold emails from lead information (name, company, job title). Each line is tailored to the lead and can improve email reply rates. Example: 'Hi John, I noticed you're the CEO at Apple - I'd love to discuss...' Default: false. Enabling this option turns on AI content generation and slightly increases cost.",
                        "default": false
                    },
                    "pullCaseStudy": {
                        "title": "Pull Case Study",
                        "type": "boolean",
                        "description": "Generate a brief AI-written case study or success story relevant to the company's industry. Useful for personalizing outreach with industry-specific examples and social proof, which helps establish credibility in your cold emails. Default: false. Enabling this option turns on AI content generation and slightly increases cost.",
                        "default": false
                    },
                    "generateIndustry": {
                        "title": "Generate Industry",
                        "type": "boolean",
                        "description": "Automatically identify the primary industry for each company. Returns 1-3 words (e.g., 'Technology', 'Healthcare', 'Financial Services'). Useful for segmenting leads by industry and personalizing targeted campaigns. Default: false. Enabling this option turns on industry identification and slightly increases cost.",
                        "default": false
                    },
                    "segmentBy": {
                        "title": "Segment By",
                        "type": "array",
                        "description": "Group leads into segments based on the specified fields. Each segment is saved to its own dataset, which you can download separately. Example: ['country', 'jobTitle', 'industry'] creates segments like 'country:USA | jobTitle:CEO | industry:Technology'. Useful for organizing leads by location, role, or industry for targeted outreach.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sortBy": {
                        "title": "Sort By",
                        "type": "string",
                        "description": "Sort all leads by a specific field before output. Enter the field name (e.g., 'companyName', 'country', 'jobTitle', 'industry', 'email'). Leads will be ordered alphabetically or numerically based on this field. Use with 'Sort Order' to control ascending/descending. Useful for organizing output in a specific order."
                    },
                    "sortOrder": {
                        "title": "Sort Order",
                        "enum": [
                            "asc",
                            "desc"
                        ],
                        "type": "string",
                        "description": "Control the sort direction when using 'Sort By'. 'asc' = ascending (A-Z, 0-9), 'desc' = descending (Z-A, 9-0). Default: 'asc' (alphabetical order).",
                        "default": "asc"
                    },
                    "fieldConfigurations": {
                        "title": "Field Configurations",
                        "type": "object",
                        "description": "Advanced: Customize how specific fields are processed and named in the output. Each field can have: 'enabled' (boolean - enable/disable the field), 'customPrompt' (string - max 100 chars - custom AI instruction), 'outputColumn' (string - custom column name in the output). Example: {\"companyDescription\": {\"customPrompt\": \"Identify the industry from company information in under 3 words, no dashes or parentheses\", \"outputColumn\": \"Company Description\"}}. Useful for customizing AI behavior and mapping fields to your CRM column names."
                    },
                    "temperature": {
                        "title": "Temperature",
                        "type": "string",
                        "description": "Control AI creativity and consistency (0-2). Lower values (0.1-0.3) = more consistent, deterministic results. Higher values (0.7-1.0) = more creative, varied results. Recommended: 0.3 for consistent formatting. Use higher values only if you want more creative AI-generated content. Enter as decimal (e.g., 0.3). Default: 0.3."
                    },
                    "enableWebSearch": {
                        "title": "Enable Web Search",
                        "type": "boolean",
                        "description": "Enable web search to enrich lead data with real-time information from the internet. When enabled, the AI automatically searches the web for company information, industry details, case studies, and other relevant data to enhance your leads. Useful for getting up-to-date company descriptions, industry classifications, and personalized insights. Default: false. Enabling this option turns on web search and increases processing time and cost.",
                        "default": false
                    },
                    "millionVerifierApiKey": {
                        "title": "Million Verifier API Key",
                        "type": "string",
                        "description": "Optional: Your Million Verifier API key for email verification. If provided, each email address is verified before output. Verification status is added as 'verifiedEmail' field with values: 'valid' (deliverable), 'invalid' (undeliverable), 'unknown' (cannot verify), 'disposable' (disposable email), 'catch_all' (catch-all domain). Helps improve email deliverability by identifying invalid emails. Get your API key from https://www.millionverifier.com/. Leave empty to skip email verification."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
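
The `inputSchema` above has a few constraints that are easy to miss: `leadsFile` is the only required field, `sortOrder` accepts only `asc` or `desc`, `temperature` is typed as a string with a documented range of 0-2, and `customPrompt` is limited to 100 characters. The following is a minimal sketch of a client-side check that catches these before a run is started. The helper name `build_lead_formatter_input` and the example file URL are hypothetical, and the sketch assumes `leadsFile` is passed as a URL string; the Actor itself may validate differently.

```python
from typing import Any

# Constraints copied from the inputSchema above.
SORT_ORDERS = {"asc", "desc"}
MAX_CUSTOM_PROMPT_CHARS = 100


def build_lead_formatter_input(leads_file: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: return an input dict, raising ValueError on violations."""
    if not isinstance(leads_file, str) or not leads_file:
        raise ValueError("leadsFile is required and must be a non-empty string")

    run_input: dict[str, Any] = {"leadsFile": leads_file, **options}

    # sortOrder is an enum in the schema; it defaults to "asc" when omitted.
    if run_input.get("sortOrder", "asc") not in SORT_ORDERS:
        raise ValueError("sortOrder must be 'asc' or 'desc'")

    # temperature is typed as a string in the schema, documented range 0-2.
    if "temperature" in run_input:
        temperature = float(run_input["temperature"])
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        run_input["temperature"] = str(temperature)

    # segmentBy is an array of field-name strings.
    segment_by = run_input.get("segmentBy", [])
    if not isinstance(segment_by, list) or not all(isinstance(f, str) for f in segment_by):
        raise ValueError("segmentBy must be a list of strings")

    # Each field configuration may carry a customPrompt of at most 100 characters.
    for field, config in run_input.get("fieldConfigurations", {}).items():
        if len(config.get("customPrompt", "")) > MAX_CUSTOM_PROMPT_CHARS:
            raise ValueError(f"customPrompt for '{field}' exceeds 100 characters")

    return run_input


example = build_lead_formatter_input(
    "https://example.com/leads.csv",  # hypothetical file URL
    removeDuplicates=True,
    sortBy="companyName",
    sortOrder="desc",
    temperature=0.3,
    segmentBy=["country", "industry"],
)
```

The resulting dict can then be passed as `run_input` to the Python client, for example `ApifyClient(token).actor("parseforge/lead-formatter").call(run_input=example)`, or sent as the JSON request body of the endpoints defined above.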
