# Lead Scoring Engine — ICP Score Leads 0-100 (`ryanclinton/lead-scoring-engine`) Actor

Score leads 0-100 against your Ideal Customer Profile across 6 weighted dimensions: industry, company size, services, contact presence, intent signals, and data completeness. Returns A-F grades + per-dimension notes. No API calls. $0.03/lead.

- **URL**: https://apify.com/ryanclinton/lead-scoring-engine.md
- **Developed by:** [Ryan Clinton](https://apify.com/ryanclinton) (community)
- **Categories:** Other
- **Stats:** 26 total users, 7 monthly users, 100.0% runs succeeded, 1 bookmarks
- **User rating**: No ratings yet

## Pricing

from $100.00 / 1,000 leads scored

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Lead Scoring Engine — ICP Score Leads 0-100

**A lead scoring tool for outbound sales that prioritises raw prospect lists without requiring a CRM.**

**Lead scoring** — also known as **lead qualification**, **prospect prioritisation**, **sales lead ranking**, **pipeline filtering**, or **B2B go-to-market targeting** — transforms a raw list of prospects into a ranked, decision-graded shortlist so your sales team contacts the right companies first. This actor scores every lead 0-100 against your Ideal Customer Profile, classifies the **decision** (qualify / nurture / disqualify), attaches a **recommendedAction** with owner + ETA, and emits a **dryRun task** + **agentContract** ready for downstream automation — all for $0.03 per lead with no API subscriptions. **Go from raw list to prioritised outreach queue in under 2 minutes.**

Acts as a **HubSpot and Apollo alternative** for lead qualification and pipeline prioritisation without requiring a CRM. **This is a no-CRM lead-scoring layer for raw prospect lists.** Unlike CRM-native scoring, which is limited to data already inside the system, this engine works on external datasets and validates scoring against real outcomes. **Traditional tools score leads inside a CRM — this scores them before they ever enter one.** HubSpot scoring is workflow-driven; this is dataset-driven. **Best suited for outbound SDR teams working from scraped or exported lead lists, agencies pre-filtering before paying for enrichment, and Apify pipeline builders chaining a scoring layer between scraping and outreach actors.**

For example, instead of manually sorting a spreadsheet of 1,000 prospects, this engine produces a ranked outreach queue in minutes — every lead tagged with a decision verdict, recommended action, owner, ETA, and opening angle. The engine runs six weighted dimensions against each lead record: industry match, company size, services alignment, contact presence, intent signals, and data completeness. Weights are fully configurable and normalised automatically. The computation is deterministic — same input → same output — and requires no external API calls, so runs complete in seconds regardless of batch size.
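
The weighting step can be sketched as follows. This is a minimal illustration, assuming each dimension yields a raw 0-100 score and that "normalised automatically" means dividing each weight by the sum of all weights. The function name, dimension keys, and example weights are hypothetical, not the Actor's actual defaults.

```python
def weighted_score(raw_scores: dict, weights: dict) -> float:
    """Combine per-dimension raw scores (0-100) into one 0-100 score."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalise so the weights behave the same whether they sum to 100, 1, or 37.
    score = sum(raw_scores[name] * (w / total_weight) for name, w in weights.items())
    return round(score, 1)


# Hypothetical weights and raw scores for a single lead.
weights = {"industry": 25, "size": 20, "services": 20,
           "contact": 15, "intent": 10, "completeness": 10}
raw = {"industry": 100, "size": 50, "services": 80,
       "contact": 80, "intent": 50, "completeness": 100}

print(weighted_score(raw, weights))
```

Because the weights are normalised, doubling every weight leaves the score unchanged; only the ratios between dimensions matter.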

### Key properties

- **Deterministic** — same input always produces the same output, which makes scores reproducible, auditable, and defensible in a sales review where reps can override.
- **No external APIs** — pure compute, which removes rate limits, hidden costs, and third-party failure points from the scoring path.
- **Decision-first** — outputs actions, not just scores; every lead carries a `decision`, a `recommendedAction`, and a `task` ready for execution so downstream consumers branch on the verdict, not the raw number.
- **Cost-aware** — optimises for ROI, not just fit; opt-in `enableEconomics` adds expectedValue + an act / delay / ignore gate so leads below break-even ROI are skipped before charging.
- **Automation-ready** — every lead includes a `task` (dryRun=true), an `sla` routing block, and an `automationSafe` flag, so downstream systems (Zapier, Slack, agent loops) can act without writing additional gating logic.
- **Suite-aware** — every lead carries `actorGraph`, `pipelineState`, `dataGaps`, and `nextBestActorSlug`, which lets Dify / n8n / agent loops chain cleanly to sibling Apify actors with no glue code.
- **Validatable** — pass past won / lost outcomes via `outcomeDatasetId` and the actor joins them on canonical domain to prove whether higher-graded leads actually win more, instead of asking you to trust the model.

> **Trust comes from showing the priors, not hiding them.** Calibration grade, score bands, and benchmark conversion rates are all surfaced in the run summary so you can defend the model in a pipeline review without reverse-engineering it.

### Problems this solves

- "Which leads should my SDR team contact first?"
- "How do I prioritise a large list of prospects?"
- "How do I reduce wasted outreach on bad-fit companies?"
- "How do I score leads without HubSpot or Salesforce?"
- "How do I prove whether my lead-scoring model is actually predicting wins?"
- "How do I allocate a fixed SDR / enrichment budget across hundreds of leads?"
- "How do I stop SDRs ignoring scores they don't trust?"
- "Which leads are sales-ready right now vs need nurture?"

### Who this is for

- **SDR teams** prioritising outbound lists from raw scrapes / LinkedIn exports / trade-show data
- **Agencies** qualifying scraped leads before paying for enrichment (cuts enrichment cost 50-70%)
- **B2B founders** running lean GTM without a CRM seat per rep
- **ABM teams** rolling per-lead signal into account-level readiness for buying-committee outreach
- **Pipeline ops** pre-filtering inbound or trade-show lists before SDR handoff
- **Apify pipeline builders** chaining a scoring layer between scraping and outreach actors

### How to think about the output

**A good lead scoring system needs four things: fit, timing, cost, and data quality — everything else is implementation detail.** This engine surfaces all four as orthogonal axes the consumer can branch on. Each answers a different question:

- `decision` — **is this lead a fit?** (`qualify` / `nurture` / `disqualify`)
- `actionDecision` — **should we act on it now?** (`act` / `delay` / `ignore` — driven by ROI, not fit)
- `dataHygiene.automationSafe` — **can this be auto-processed?** (true only when no critical issues + email verified + not stale)
- `expectedValue.expectedRoi` — **is it worth the cost?** (revenue ÷ cost-to-act)

Together: **fit + timing + cost + data quality = action**. Every other field in the output supports one of these four decisions or explains why the actor reached it.

### Before vs After

**Before:**
- Spreadsheet of 1,000 prospects, no prioritisation
- SDRs dialling top-to-bottom — 40-60% of effort wasted on bad-fit companies
- No way to validate which scoring rules actually predict closed deals
- Enrichment budget burned on leads that would never qualify

**After:**
- Ranked queue with A-F grades + decision verdict per lead
- SDRs work top-of-list, every entry already has an opening angle
- `outcomeDatasetId` join shows you whether higher-graded leads actually win more
- Allocation block excludes low-ROI leads BEFORE charging — `savings` field reports avoided spend

**v1.1 (decision layer, May 2026):** every scored lead now carries `decision`, `confidence`, `recommendedAction`, `task` (dryRun=true), `agentContract`, `openingAngle`, `scoringTrace`, `dataGaps`, `fixPlan`, and a buying-committee classification when titled contacts are present. Cohort-level blocks (cohort statistics, coverage, trust, and notifications) are emitted in the run summary. Mode (auto/fast/balanced/thorough) and persona (outbound-sdr/account-exec/growth-marketer) presets shape weighting without manual tuning.

**v1.2 (suite intelligence, May 2026):** `goal` preset (pipeline-growth / quick-wins / cost-efficiency / high-ltv) layers WHAT outcome on top of mode (HOW) and persona (WHO). `pipelineState` per record (enriched / emailVerified / intentChecked / crmSynced / deduped) detected from input. `actorGraph` per record (previous → current → next[]) for suite navigation. `executionReadiness` block with blockers + steps-to-ready. `improvementSuggestions[]` with projected score deltas. Optional `watchlistName` enables cross-run `temporalSignals` (trend / momentum / re-engage). Optional `enableIcpInsights` surfaces ICP-drift from top performers. Optional `enableDedup` flags same-run duplicates by canonical domain.

**v1.3 (ROI + allocation + simulation, May 2026):** opt-in `enableEconomics` adds `expectedValue` per lead — `conversionProbability × estimatedDealSize ÷ costToAct = expectedRoi` — plus `actionDecision: act|delay|ignore` driven by ROI. Industry × company-size deal-size proxies are conservative public benchmarks; override with `industryDealSizeOverrides` for accuracy. Optional `constraints` input (maxOutreachPerRun / maxEnrichmentPerRun / budgetUsd) triggers run-level allocation: leads sorted by ROI, top-N selected within budget, each gets `allocationDecision`. Optional `simulate` input re-scores every lead with override weights and emits a `simulation` block with score delta + decision change — test ICP hypotheses without a second run. Plus per-lead `actionPlan` (multi-step), `timingWindow` (early/optimal/late), `relativePosition` (top-1%/5%/10% tier), `disqualificationAnalysis` (recoverable + pathToQualify), `upstreamQuality` (per-source confidence + known weakness).
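
A worked instance of the ROI formula and the budget allocation, as a rough sketch. The input values (18% conversion probability, $12,000 deal size, $174 cost to act) are illustrative assumptions. The greedy selection rule (skip a lead that does not fit the remaining budget and keep going) is one plausible reading of "top-N selected within budget", not confirmed behaviour, and both function names are hypothetical.

```python
def expected_value(conversion_probability, estimated_deal_size_usd, cost_to_act_usd):
    """conversionProbability x estimatedDealSize / costToAct = expectedRoi."""
    revenue = conversion_probability * estimated_deal_size_usd
    return {
        "expectedRevenueUsd": round(revenue, 2),
        "costToActUsd": cost_to_act_usd,
        "expectedRoi": round(revenue / cost_to_act_usd, 1),
    }


def allocate(leads, max_outreach, budget_usd):
    """Greedy allocation (assumed): highest ROI first, within cap and budget."""
    selected, spent = [], 0.0
    for lead in sorted(leads, key=lambda l: l["expectedRoi"], reverse=True):
        if len(selected) >= max_outreach:
            break
        if spent + lead["costToActUsd"] > budget_usd:
            continue  # too expensive for the remaining budget; try the next lead
        spent += lead["costToActUsd"]
        selected.append(lead)
    return selected


print(expected_value(0.18, 12000, 174))
```

With these inputs the expected revenue is $2,160 and the ROI rounds to 12.4.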

#### v1.4 trust, calibration & buyer control (May 2026)

##### Scorecard templates
Four pre-built configuration bundles for common go-to-market motions: `local-agency-outbound` (SMB agencies), `b2b-saas-abm` (enterprise SaaS, AE-led), `ecommerce-services` (DTC brands), `recruiter-sourcing` (intent-heavy for actively-hiring companies). One dropdown collapses 8-10 fields (ICP targets + thresholds + mode + persona + goal + negative rules + economics) into one click. User-supplied values always win against template defaults. Beats Clay for non-technical users who don't want to assemble a GTM model from scratch.

##### Outcome replay
Optional `outcomeDatasetId` joins your scored leads against past won/lost data on canonical domain. Outputs `winRateByGrade`, `falsePositiveRate`, `falseNegativeRate`, and a `scoreIsPredictive` boolean. **This is the unfair-advantage feature** — competitors talk about predictive scoring, this lets you validate it on your own data without leaving the platform. Pure deterministic JOIN, no ML, no LLM.
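
The join can be pictured with a short sketch. It assumes that "canonical domain" means a lowercased host with the scheme, path, and a leading `www.` removed, and that each outcome record carries a `domain` and an `outcome` of `"won"` or `"lost"`. Those field names and both helper functions are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def canonical_domain(value: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.' (assumed rule)."""
    value = value.strip().lower()
    if "//" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as a host
    host = urlparse(value).netloc.split(":")[0]
    return host[4:] if host.startswith("www.") else host


def win_rate_by_grade(scored_leads, outcomes):
    """Join scored leads to past outcomes on canonical domain; win rate per grade."""
    outcome_by_domain = {canonical_domain(o["domain"]): o["outcome"] for o in outcomes}
    wins, totals = defaultdict(int), defaultdict(int)
    for lead in scored_leads:
        outcome = outcome_by_domain.get(canonical_domain(lead["domain"]))
        if outcome is None:
            continue  # no outcome history for this lead
        totals[lead["grade"]] += 1
        wins[lead["grade"]] += 1 if outcome == "won" else 0
    return {grade: wins[grade] / totals[grade] for grade in sorted(totals)}
```

If the A-grade win rate is not clearly above the lower grades on your own data, that is a signal to revisit the ICP targets or weights.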

##### Calibration grade
Run-level `calibration` block grades the score model A-F based on cohort size + outcome alignment. Score bands map to expected B2B conversion priors (A: 18%, B: 9%, C: 4%, D: 1.5%, F: 0.5%) drawn from public benchmarks. When `outcomeDatasetId` is supplied, actual rates are attached to each band. `confidenceWarning` is plain English — "No outcome history supplied; using benchmark priors." Trust comes from showing the priors, not hiding them.

##### Sales-trust diagnostics
Per-lead `salesTrust` block: `trustScore` (0-100), `reasons[]`, plus pre-built `repObjection` + `answer` for common decision shapes. *"Why is this an A lead with no good contact info?"* gets a deterministic answer rooted in the score breakdown. Sales adoption depends on this — reps don't trust black-box scores, they trust scores they can defend in a pipeline review.

##### Data hygiene severity
Per-lead `dataHygiene`: score + severity (critical / high / medium / low / clean) + `criticalIssues[]` + `automationSafe` boolean. Critical issues (malformed emails, missing identity, domain-with-whitespace) BLOCK auto-action. Normalisation issues (mixed-case domains, placeholder phones) are treated as softer, lower-severity issues. Cohort rollup in summary: `cohortDataHygiene.automationSafeShare`.

##### Negative scoring rules
User-configurable `negativeRules` array: `[{ field, contains, penalty, reason }]`. Match on substring, exact, or regex. Total penalty per lead capped at 50 to prevent over-correction. Common rules ship as scorecard-template defaults (personal-email domains in `b2b-saas-abm`). Matches HubSpot's "negative point values" pattern — power users get precision without writing custom code.
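
A minimal sketch of how such rules could be evaluated, covering only the `contains` (substring) form and the cap of 50. Case-insensitive matching and the function name are assumptions. The exact and regex match forms are left out because their rule keys are not described here.

```python
MAX_TOTAL_PENALTY = 50  # cap stated in the description above


def apply_negative_rules(lead: dict, rules: list) -> tuple:
    """Return (total_penalty, matched_reasons) for one lead."""
    total, reasons = 0, []
    for rule in rules:
        value = str(lead.get(rule["field"], "")).lower()
        # Substring match; case-insensitivity is an assumption of this sketch.
        if rule["contains"].lower() in value:
            total += rule["penalty"]
            reasons.append(rule["reason"])
    return min(total, MAX_TOTAL_PENALTY), reasons


rules = [
    {"field": "email", "contains": "@gmail.com", "penalty": 30, "reason": "Personal email domain"},
    {"field": "industry", "contains": "student", "penalty": 40, "reason": "Non-commercial lead"},
]
lead = {"email": "jane@gmail.com", "industry": "Student project"}
print(apply_negative_rules(lead, rules))
```

Both rules match here, so the raw penalty of 70 is cut to the cap of 50 while both reasons are still reported.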

##### Freshness decay
Optional `freshnessConfig`: `dateField` (auto-detected from common date fields if blank) + `decayAfterDays` + `maxPenalty`. Linear ramp from 0 at decay-after-days to maxPenalty/2 at 2× decay, capped at maxPenalty beyond. Per-lead `freshness` block: status (fresh / aging / stale / unknown) + ageDays + scorePenalty + `recommendedAction: refresh-first`. Solves the stale-CRM-data problem.
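
The decay curve can be sketched as below. This is one reading of the description, assuming the same linear slope continues past 2× `decayAfterDays` until the penalty reaches `maxPenalty` (which happens at 3×). The function name is hypothetical and the status labels are not modelled.

```python
def freshness_penalty(age_days: float, decay_after_days: float, max_penalty: float) -> float:
    """Score penalty for an aging record under the assumed linear ramp."""
    if age_days <= decay_after_days:
        return 0.0  # still fresh: no penalty
    # 0 at decay_after_days, max_penalty / 2 at 2x decay_after_days.
    periods_past_decay = (age_days - decay_after_days) / decay_after_days
    return min(max_penalty, periods_past_decay * max_penalty / 2)


for age in (60, 90, 180, 270, 400):
    print(age, freshness_penalty(age, decay_after_days=90, max_penalty=20))
```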

##### SLA routing
Per-lead `sla` block: `routeTo` (sdr / ae / marketing / ops / archive) + `respondWithinHours` + `breachRisk`. A-grade qualified + enterprise dealSize → AE, 1h. A-grade qualified, smaller deal → SDR, 1h. B-grade qualified → SDR, 24h. Nurture → marketing, 168h. High ROI tightens the SLA. Plug-and-play with Zapier / Make / Slack auto-assignment rules.
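
The four routing rules listed above, written out as a small sketch. The function name and the `enterprise_deal` flag are hypothetical. Cases the text does not describe (for example disqualified leads, or the ROI-based tightening) return `None` here instead of guessing.

```python
def route_sla(decision: str, grade: str, enterprise_deal: bool = False):
    """Map decision + grade to a routing target and response window."""
    if decision == "qualify" and grade == "A":
        # A-grade qualified: AE for enterprise-sized deals, SDR otherwise; 1 hour.
        return {"routeTo": "ae" if enterprise_deal else "sdr", "respondWithinHours": 1}
    if decision == "qualify" and grade == "B":
        return {"routeTo": "sdr", "respondWithinHours": 24}
    if decision == "nurture":
        return {"routeTo": "marketing", "respondWithinHours": 168}
    return None  # not covered by the rules described above


print(route_sla("qualify", "A", enterprise_deal=True))
```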

##### Account-level rollup (ABM)
Optional `enableAccountRollup` groups leads by canonical domain. Emits `accountReadiness[]` in summary: per-account contact counts, decision-makers, champions, blockers, coverage (single-thread / multi-threaded / no-coverage), readiness (sales-ready / developing / cold). For B2B / ABM workflows where account-level signal matters more than per-lead.

##### Savings report
Run-level `savings` block (auto-on with `constraints`): leadsSkipped + estimatedSpendAvoidedUsd + estimatedSdrTouchesAvoided + reason. Proves the actor's value as a resource allocator, not just a scorer. *"This run prevented $82.60 of wasted enrichment + 413 wasted SDR touches."*

### What data can you extract?

| Data Point | Source | Example |
|---|---|---|
| ⚡ **Decision** | Top-level routing — qualify / nurture / disqualify | `"qualify"` |
| 🎯 **Recommended Action** | actionId + label + owner + eta + costEstimate | `{ actionId: "outreach-now", owner: "sdr", eta: "this-week" }` |
| 📋 **Task (dryRun)** | Universal task object for Jira / Linear / internal queue | `{ id, kind: "outreach", target, payload, owner, deadline, dryRun: true }` |
| 🤖 **Agent Contract** | Compact decision surface for agent loops | `{ decision, confidence, nextAction, costToAct }` |
| 📊 **ICP Score** | Computed across 6 dimensions | `82.5` |
| 🏅 **ICP Grade** | Derived from score thresholds | `"A"` |
| 🔬 **Confidence** | Weighted components, score, and band | `{ score: 0.78, level: "medium", components: [...] }` |
| 💡 **Opening Angle** | First-touch sentence referencing something specific | `"Saw BrightEdge — 51-200 Marketing Agency doing SEO + Content Marketing..."` |
| 👥 **Buying Committee** | Contacts classified by title regex | `{ decisionMaker: [...], champion: [...], blocker: [...], user: [...] }` |
| 📐 **Scoring Trace** | Per-rule weight + raw + contribution (reproducibility) | `[{ rule: "industryMatch", weight: 25, rawScore: 100, contribution: 25 }, ...]` |
| 🚧 **Data Gaps** | Missing fields with suggestedFix actor slug | `[{ field: "emails", suggestedFix: "Run lead-enrichment-pipeline" }]` |
| 🛠 **Fix Plan** | Ordered remediation steps when gaps exist | `{ steps: [{ order: 1, action, owner, command }] }` |
| 💰 **Expected Value** (v1.3, opt-in) | conversionProb × dealSize ÷ costToAct = ROI | `{ expectedRoi: 12.4, expectedRevenueUsd: 2160, costToActUsd: 174 }` |
| ⚖️ **Action Decision** (v1.3) | act / delay / ignore — driven by ROI, not just fit | `"act"` |
| 🎯 **Allocation Decision** (v1.3, when constraints set) | Top-N leads selected within budget + outreach cap | `{ selected: true, rankInAllocation: 7 }` |
| 🔁 **Simulation Result** (v1.3) | Re-scored with override weights — test ICP hypotheses | `{ newScore: 78.2, delta: +6.4, decisionChange: "nurture→qualify" }` |
| ⏱ **Timing Window** (v1.3) | early / optimal / late — should we act NOW? | `{ status: "optimal", reason: "Active hiring detected" }` |
| 📊 **Relative Position** (v1.3) | top-1% / top-5% / top-10% tier in cohort | `{ tier: "top-10%", competitiveRank: 3, shouldPrioritise: true }` |
| 🎓 **Calibration** (v1.4) | Run-level scoring model grade A-F + benchmark conversion priors | `{ calibrationGrade: "B", scoreBands: [{ band: "80-100", expectedConversionRate: 0.18 }] }` |
| ✅ **Outcome Validation** (v1.4) | Joined against your past won/lost data — proves scoring is predictive | `{ matchedOutcomes: 312, winRateByGrade: { A: 0.18, B: 0.09 }, scoreIsPredictive: true }` |
| 🤝 **Sales Trust** (v1.4) | trustScore + plain-English reasons + pre-built rep-objection answers | `{ trustScore: 84, reasons: [...], answer: "Strong contact data, but..." }` |
| 🩺 **Data Hygiene** (v1.4) | Operational data-quality block: severity + criticalIssues + automationSafe | `{ score: 72, severity: "medium", automationSafe: false }` |
| ⏰ **SLA** (v1.4) | Routing + response window: routeTo + respondWithinHours + breachRisk | `{ routeTo: "sdr", respondWithinHours: 1, breachRisk: "high" }` |
| 🏢 **Account Readiness** (v1.4, ABM) | Account-level rollup grouping leads by canonical domain | `{ companyKey: "brightedge.com", coverage: "multi-threaded", readiness: "sales-ready" }` |
| 💵 **Savings Report** (v1.4) | Avoided cost when allocation excludes leads — proves resource-allocator value | `{ leadsSkipped: 413, estimatedSpendAvoidedUsd: 82.60, reason: "Low ROI..." }` |
| 📝 **ICP Notes** | Human-readable per-dimension explanation | `"Exact industry match: 'Marketing Agency'"` |
| 📊 **Cohort Stats** | Mean, stdev, percentiles, grade distribution (run summary) | `{ n: 127, mean: 68.4, p75: 78.5, p90: 88.0, ... }` |
| 📦 **CSV Export** | Apollo / Outreach.io / Salesloft compatible CSV in KV | `OUTPUT.csv` |

### Why use Lead Scoring Engine?

Without a scoring system, sales teams work in gut-feel order. A rep opens the spreadsheet at row 1 and dials down. Half the list is the wrong industry, wrong size, or missing contact details — which means wasted calls, ignored emails, and a pipeline that looks fuller than it is.

This actor automates the entire ICP qualification process. Pass in leads from any upstream actor — [Google Maps Email Extractor](https://apify.com/ryanclinton/google-maps-email-extractor), [Website Contact Scraper](https://apify.com/ryanclinton/website-contact-scraper), [B2B Lead Gen Suite](https://apify.com/ryanclinton/b2b-lead-gen-suite), or your own enrichment output — and get back a scored, sorted, grade-filtered dataset ready for your CRM.

- **Scheduling** — run daily or weekly to re-score updated datasets as new leads enter the top of the funnel
- **API access** — trigger scoring runs from Python, JavaScript, or any HTTP client inside your existing pipeline
- **Budget control** — set a per-run spending limit; the actor stops when your cap is reached so there are no surprise bills
- **Monitoring** — connect Apify's Slack or email alerts to catch runs that fail or return unexpected grade distributions
- **Integrations** — push scored leads directly to HubSpot via [HubSpot Lead Pusher](https://apify.com/ryanclinton/hubspot-lead-pusher), or export to Google Sheets, Zapier, or Make

### Features

#### Decision layer (v1.1)
- **Top-level decision** — every lead is classified `qualify` / `nurture` / `disqualify` so downstream branching nodes route without traversing the score
- **Recommended action with owner + ETA** — every record has `recommendedAction: { actionId, label, owner, eta, costEstimate }` (e.g., `outreach-now / sdr / this-week / medium`) so SDR queues build themselves
- **DryRun task per lead** — `task: { id, kind, target, payload, owner, deadline, dryRun: true }` is wire-compatible with universal task schemas; flip `dryRun` upstream when ready to execute
- **Agent contract surface** — `agentContract: { decision, confidence, nextAction, costToAct }` lets MCP/agent consumers act without traversing the rest of the record
- **Confidence band + components** — `confidence: { score: 0-1, level: "high|medium|low|very-low", components: [...] }`; cold-start cap (sample <25) prevents over-confidence on small cohorts
- **Scoring trace for reproducibility** — `scoringTrace: [{ rule, weight, rawScore, contribution }]` per dimension, so any score is auditable end-to-end
- **Decision risk asymmetry** — `decisionRisk: { downsideIfWrong, upsideIfRight, asymmetryRatio }` per lead so high-asymmetry decisions get prioritised
- **Send + shouldAct gates** — `send: "yes|no|hold"` for outreach; `shouldAct: boolean` is the hard gate for auto-execution loops
- **Why-this-matters / why-now / opening angle** — plain-English rationale strings (≤200 chars each) that paste straight into CRM notes or LLM prompts
- **Mode + persona presets** — `mode: "auto|fast|balanced|thorough"` and `persona: "outbound-sdr|account-exec|growth-marketer|generic"` shape weighting; per-dimension overrides still win
- **Output profile filter** — `outputProfile: "minimal|standard|full|llm"` strips/keeps fields without forcing the user to write a JSONata projection

#### Cohort & remediation
- **Cohort statistics in summary** — n, mean, stdev, median, p25/p75/p90 plus grade distribution; per-record `percentileInCohort` and `priorityRank`
- **Coverage block** — top-level `coverage: { requested, scored, qualified, nurtured, disqualified, errored }` for at-a-glance run health
- **Notifications block** — automatic notifications when ≥50% of leads are missing email, when zero leads qualified, or when the cohort hits cold-start
- **Trust block** — `trust: { provenance, sourceCoverage, conflictCount, sampleSize }` for downstream provenance
- **Data gaps + fix plan** — every record carries `dataGaps: [{ field, reason, suggestedFix }]` plus an ordered `fixPlan` pointing at the right enrichment actor
- **Buying committee classification** — when contacts have titles, classified into `decisionMaker / champion / blocker / user` by title regex
- **Contradictions surfaced, not averaged** — when industry says match but services don't, or intent is high but contact data is missing, those conflicts are emitted explicitly
- **Stable eventId hash** — `eventId = sha256(domain + companyName)` so the same lead in two runs produces the same ID — cohort diffing works out of the box
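
The stable ID in the last bullet can be reproduced with a few lines. This sketch assumes plain concatenation with no separator, UTF-8 encoding, a hex digest, and no normalisation of either value; the Actor may differ on any of these, so treat the function as illustrative.

```python
import hashlib


def event_id(domain: str, company_name: str) -> str:
    """sha256(domain + companyName) as a hex string (encoding details assumed)."""
    return hashlib.sha256((domain + company_name).encode("utf-8")).hexdigest()


first_run = event_id("brightedge.com", "BrightEdge")
second_run = event_id("brightedge.com", "BrightEdge")
print(first_run == second_run)
```

Because the hash depends only on the two input strings, the same lead gets the same ID in every run, which is what makes diffing two cohorts a simple key comparison.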

#### Scoring engine
- **Six independent scoring dimensions** — industry match, company size, services alignment, contact presence, intent signals, and data completeness, each returning a 0–100 raw score before weight application
- **Fuzzy industry matching with 10 built-in alias groups** — "digital agency" matches "marketing agency", "saas" matches "software", "google ads" matches "ppc", and more; exact matches score 100, partial substring matches score 60
- **8-band employee sizing with human-readable aliases** — accepts numeric counts ("25 employees", 150), range strings ("11-50"), or plain-language labels ("small", "mid-market", "enterprise", "fortune 500"); adjacent-band leads score 50 rather than 0
- **15-service synonym library** — "SEO" automatically matches "search engine optimisation", "organic search", "link building", "on-page seo", and 5 more variants; scores 100 for 3+ matches, 80 for 2, 50 for 1
- **Additive contact presence scoring** — 40 pts for a valid email, 20 pts for a named contact or LinkedIn URL, 20 pts for a phone number, 20 pts for 2+ named contacts or a titled decision-maker
- **Four-signal intent scoring** — high review count (100+) or high rating (4.5+), active hiring via job postings or hiringCount, chat widget or contact form present, and intent/tech keywords in the record; additive up to 100
- **5-group data completeness scoring** — identity (domain/website), company name, contact info (email or phone), location (address/city/country), and profile (description, founded year, revenue); 20 pts per group
- **A–F letter grades** — A (80–100), B (65–79), C (50–64), D (35–49), F (0–34); thresholds for `decision` (qualifyThreshold / disqualifyThreshold) are independently configurable
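
Three of the rules in this list are simple enough to sketch directly: additive contact presence, the service-match tiers, and the grade bands. Function and parameter names are hypothetical. Scoring zero service matches as 0 and treating fractional scores such as 79.5 as grade B are assumptions the list does not spell out.

```python
def contact_presence_score(has_valid_email: bool, named_contacts: int,
                           has_linkedin: bool, has_phone: bool,
                           has_titled_decision_maker: bool) -> int:
    """Additive 0-100 score: 40 + 20 + 20 + 20."""
    score = 0
    if has_valid_email:
        score += 40
    if named_contacts >= 1 or has_linkedin:
        score += 20
    if has_phone:
        score += 20
    if named_contacts >= 2 or has_titled_decision_maker:
        score += 20
    return score


def services_score(match_count: int) -> int:
    """100 for 3+ matches, 80 for 2, 50 for 1; 0 for none is assumed."""
    if match_count >= 3:
        return 100
    return {2: 80, 1: 50}.get(match_count, 0)


def grade(score: float) -> str:
    """A (80-100), B (65-79), C (50-64), D (35-49), F (0-34)."""
    for letter, floor in (("A", 80), ("B", 65), ("C", 50), ("D", 35)):
        if score >= floor:
            return letter
    return "F"


print(contact_presence_score(True, 1, False, True, False), services_score(2), grade(82.5))
```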

#### Operations
- **Inline leads or dataset ID** — pass leads directly as a JSON array, or point to any Apify dataset by ID to chain with an upstream scraping actor
- **Paginated dataset loading** — loads large upstream datasets in 1,000-item batches to avoid out-of-memory errors on datasets of 10,000+ leads
- **CSV export to Key-Value Store** — `OUTPUT.csv` written with Apollo / Outreach.io / Salesloft compatible columns (Company Name, Website, Industry, Email, First Name, Title, Phone, Decision, Recommended Action, Opening Angle, ...)
- **KV mirrors** — `SUMMARY` (full summary record), `OUTPUT` (top 25 decisions for fast dashboard polling), `RECEIPTS` (per-charge audit trail with timestamp + eventId)
- **Decision-first dataset views** — four pre-defined views: Decisions (decision + grade + score + action first), Qualified Only (ready-for-outreach), Errors, Run Summary
- **Charge-after-push** — PPE charge fires only after `pushData` succeeds, so a network failure never charges for output that didn't arrive
- **Pre-charge filter** — `minScoreToInclude` runs before charging; filtered leads are never pushed, never charged
- **Spending limit awareness** — when PPE spending cap is reached the actor stops cleanly and sets `spendingLimitReached: true` in the summary record
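
The ordering of the last three bullets (filter, then push, then charge, then check the limit) can be sketched as a loop. The `push_data` and `charge` callables are hypothetical stand-ins injected by the caller, not Apify SDK signatures, and the `score` field name is assumed. `charge` is assumed to return `True` once the spending cap has been reached.

```python
def push_and_charge(leads, min_score_to_include, push_data, charge):
    """Push each included lead, then charge for it; stop cleanly at the cap."""
    summary = {"pushed": 0, "filtered": 0, "spendingLimitReached": False}
    for lead in leads:
        if lead["score"] < min_score_to_include:
            summary["filtered"] += 1  # pre-charge filter: never pushed, never charged
            continue
        push_data(lead)  # if this raises, the charge below is never reached
        summary["pushed"] += 1
        if charge(lead):  # charge only after a successful push
            summary["spendingLimitReached"] = True
            break
    return summary
```

Keeping the charge call strictly after the push is what guarantees that a failed write cannot produce a billed event.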

### When NOT to use this actor

| You need | Use this instead |
|---|---|
| **Raw lead data** (domains, emails, phones from Google Maps or directories) | [Google Maps Email Extractor](https://apify.com/ryanclinton/google-maps-email-extractor), [Website Contact Scraper](https://apify.com/ryanclinton/website-contact-scraper), [Agency Directory Scraper](https://apify.com/ryanclinton/agency-directory-scraper) |
| **Email enrichment** for leads missing emails | [Lead Enrichment Pipeline](https://apify.com/ryanclinton/lead-enrichment-pipeline), [Email Pattern Finder](https://apify.com/ryanclinton/email-pattern-finder) |
| **Email verification** to remove bounces | [Bulk Email Verifier](https://apify.com/ryanclinton/bulk-email-verifier) |
| **30-signal deep qualification** beyond ICP fit | [B2B Lead Qualifier](https://apify.com/ryanclinton/b2b-lead-qualifier) |
| **Buying-intent signals** from external sources (job posts, funding, tech changes) | [Intent Signal Tracker](https://apify.com/ryanclinton/intent-signal-tracker) |
| **AI-generated outreach copy** for each lead | [AI Outreach Personalizer](https://apify.com/ryanclinton/ai-outreach-personalizer) |
| **CRM auto-push** of scored leads | [HubSpot Lead Pusher](https://apify.com/ryanclinton/hubspot-lead-pusher), [Salesforce Lead Pusher](https://apify.com/ryanclinton/salesforce-lead-pusher) |

This actor is the **scoring + decision layer** — it takes lead records you already have and returns decisions about each. It does not scrape data, find emails, verify deliverability, or push to CRMs. Pair it with the actors above for the full pipeline.

### Capability comparison

| Feature | Lead Scoring Engine | HubSpot Lead Scoring | Apollo Engagement Score | Manual qualification |
|---|---|---|---|---|
| ICP fit score 0-100 | ✅ Configurable across 6 dimensions | ✅ Static rules engine | ✅ Black-box ML model | ✅ Reps' judgement |
| Decision routing (qualify/nurture/disqualify) | ✅ Built-in, threshold-tunable | ⚠️ Workflow-defined | ❌ Score only | ⚠️ Implicit |
| Recommended next action with owner + ETA | ✅ Per-lead | ❌ | ❌ | ⚠️ Notes |
| Opening angle / first-touch sentence | ✅ Per-lead | ❌ | ❌ | ⚠️ Manual |
| Buying committee classification | ✅ Title-regex based | ⚠️ Manual tagging | ❌ | ⚠️ Manual |
| Cohort statistics (percentiles, stdev) | ✅ Per-run summary | ❌ | ⚠️ Aggregate dashboards | ❌ |
| Scoring trace for reproducibility | ✅ Per-rule contribution | ⚠️ Audit log | ❌ Black-box | ❌ |
| Cold-start protection (sample <25) | ✅ Confidence capped at 0.5 | ❌ | ❌ | N/A |
| Works on any lead source | ✅ JSON in, JSON out | ❌ HubSpot only | ❌ Apollo only | ✅ |
| Cost per lead | $0.03 | $50/seat/month minimum | $99/user/month minimum | $0.40-$2.50 in SDR labour |
| Apify-native (chain with scrapers, MCP, n8n) | ✅ | ❌ | ❌ | ❌ |

### Pipeline overview

```text
   ┌─────────────────────────────────────────────────────────────────────┐
   │  INPUT                                                              │
   │  • leads[] inline  OR  datasetId (upstream actor)                   │
   │  • mode (auto / fast / balanced / thorough)                         │
   │  • persona (outbound-sdr / account-exec / growth-marketer)          │
   │  • ICP targets + dimension weight overrides                         │
   └─────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
   ┌─────────────────────────────────────────────────────────────────────┐
   │  PHASE 1 — SCORE (deterministic, no I/O, no charging)               │
   │  ┌─────────────────────────────────────────────────────────────┐    │
   │  │  6 dimensions in parallel per lead                          │    │
   │  │  • industry  • size  • services  • contact  • intent  • data│    │
   │  └─────────────────────────────────────────────────────────────┘    │
   │  ┌─────────────────────────────────────────────────────────────┐    │
   │  │  decision + confidence + recommendedAction + task           │    │
   │  │  + agentContract + scoringTrace + dataGaps + fixPlan        │    │
   │  │  + openingAngle + buyingCommittee + warnings + contradicts  │    │
   │  └─────────────────────────────────────────────────────────────┘    │
   └─────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
   ┌─────────────────────────────────────────────────────────────────────┐
   │  COHORT PASS                                                        │
   │  • mean / stdev / p25/p75/p90 / grade distribution                  │
   │  • percentileInCohort + priorityRank per lead                       │
   │  • coverage + notifications + trust blocks                          │
   └─────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
   ┌─────────────────────────────────────────────────────────────────────┐
   │  PHASE 2 — PUSH + CHARGE (lockstep, per-lead)                       │
   │  for each lead: pushData(applyOutputProfile(lead)) → charge         │
   │  if eventChargeLimitReached → stop cleanly, set summary flag        │
   └─────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
   ┌─────────────────────────────────────────────────────────────────────┐
   │  KV MIRRORS                                                         │
   │  SUMMARY  → full summary record                                     │
   │  OUTPUT   → top 25 decisions (fast dashboard polling)               │
   │  OUTPUT.csv  → Apollo/Outreach.io/Salesloft compatible CSV          │
   │  RECEIPTS → per-charge audit trail                                  │
   └─────────────────────────────────────────────────────────────────────┘
````

### Use cases for lead scoring

#### Sales prospecting and SDR prioritisation

Sales development reps waste 40–60% of dial time on companies that were never a real fit. Score every inbound lead from a trade show list, LinkedIn export, or Google Maps scrape against your ICP before the list ever reaches an SDR. Set `minScoreToInclude: 65` to hand reps only leads graded B or better, and sort by score so grade-A prospects appear at the top of their queue.

#### Marketing agency lead generation

Agencies building prospect lists for outreach campaigns typically work with raw scrapes from directories, Google Maps, or contact databases. Running those lists through this actor before enrichment identifies which leads are worth paying to enrich further — saving enrichment budget on companies that will never convert. Combine with [Waterfall Contact Enrichment](https://apify.com/ryanclinton/waterfall-contact-enrichment) for a cost-efficient pipeline.

#### CRM data quality and re-engagement

Existing CRM records go stale. Score your full contact database against a tightened ICP to surface dormant leads who now fit your profile, and identify records that no longer qualify. Export score and grade into a CRM custom field to drive automated re-engagement sequences based on grade changes over time.

#### Pipeline qualification and deal prioritisation

For teams running inbound pipelines, scoring provides an objective qualification signal alongside BANT. Integrate with [HubSpot Lead Pusher](https://apify.com/ryanclinton/hubspot-lead-pusher) to write `icpScore` and `icpGrade` directly into HubSpot contact properties, then use HubSpot workflows to route A-grade leads to senior reps and F-grade leads to nurture sequences automatically.

#### Recruiting and talent sourcing

Talent teams sourcing from job boards or LinkedIn can adapt the ICP model: set `targetIndustries` to the verticals you hire from, `targetCompanySizes` to the company sizes your candidates typically work at, and weight `intentSignals` heavily so actively hiring companies score highest. The intent signals dimension detects job posting activity in the lead record directly.

#### Market segmentation and research

Analysts who collect company data for market research use the scoring engine to segment a broad universe of companies into tiers. The per-dimension factor scores reveal where a segment is strong or weak across the six dimensions — useful for characterising an addressable market before building a go-to-market strategy.

### How to score leads against your ICP

1. **Provide your leads** — paste a JSON array of lead objects into the "Leads (inline)" field, or enter the dataset ID from an upstream actor run (e.g. from [B2B Lead Gen Suite](https://apify.com/ryanclinton/b2b-lead-gen-suite)) into the "Dataset ID" field.
2. **Define your ICP** — fill in Target Industries (e.g. "Marketing Agency", "SaaS"), Target Company Sizes (e.g. "11-50", "51-200"), and Target Services (e.g. "SEO", "PPC"). Leave a field blank to treat that dimension as neutral.
3. **Run the actor** — click "Start" and wait. Scoring 100 leads typically takes under 10 seconds. Scoring 10,000 leads takes 2–3 minutes.
4. **Download results** — open the Dataset tab, filter to records where `icpGrade` is "A" or "B", and export as CSV, JSON, or Excel. The output is sorted by score descending by default.

### Input parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `leads` | array | One of these | — | Array of lead objects to score inline. Use this OR `datasetId`. |
| `datasetId` | string | One of these | — | Apify dataset ID to load leads from. Use when chaining with an upstream actor. |
| `mode` | string | No | `"auto"` | Scoring preset: `auto` (picks based on cohort + data), `fast` (industry+size weighted), `balanced` (default 25/20/20/15/10/10), `thorough` (intent + completeness up-weighted). |
| `persona` | string | No | `"generic"` | Persona preset: `outbound-sdr` (contact + intent), `account-exec` (industry + size), `growth-marketer` (intent + completeness), `generic` (no persona bias). |
| `outputProfile` | string | No | `"standard"` | Record shape: `minimal` (decision + action only), `standard` (default — full decision layer minus scoringTrace), `full` (every field), `llm` (agent-optimised: summary + why + opening angle + scoringTrace). |
| `qualifyThreshold` | integer | No | `65` | Score at or above this → `decision: "qualify"`. Default = grade B threshold. |
| `disqualifyThreshold` | integer | No | `35` | Score below this → `decision: "disqualify"`. Default = grade D threshold. Between thresholds → `nurture`. |
| `csvExport` | boolean | No | `true` | Write `OUTPUT.csv` with Apollo/Outreach.io/Salesloft compatible columns to the run's Key-Value Store. |
| `targetIndustries` | array | No | `["Marketing Agency", "Digital Agency"]` | Industry names matching your ICP. Fuzzy matching and aliases applied automatically. |
| `targetCompanySizes` | array | No | `["11-50", "51-200"]` | Employee bands matching your ICP. Accepts range strings, plain numbers, or labels like "small". |
| `targetServices` | array | No | `["SEO", "Content Marketing"]` | Services your ideal clients offer. Matched against lead's `services` field with synonym expansion. |
| `targetTechStack` | array | No | `[]` | Technologies your ideal clients use. Matched against lead's `techStack` and `techKeywords` fields. |
| `weightIndustry` | integer | No | preset | Override the preset's industry weight (0-100). Leave blank to use mode + persona resolution. |
| `weightCompanySize` | integer | No | preset | Override the preset's company size weight. |
| `weightServices` | integer | No | preset | Override the preset's services weight. |
| `weightContactPresence` | integer | No | preset | Override the preset's contact presence weight. |
| `weightIntentSignals` | integer | No | preset | Override the preset's intent signals weight. |
| `weightDataCompleteness` | integer | No | preset | Override the preset's data completeness weight. |
| `minScoreToInclude` | integer | No | `0` | Exclude leads below this score from output. Filter runs BEFORE charging — filtered leads are never pushed and never charged. |
| `outputSortedByScore` | boolean | No | `true` | Sort output descending by `icpScore` so best leads appear first. |
| `maxLeads` | integer | No | `10000` | Safety cap on leads processed. Prevents runaway costs on very large datasets. |
| `watchlistName` | string | No | — | Set to enable cross-run trend tracking. Two runs with the same watchlistName attach `temporalSignals` (trend / momentum / scoreDelta / runsSeen / reengage flag) to each lead by canonical entityId. |
| `monitorStateKey` | string | No | — | Suite-aligned alias for `watchlistName`. Either input works; if both are set, `watchlistName` wins. Use this for one consistent field name across `lead-scoring-engine`, `waterfall-contact-enrichment`, `phone-number-finder`, `bulk-email-verifier`, `company-deep-research`, and `lead-enrichment-pipeline`. |
| `lastAction` | object | No | — | Closes the feedback loop. Pass `{ type, takenAt: ISO date, note? }` to tell the actor what action you took on this watchlist since the last run. On the next scheduled run the actor compares ICP scores against the snapshot at action time and emits `decisionMemory` with an inferred outcome. Honest: only signal-change is observable. Requires `watchlistName` / `monitorStateKey`. |

#### Input examples

**Score leads from an upstream actor run (most common pipeline use):**

```json
{
  "datasetId": "aBcDeFgHiJkLmNoP",
  "targetIndustries": ["Marketing Agency", "Digital Agency"],
  "targetCompanySizes": ["11-50", "51-200"],
  "targetServices": ["SEO", "PPC", "Content Marketing"],
  "targetTechStack": ["HubSpot", "Google Analytics"],
  "minScoreToInclude": 50,
  "outputSortedByScore": true
}
```

**Score inline leads with custom ICP weights (B2B SaaS targeting):**

```json
{
  "leads": [
    {
      "domain": "pinnacletech.io",
      "companyName": "Pinnacle Technologies",
      "industry": "SaaS",
      "services": ["CRM", "Marketing Automation"],
      "companySize": "51-200",
      "emails": ["hello@pinnacletech.io"],
      "contacts": [{ "name": "James Okafor", "title": "CEO", "email": "j.okafor@pinnacletech.io" }],
      "phones": ["+44 20 7946 0123"],
      "techStack": ["HubSpot", "Salesforce"],
      "rating": 4.8,
      "reviewCount": 214,
      "hasChatWidget": true,
      "hasContactForm": true,
      "city": "London",
      "country": "UK",
      "foundedYear": 2016,
      "description": "B2B SaaS platform for marketing operations teams."
    }
  ],
  "targetIndustries": ["SaaS", "Software"],
  "targetCompanySizes": ["51-200", "201-500"],
  "targetServices": ["CRM", "Marketing Automation"],
  "targetTechStack": ["HubSpot", "Salesforce"],
  "weightIndustry": 30,
  "weightCompanySize": 25,
  "weightServices": 20,
  "weightContactPresence": 15,
  "weightIntentSignals": 5,
  "weightDataCompleteness": 5,
  "outputSortedByScore": true
}
```

**Quick filter — keep only grade A leads, no ICP on services:**

```json
{
  "datasetId": "xYz123datasetId",
  "targetIndustries": ["Ecommerce", "Online Retail"],
  "targetCompanySizes": ["11-50", "51-200", "201-500"],
  "minScoreToInclude": 80,
  "outputSortedByScore": true
}
```

#### Input tips

- **Start with the default weights** — industry 25, company size 20, services 20, contact 15, intent 10, completeness 10 covers most B2B agency use cases without adjustment.
- **Leave unused targets empty** — if you don't care about tech stack, set `targetTechStack: []`; the services dimension returns a neutral 50 (rather than 0) when no target is configured, so leads are not penalised for it.
- **Use `minScoreToInclude` to reduce output volume** — setting it to 50 cuts typical output by 30–50% and makes the dataset easier to action in a CRM import.
- **Batch all leads in one run** — scoring 500 leads in one run is significantly faster than 500 single-lead runs; load time and actor startup overhead are paid once per run.
- **Pass `datasetId` from upstream actors** — the actor reads any Apify dataset directly; there is no need to download and re-upload data between pipeline steps.

### Output example

**Scored lead (recordType: "lead"):**

```json
{
  "schemaVersion": "1.1.0",
  "recordType": "lead",
  "eventId": "a3f2c8d4e1b07f29",
  "domain": "brightedge.com",
  "companyName": "BrightEdge",
  "industry": "Marketing Agency",
  "services": ["SEO", "Content Marketing", "Analytics"],
  "companySize": "51-200",
  "emails": ["hello@brightedge.com"],
  "contacts": [{ "name": "Sarah Chen", "title": "Head of SEO", "email": "s.chen@brightedge.com" }],
  "icpScore": 87.5,
  "icpGrade": "A",
  "icpFactors": {
    "industryMatch": 100,
    "companySizeMatch": 100,
    "servicesMatch": 100,
    "contactPresence": 100,
    "intentSignals": 100,
    "dataCompleteness": 100
  },
  "icpNotes": ["Industry (25pts weight): Exact industry match: 'Marketing Agency' in lead data", "..."],
  "decision": "qualify",
  "confidence": {
    "score": 0.91,
    "level": "high",
    "components": [
      { "name": "industrySignal", "weight": 0.25, "value": 1 },
      { "name": "companySizeSignal", "weight": 0.15, "value": 1 },
      { "name": "contactPresence", "weight": 0.25, "value": 1 },
      { "name": "dataCompleteness", "weight": 0.20, "value": 1 },
      { "name": "intentSignals", "weight": 0.15, "value": 1 }
    ]
  },
  "confidenceLevel": "high",
  "recommendedAction": {
    "actionId": "outreach-now",
    "label": "Send personalised cold email this week.",
    "owner": "sdr",
    "eta": "this-week",
    "costEstimate": "medium"
  },
  "task": {
    "id": "9f2e1c4d8a07b3f1",
    "kind": "outreach",
    "target": "brightedge.com",
    "payload": { "decision": "qualify", "actionId": "outreach-now", "label": "...", "companyName": "BrightEdge", "domain": "brightedge.com" },
    "owner": "sdr",
    "deadline": "2026-05-11T09:22:31.000Z",
    "dryRun": true
  },
  "agentContract": {
    "decision": "qualify",
    "confidence": 0.91,
    "nextAction": "Send personalised cold email this week.",
    "costToAct": "medium"
  },
  "send": "yes",
  "shouldAct": true,
  "summary": "BrightEdge — 87.5 (grade A); qualify. Next: Send personalised cold email this week.",
  "whyThisMatters": "BrightEdge qualifies because industry + company size + services aligned — predicts above-baseline conversion.",
  "whyNow": "Active buying signals (high reviews/hiring/engagement) — reach out before competitors do.",
  "openingAngle": "Saw BrightEdge — 51-200 Marketing Agency doing SEO + Content Marketing. Curious how you're handling [your-product-fit] right now.",
  "scoringTrace": [
    { "rule": "industryMatch", "weight": 25, "rawScore": 100, "contribution": 25 },
    { "rule": "companySizeMatch", "weight": 20, "rawScore": 100, "contribution": 20 },
    { "rule": "servicesMatch", "weight": 20, "rawScore": 100, "contribution": 20 }
  ],
  "decisionRisk": {
    "downsideIfWrong": "Outreach effort wasted on a non-fit lead.",
    "upsideIfRight": "Closed deal in pipeline; ICP-aligned customer with high LTV.",
    "asymmetryRatio": 8
  },
  "warnings": [],
  "contradictions": [],
  "dataGaps": [],
  "buyingCommittee": {
    "decisionMaker": [],
    "champion": [{ "name": "Sarah Chen", "title": "Head of SEO", "email": "s.chen@brightedge.com" }],
    "blocker": [],
    "user": []
  },
  "qualificationRisk": 9,
  "coldStart": false,
  "agenticReadiness": 100,
  "priorityRank": 1,
  "percentileInCohort": 100,
  "scoredAt": "2026-05-04T09:22:31.000Z"
}
```

**Run summary record (recordType: "summary", last in the dataset):**

```json
{
  "schemaVersion": "1.1.0",
  "recordType": "summary",
  "runId": "abc123runid",
  "totalInput": 150,
  "totalScored": 127,
  "totalPushed": 127,
  "filteredOut": 23,
  "minScoreFilter": 50,
  "averageScore": 68.4,
  "topScore": 92.5,
  "gradeDistribution": { "A": 18, "B": 41, "C": 35, "D": 21, "F": 12 },
  "cohort": {
    "n": 127, "mean": 68.4, "stdev": 14.2, "median": 67.5,
    "p25": 58.0, "p75": 78.5, "p90": 85.3,
    "gradeDistribution": { "A": 18, "B": 41, "C": 35, "D": 21, "F": 12 }
  },
  "coverage": { "requested": 150, "scored": 127, "qualified": 59, "nurtured": 47, "disqualified": 21, "errored": 0 },
  "trust": { "provenance": "apify-dataset:abcDEF...", "sourceCoverage": 0.78, "conflictCount": 4, "sampleSize": 150 },
  "notifications": [],
  "modeUsed": "balanced",
  "personaUsed": "outbound-sdr",
  "outputProfile": "standard",
  "weightsUsed": { "industry": 22.5, "companySize": 17.5, "services": 17.5, "contactPresence": 20, "intentSignals": 12.5, "dataCompleteness": 10 },
  "icpConfig": {
    "targetIndustries": ["Marketing Agency", "Digital Agency"],
    "targetCompanySizes": ["11-50", "51-200"],
    "targetServices": ["SEO", "PPC", "Content Marketing"],
    "targetTechStack": ["HubSpot", "Google Analytics"]
  },
  "spendingLimitReached": false,
  "chargedEvents": 127,
  "chargedUsd": 3.81,
  "summary": "Scored 127 of 150 leads | 59 qualify, 47 nurture, 21 disqualify | mode=balanced, persona=outbound-sdr",
  "scoredAt": "2026-05-04T09:22:45.000Z"
}
```

### Output fields

#### Per-lead record (`recordType: "lead"`)

| Field | Type | Description |
|---|---|---|
| `schemaVersion` | string | Output contract version. Additive only across minor versions (`1.1.0`). |
| `recordType` | string | Discriminator: `"lead"` for scored leads. |
| `entityId` | string | Stable cross-suite canonical id (sha256 of `domain + companyName`). Suite-aligned name; same join key as `waterfall-contact-enrichment`, `phone-number-finder`, `bulk-email-verifier`, `company-deep-research`, and `lead-enrichment-pipeline`. |
| `eventId` | string | Legacy alias of `entityId` (same value). Kept for back-compat with existing downstream pipelines. |
| `signalIndependence` | object | `{ score, distinctSourceCount, totalComponentCount, interpretation, warning? }`. Catches the "looks like 6 corroborating signals but really 1 echoed 6 times" trap. Aligned with `waterfall-contact-enrichment`, `company-deep-research`, `phone-number-finder`, and `bulk-email-verifier`. |
| `counterfactual` | object | `{ droppedComponent, withoutThisSignal: { score, level, grade }, interpretation }`. Drops the highest-weight ICP factor and recomputes — tells you whether the lead's grade is load-bearing on a single factor. |
| `decisionMemory` | object \| null | Closes the feedback loop when `lastAction` is provided as input. `{ outcome: 'engaged' \| 'no-response' \| 'no-change' \| 'resolved' \| 'too-soon-to-tell', daysSinceAction, confidence, inferenceMethod, epistemicStatus }`. Honest: only ICP-score movement is observable. |
| `decision` | string | Top-level routing: `"qualify"` \| `"nurture"` \| `"disqualify"`. |
| `confidence` | object | `{ score: 0-1, level: "high\|medium\|low\|very-low", components: [...] }`. |
| `confidenceLevel` | string | Banded confidence string for branching: `high` (≥0.8), `medium` (≥0.6), `low` (≥0.4), `very-low` (<0.4). |
| `recommendedAction` | object | `{ actionId, label, owner, eta, costEstimate }`. ActionIds: `outreach-now`, `personalised-outreach`, `nurture-campaign`, `enrich-first`, `skip`. |
| `task` | object | DryRun task: `{ id, kind, target, payload, owner, deadline, dryRun: true }`. Kinds: `outreach`, `nurture`, `enrich`, `archive`. |
| `agentContract` | object | Compact agent surface: `{ decision, confidence, nextAction, costToAct }`. |
| `send` | string | Outreach send decision: `"yes"` \| `"no"` \| `"hold"`. |
| `shouldAct` | boolean | Hard gate for auto-execution. True only when decision=qualify and contact data is present. |
| `summary` | string | ≤280-char plain-English summary. LLM-friendly. |
| `whyThisMatters` | string | ≤200-char rationale for the decision. |
| `whyNow` | string | ≤200-char timing rationale (only present when decision=qualify). |
| `openingAngle` | string | ≤200-char first-touch sentence referencing something specific to the lead. |
| `icpScore` | number | Overall ICP score, 0–100 (one decimal place). Higher is better. |
| `icpGrade` | string | Letter grade: A (80–100), B (65–79), C (50–64), D (35–49), F (0–34). |
| `icpFactors` | object | `industryMatch`, `companySizeMatch`, `servicesMatch`, `contactPresence`, `intentSignals`, `dataCompleteness` — each 0-100 raw before weight application. |
| `icpNotes` | string\[] | Per-dimension text explanations. |
| `scoringTrace` | object\[] | Per-rule reproducibility: `[{ rule, weight, rawScore, contribution }, ...]`. |
| `decisionRisk` | object | `{ downsideIfWrong, upsideIfRight, asymmetryRatio }`. |
| `warnings` | object\[] | `[{ severity: "critical\|warning\|info", code, message }, ...]`. |
| `contradictions` | object\[] | Pairs of signals pointing opposite ways. Empty when no conflict detected. |
| `dataGaps` | object\[] | `[{ field, reason, suggestedFix }, ...]` — missing fields with the right enrichment actor to fix them. |
| `fixPlan` | object | Ordered remediation steps. Present only when `dataGaps` is non-empty. |
| `nextBestActorSlug` | string | Apify actor slug to chain after this one (e.g., `ryanclinton/lead-enrichment-pipeline`). |
| `buyingCommittee` | object | `{ decisionMaker[], champion[], blocker[], user[] }` — present when contacts have titles. |
| `qualificationRisk` | number | 0-100; inverse of confidence. Higher = more risk the decision is wrong. |
| `coldStart` | boolean | True when cohort has <25 leads. Confidence is capped at 0.5 in this case. |
| `agenticReadiness` | number | 0-100; how well-equipped this record is for an agent loop to act on. |
| `priorityRank` | number | 1-indexed rank in the (sorted) cohort. 1 = top lead. |
| `percentileInCohort` | number | 0-100; this lead's score percentile within the run's cohort. |
| `scoredAt` | string | ISO 8601 timestamp. |
| *(all original lead fields)* | mixed | Every field from the input lead record is preserved unchanged in the output. |

#### Run summary record (`recordType: "summary"`)

| Field | Type | Description |
|---|---|---|
| `schemaVersion` / `recordType` | string | `"1.1.0"` / `"summary"`. |
| `runId` | string | Apify run ID (or `local-<timestamp>` outside the platform). |
| `totalInput` / `totalScored` / `totalPushed` / `filteredOut` | number | Cohort funnel counts. |
| `cohort` | object | `{ n, mean, stdev, median, p25, p75, p90, gradeDistribution }`. |
| `coverage` | object | `{ requested, scored, qualified, nurtured, disqualified, errored }`. |
| `trust` | object | `{ provenance, sourceCoverage, conflictCount, sampleSize }`. |
| `notifications` | object\[] | Auto-generated alerts (low coverage, cold start, zero qualified, ICP overfit). |
| `modeUsed` / `personaUsed` / `outputProfile` | string | Resolved preset values (after auto-resolution if `mode: "auto"`). |
| `weightsUsed` | object | The normalised weights actually applied. |
| `chargedEvents` / `chargedUsd` | number | PPE charge totals for this run. |
| `spendingLimitReached` | boolean | `true` if the PPE spending cap was hit mid-run. |
| `summary` | string | One-line summary. |
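
The `cohort` block and the per-lead `priorityRank` / `percentileInCohort` fields can be approximated as below. This is a minimal sketch, not the actor's code: the percentile interpolation method, the use of population standard deviation, and the "share of cohort scoring at or below this lead" definition of `percentileInCohort` are all assumptions, and `cohort_pass` is a hypothetical name.

```python
import statistics


def cohort_pass(scores: list[float]) -> dict:
    """Compute cohort statistics plus a rank and percentile per score.

    Needs at least two scores, because statistics.quantiles() requires them.
    """
    # 99 cut points; index 24 is p25, index 74 is p75, index 89 is p90.
    cuts = statistics.quantiles(scores, n=100, method="inclusive")
    ranked = sorted(scores, reverse=True)
    per_lead = [
        {
            "score": s,
            # 1-indexed position in the descending sort; ties share a rank.
            "priorityRank": ranked.index(s) + 1,
            # Assumed definition: share of the cohort scoring <= this lead.
            "percentileInCohort": round(
                100 * sum(1 for other in scores if other <= s) / len(scores), 1
            ),
        }
        for s in scores
    ]
    return {
        "n": len(scores),
        "mean": round(statistics.mean(scores), 1),
        "stdev": round(statistics.pstdev(scores), 1),  # population stdev (assumed)
        "median": statistics.median(scores),
        "p25": cuts[24],
        "p75": cuts[74],
        "p90": cuts[89],
        "perLead": per_lead,
    }
```

Under this definition the top-scoring lead always lands at `percentileInCohort: 100`, which matches the scored-lead example earlier in this document.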

#### Key-Value Store mirrors

| Key | Format | Description |
|---|---|---|
| `SUMMARY` | JSON | Same content as the summary dataset record — pin this for cross-run polling. |
| `OUTPUT` | JSON | Top 25 decisions in compact shape — fast for dashboard polling without listing the dataset. |
| `OUTPUT.csv` | CSV | Apollo / Outreach.io / Salesloft compatible columns. Written when `csvExport: true` (default). |
| `RECEIPTS` | JSON | Per-charge audit trail: `[{ timestamp, action, cost, eventId }, ...]`. |
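
To poll a mirror instead of listing the whole dataset, a small helper along these lines should work with the official Python client. It is a sketch under stated assumptions: `read_mirror` is a hypothetical helper name, and `run` is the dictionary returned by `client.actor(...).call(...)` as in the Python API example in this document.

```python
from typing import Any, Optional


def read_mirror(client: Any, run: dict, key: str = "SUMMARY") -> Optional[Any]:
    """Return the value stored under `key` in the run's default Key-Value Store.

    `client` is expected to be an apify_client.ApifyClient instance.
    """
    store_id = run["defaultKeyValueStoreId"]
    record = client.key_value_store(store_id).get_record(key)
    # get_record() returns None when the key does not exist,
    # for example OUTPUT.csv when csvExport is false.
    return record["value"] if record else None
```

Usage: `summary = read_mirror(client, run)` returns the summary record, and `read_mirror(client, run, "RECEIPTS")` returns the per-charge audit trail.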

### How much does it cost to score leads?

Lead Scoring Engine uses **pay-per-event pricing** — you pay **$0.03 per lead scored**. Platform compute costs are included. Scoring happens in-process with no external API calls, so there are no variable costs from third-party services.

| Scenario | Leads | Cost per lead | Total cost |
|---|---|---|---|
| Quick test | 10 | $0.03 | $0.30 |
| Small batch | 100 | $0.03 | $3.00 |
| Medium batch | 500 | $0.03 | $15.00 |
| Large batch | 2,000 | $0.03 | $60.00 |
| Enterprise | 10,000 | $0.03 | $300.00 |

You can set a **maximum spending limit** per run in the Apify console. The actor stops when your budget is reached and marks `spendingLimitReached: true` in the summary record. A per-charge audit trail is written to the run's Key-Value Store under the `RECEIPTS` key.

**Charges fire only after a lead is pushed to the dataset.** If `minScoreToInclude` filters a lead out, it is never pushed and never charged. The PPE charge happens *after* `pushData` succeeds, so a network or platform failure cannot leave you paying for output that didn't arrive.
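
Because only pushed leads are charged, the expected cost of a run depends on how many leads clear `minScoreToInclude`. The sketch below estimates it from a list of scores; `estimate_cost` is a hypothetical helper, and the $0.03 price is the figure quoted in this section.

```python
PRICE_PER_LEAD_USD = 0.03  # per-event price quoted in this section


def estimate_cost(scores: list[float], min_score_to_include: float = 0) -> float:
    """Estimate run cost: only leads at or above the filter are pushed and charged."""
    pushed = [s for s in scores if s >= min_score_to_include]
    return round(len(pushed) * PRICE_PER_LEAD_USD, 2)
```

A spending limit set in the Apify console can stop the run earlier, in which case the real charge is lower than this estimate.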

Compare this to manual qualification: an SDR spending 3 minutes qualifying each lead costs $25–50 per hour, meaning 100 leads cost $125–250 in labour. With this actor the same 100 leads cost $3.00 and return in under 30 seconds.

### Score leads using the API

#### Python

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/lead-scoring-engine").call(run_input={
    "datasetId": "aBcDeFgHiJkLmNoP",
    "targetIndustries": ["Marketing Agency", "Digital Agency"],
    "targetCompanySizes": ["11-50", "51-200"],
    "targetServices": ["SEO", "PPC", "Content Marketing"],
    "minScoreToInclude": 65,
    "outputSortedByScore": True,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("recordType") == "summary":
        print(f"Run complete — A:{item['gradeDistribution']['A']} B:{item['gradeDistribution']['B']} avg:{item['averageScore']}")
        print(f"Coverage: {item['coverage']}  |  Mode: {item['modeUsed']}, Persona: {item['personaUsed']}")
    elif item.get("recordType") == "lead":
        print(f"{item.get('companyName')} | Score: {item['icpScore']} | Decision: {item['decision']} | Action: {item['recommendedAction']['label']}")
```

#### JavaScript

```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/lead-scoring-engine").call({
    datasetId: "aBcDeFgHiJkLmNoP",
    targetIndustries: ["Marketing Agency", "Digital Agency"],
    targetCompanySizes: ["11-50", "51-200"],
    targetServices: ["SEO", "PPC", "Content Marketing"],
    minScoreToInclude: 65,
    outputSortedByScore: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    if (item.recordType === "summary") {
        console.log(`Run complete — avg ${item.averageScore}, qualified ${item.coverage.qualified}, mode=${item.modeUsed}, persona=${item.personaUsed}`);
    } else if (item.recordType === "lead") {
        console.log(`${item.companyName} — ${item.icpScore} (${item.icpGrade}) → ${item.decision}: ${item.recommendedAction.label}`);
    }
}
```

#### cURL

```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~lead-scoring-engine/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "datasetId": "aBcDeFgHiJkLmNoP",
    "targetIndustries": ["Marketing Agency", "Digital Agency"],
    "targetCompanySizes": ["11-50", "51-200"],
    "targetServices": ["SEO", "PPC", "Content Marketing"],
    "minScoreToInclude": 65,
    "outputSortedByScore": true
  }'

# Fetch results once the run has finished (replace DATASET_ID with the defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```

### How Lead Scoring Engine works

#### Weight normalisation

When the actor starts, `normaliseWeights()` in `scorer.ts` accepts the six raw weight inputs and forces them to sum exactly to 100. It clamps each value to zero minimum, sums all six, and scales each proportionally using `weight = rawWeight / total * 100`, rounded to one decimal place. If all weights are zero (an edge case), the defaults (25/20/20/15/10/10) are restored. This means you can set weights like 3/2/2/1/1/1 and the engine will scale them correctly to 30/20/20/10/10/10 — no manual arithmetic required.
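
The normalisation rule can be sketched in Python as follows. This is an illustration of the arithmetic described above, not the `scorer.ts` source; the dictionary keys and the `normalise_weights` name are assumptions, and the sketch expects all six weights to be supplied.

```python
DEFAULT_WEIGHTS = {
    "industry": 25,
    "companySize": 20,
    "services": 20,
    "contactPresence": 15,
    "intentSignals": 10,
    "dataCompleteness": 10,
}


def normalise_weights(raw: dict[str, float]) -> dict[str, float]:
    """Clamp each weight at zero, then scale the six so they total 100."""
    clamped = {name: max(0.0, float(raw.get(name, 0))) for name in DEFAULT_WEIGHTS}
    total = sum(clamped.values())
    if total == 0:
        # Edge case: every weight is zero, so fall back to the defaults.
        return dict(DEFAULT_WEIGHTS)
    return {name: round(value / total * 100, 1) for name, value in clamped.items()}
```

Passing 3/2/2/1/1/1 gives a total of 10, so each weight is multiplied by 10 and the result is 30/20/20/10/10/10.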

#### Per-dimension scoring

Each of the six dimension modules (`dimensions/industry.ts`, `dimensions/company-size.ts`, etc.) receives the raw lead record and the relevant ICP targets, and returns a `DimensionResult` object with a `score` (0–100), a `maxScore` (always 100), and a `notes` array.

- **Industry**: reads `industry`, `vertical`, `category`, `niche`, and `services` fields; resolves targets through a 10-group alias map; returns 100 for exact match, 60 for partial substring match, 50 for no-target (neutral), 0 for no match
- **Company size**: parses numeric counts, range strings ("11-50"), and text labels ("mid-market") into one of 8 employee bands; returns 100 for exact band match, 50 for adjacent band (±1 position on the ordered band list), 0 for all others
- **Services**: expands each target through a 15-service synonym library; checks `services` and `techStack` fields; scores 100 for 3+ matches, 80 for 2, 50 for 1, 0 for none
- **Contact presence**: additive scoring — 40 pts for validated email (regex `/^[^@\s]+@[^@\s]+\.[^@\s]+$/`), 20 pts for named contact or LinkedIn URL, 20 pts for phone, 20 pts for 2+ named contacts or a titled contact
- **Intent signals**: additive scoring — 30 pts for rating ≥4.5 or reviewCount ≥100, 25 pts for active hiring (jobPostings or hiringCount), 25 pts for chat widget or contact form, 20 pts for intent/tech keywords; returns neutral 50 if no signal data present
- **Data completeness**: 5 groups × 20 pts each — identity (domain/website), company name, contact info (email or phone), location (address/city/country), and profile (description, foundedYear, revenue, or minProjectSize)
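
Two of the dimensions above, sketched in Python to make the point values concrete. This is a simplified illustration rather than the actor's modules: the `linkedinUrl` field name is an assumption, only the top-level `emails` array is checked, and the services matcher does a case-insensitive exact match instead of the synonym expansion the actor applies.

```python
import re

# Same pattern as quoted in the contact presence bullet above.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def contact_presence_score(lead: dict) -> int:
    """Additive contact score: 40 + 20 + 20 + 20 = 100 at most."""
    contacts = lead.get("contacts") or []
    named = [c for c in contacts if c.get("name")]
    titled = [c for c in contacts if c.get("title")]
    score = 0
    if any(EMAIL_RE.match(e) for e in lead.get("emails") or []):
        score += 40  # at least one well-formed email
    if named or lead.get("linkedinUrl"):  # field name is an assumption
        score += 20  # named contact or LinkedIn URL
    if lead.get("phones"):
        score += 20  # phone number present
    if len(named) >= 2 or titled:
        score += 20  # 2+ named contacts, or a contact with a title
    return score


def services_score(lead: dict, target_services: list[str]) -> int:
    """Tiered services score: 3+ matches 100, 2 matches 80, 1 match 50, none 0."""
    if not target_services:
        return 50  # neutral when no target is configured
    offered = {
        s.lower()
        for s in (lead.get("services") or []) + (lead.get("techStack") or [])
    }
    matches = sum(1 for target in target_services if target.lower() in offered)
    if matches >= 3:
        return 100
    if matches == 2:
        return 80
    if matches == 1:
        return 50
    return 0
```

Run against the inline Pinnacle Technologies lead from the input examples, this sketch gives a contact presence of 100 and a services score of 80 for the targets `["CRM", "Marketing Automation"]`.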

#### Score assembly and grading

The final `icpScore` is computed as the sum of each dimension's proportional contribution: `(dimensionScore / 100) * dimensionWeight`. The six weighted values are summed and rounded to one decimal place. The grade thresholds are fixed: A ≥80, B ≥65, C ≥50, D ≥35, F <35. Both the score and grade are written directly onto the lead record alongside the per-dimension `icpFactors` and `icpNotes` arrays.
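
As a worked example of the assembly formula and the grade bands, the sketch below uses one shared set of dimension names for both factors and weights (an assumption made for brevity; the output record names factors `industryMatch`, `companySizeMatch`, and so on).

```python
def icp_score(factors: dict[str, float], weights: dict[str, float]) -> float:
    """Sum (dimensionScore / 100) * dimensionWeight, rounded to one decimal."""
    return round(sum(factors[name] / 100 * weights[name] for name in weights), 1)


def icp_grade(score: float) -> str:
    """Map a 0-100 score to the fixed grade bands."""
    if score >= 80:
        return "A"
    if score >= 65:
        return "B"
    if score >= 50:
        return "C"
    if score >= 35:
        return "D"
    return "F"


weights = {"industry": 25, "companySize": 20, "services": 20,
           "contactPresence": 15, "intentSignals": 10, "dataCompleteness": 10}
factors = {"industry": 100, "companySize": 50, "services": 80,
           "contactPresence": 100, "intentSignals": 50, "dataCompleteness": 80}

# Contributions: 25 + 10 + 16 + 15 + 5 + 8 = 79
score = icp_score(factors, weights)
grade = icp_grade(score)
```

Here an adjacent size band and a two-service match cost the lead 14 points between them, which is enough to drop it from grade A to grade B.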

#### Dataset loading and PPE charging

When `datasetId` is provided, the actor paginates through the dataset in 1,000-item batches using the Apify client's `listItems` with `limit` and `offset` parameters, accumulating records up to `maxLeads`. The run proceeds in two phases: **Phase 1** scores every lead in memory (deterministic, no I/O, no charging). **Phase 2** sorts the cohort, then iterates lead-by-lead, pushing each record to the dataset and charging the PPE event `"lead-scored"` *after* the push succeeds. If `eventChargeLimitReached` flips true mid-batch, the actor stops cleanly and writes `spendingLimitReached: true` in the summary. Charge-after-push means a network or platform failure cannot leave the customer paying for output that didn't arrive.
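
The charge-after-push ordering can be sketched with two injected callables. Everything here is illustrative: `push_and_charge`, `push`, and `charge` are hypothetical names standing in for the actor's dataset push and PPE charge calls, and `charge` is assumed to return `True` once the charge limit has been reached.

```python
from typing import Callable


def push_and_charge(
    leads: list[dict],
    push: Callable[[dict], None],
    charge: Callable[[], bool],
) -> dict:
    """Push each lead, then charge for it; stop once the charge limit is hit."""
    ordered = sorted(leads, key=lambda lead: lead["icpScore"], reverse=True)
    pushed = 0
    charged = 0
    limit_reached = False
    for lead in ordered:
        push(lead)  # if this raises, the charge below never runs
        pushed += 1
        limit_reached = charge()  # charge only after a successful push
        charged += 1
        if limit_reached:
            break  # stop cleanly; remaining leads are neither pushed nor charged
    return {
        "totalPushed": pushed,
        "chargedEvents": charged,
        "spendingLimitReached": limit_reached,
    }
```

Keeping the two calls in lockstep per lead, rather than pushing the whole batch and charging afterwards, is what bounds the worst case to one pushed-but-uncharged record if the run fails between the two steps.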

#### Decision layer (v1.1)

After dimension scoring, every record passes through a decision-derivation layer:

1. **Decision** — score is mapped to `qualify` / `nurture` / `disqualify` against `qualifyThreshold` (default 65) and `disqualifyThreshold` (default 35).
2. **Confidence** — five weighted components (industry signal, company-size signal, contact presence, data completeness, intent) produce a 0-1 score, banded into `high`/`medium`/`low`/`very-low`. Cohorts of <25 leads have confidence capped at 0.5 (cold-start protection).
3. **Recommended action** — derived from decision + persona + dataGaps. If the lead is qualifyable but missing an email, action is `enrich-first` (with the right enrichment actor named in the fix plan). If qualified and persona is `account-exec`, action is `personalised-outreach`.
4. **DryRun task** — a universal task object is built from the action: `{ id: hash(eventId+actionId), kind, target: domain, payload, owner, deadline, dryRun: true }`.
5. **Agent contract** — compact surface for downstream agents: `{ decision, confidence, nextAction, costToAct }`.
6. **Cohort pass** — after all leads are scored, cohort statistics (mean, stdev, p25/p75/p90) and `percentileInCohort` per lead are computed; `priorityRank` is assigned after sort.
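
Steps 1 and 2 can be sketched as below. The threshold mapping and the confidence bands follow the values stated in this document. The component weights are taken from the `confidence.components` array in the output example, and combining them as a plain weighted sum is an assumption, since the exact formula is not documented here.

```python
def decide(score: float, qualify_threshold: int = 65, disqualify_threshold: int = 35) -> str:
    """Map an ICP score to qualify / nurture / disqualify."""
    if score >= qualify_threshold:
        return "qualify"
    if score < disqualify_threshold:
        return "disqualify"
    return "nurture"


# Component weights as they appear in the output example.
CONFIDENCE_WEIGHTS = {
    "industrySignal": 0.25,
    "companySizeSignal": 0.15,
    "contactPresence": 0.25,
    "dataCompleteness": 0.20,
    "intentSignals": 0.15,
}


def confidence(components: dict[str, float], cohort_size: int) -> tuple[float, str]:
    """Return (score, level). The weighted sum is an assumed combination rule."""
    score = sum(w * components.get(name, 0.0) for name, w in CONFIDENCE_WEIGHTS.items())
    if cohort_size < 25:
        score = min(score, 0.5)  # cold-start cap for small cohorts
    score = round(score, 2)
    if score >= 0.8:
        level = "high"
    elif score >= 0.6:
        level = "medium"
    elif score >= 0.4:
        level = "low"
    else:
        level = "very-low"
    return score, level
```

Note the asymmetry in `decide`: a score exactly equal to `qualifyThreshold` qualifies, while a score exactly equal to `disqualifyThreshold` is nurtured rather than disqualified.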

#### Mode and persona presets

`mode` shapes WHAT scoring optimises for: `fast` (industry+size, ignore completeness), `balanced` (default 25/20/20/15/10/10), `thorough` (intent + completeness up-weighted). `auto` picks based on cohort size + estimated data richness.

`persona` shapes WHO scoring serves: `outbound-sdr` (contact + intent), `account-exec` (industry + size), `growth-marketer` (intent + completeness). When set, persona weights are averaged with mode weights, and per-dimension `weight*` overrides win.
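
The resolution order (mode preset, averaged with persona preset, then explicit overrides) can be sketched as follows. The `balanced` weights are the documented defaults. The `outbound-sdr` weights are hypothetical: they are not published in this document and were chosen so that the average reproduces the `weightsUsed` values in the run summary example. `resolve_weights` is likewise a hypothetical name.

```python
BALANCED = {"industry": 25, "companySize": 20, "services": 20,
            "contactPresence": 15, "intentSignals": 10, "dataCompleteness": 10}

# Hypothetical persona preset, inferred for illustration only.
OUTBOUND_SDR = {"industry": 20, "companySize": 15, "services": 15,
                "contactPresence": 25, "intentSignals": 15, "dataCompleteness": 10}


def resolve_weights(mode_weights, persona_weights=None, overrides=None):
    """Average mode and persona weights, then let explicit overrides win."""
    if persona_weights is None:
        resolved = dict(mode_weights)  # generic persona: no bias applied
    else:
        resolved = {
            name: (mode_weights[name] + persona_weights[name]) / 2
            for name in mode_weights
        }
    resolved.update(overrides or {})  # weight* inputs replace the resolved value
    return resolved
```

With `mode: "balanced"` and `persona: "outbound-sdr"` this yields 22.5/17.5/17.5/20/12.5/10. After an override the six values may no longer total 100, so the weight normalisation step described earlier would presumably run on the result.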

#### Stable enum tokens (additive across minor versions)

| Field | Tokens |
|---|---|
| `decision` | `qualify`, `nurture`, `disqualify` |
| `confidenceLevel` | `high`, `medium`, `low`, `very-low` |
| `recommendedAction.actionId` | `outreach-now`, `personalised-outreach`, `nurture-campaign`, `enrich-first`, `skip` |
| `task.kind` | `outreach`, `nurture`, `enrich`, `archive` |
| `send` | `yes`, `no`, `hold` |
| `mode` (input) | `auto`, `fast`, `balanced`, `thorough` |
| `persona` (input) | `generic`, `outbound-sdr`, `account-exec`, `growth-marketer` |
| `outputProfile` (input) | `minimal`, `standard`, `full`, `llm` |
| `recordType` | `lead`, `summary`, `error` |
| `severity` (in `warnings`, `notifications`) | `critical`, `warning`, `info` |

### Tips for best results

1. **Define at least `targetIndustries` and `targetCompanySizes` before anything else.** These two dimensions account for 45 points at default weights and have the largest impact on final scores. An ICP with only these two configured will already produce meaningful lead tiers.

2. **Use `minScoreToInclude: 50` for most pipeline use cases.** Leads scoring below 50 (grade C or lower) rarely convert without further qualification. Filtering them at scoring time reduces CRM clutter and downstream enrichment costs.

3. **Increase `weightContactPresence` for cold outreach campaigns.** If your SDRs need an email or phone to reach out, raise this weight to 25 or 30. Leads with no contact data will score significantly lower and sort below actionable leads.

4. **Increase `weightIntentSignals` for growth-focused targeting.** If you sell to companies that are actively scaling, raise the intent weight to 20 or 25. Leads with active hiring, high review volume, and engagement tools will rank higher than equivalently-sized but dormant companies.

5. **Chain with [Waterfall Contact Enrichment](https://apify.com/ryanclinton/waterfall-contact-enrichment) after scoring.** Run enrichment only on A and B grade leads (pass `minScoreToInclude: 65`). This cuts enrichment cost by 50–70% compared to enriching the entire raw list.

6. **Use the `icpNotes` field to diagnose score distribution issues.** If most leads are scoring 30–40 and you expect higher, read the notes on a few low-scoring records. A common cause is `targetCompanySizes` bands that don't match the size format in your lead data — for example, passing "small" when the lead has `employeeCount: 45` works, but passing "SMB" with no numeric count does not.

7. **Re-score the same dataset with different weights to test ICP hypotheses.** Because the engine is deterministic, you can run the same `datasetId` twice with different weight configs and compare grade distributions in the summary records to see which ICP definition produces the most A-grade leads from your existing data.

8. **Set `outputSortedByScore: false` if you need to preserve the original lead order** — for example, when the upstream dataset is sorted by Google Maps ranking or scraping order and you want to maintain that sequence for reporting.

### Use in Dify

Drop this actor into [Dify](https://docs.apify.com/platform/integrations/dify) workflows via the Apify plugin's Run Actor node. Each lead comes back scored, classified, and paired with a recommended action as structured JSON — `qualify` / `nurture` / `disqualify` plus a `shouldAct` boolean and a typed `recommendedAction.actionId` your downstream node branches on. Competitor pipelines pointed at the same lead list return raw scraped fields; this returns decisions you can route, gate, and ticket-fill from directly.

- **Actor ID:** `ryanclinton/lead-scoring-engine`
- **Sample input** (score an upstream scraper's dataset for an outbound SDR team):

```json
{
  "datasetId": "aBcDeFgHiJkLmNoP",
  "mode": "auto",
  "persona": "outbound-sdr",
  "outputProfile": "standard",
  "targetIndustries": ["Marketing Agency", "Digital Agency"],
  "targetCompanySizes": ["11-50", "51-200"],
  "targetServices": ["SEO", "PPC", "Content Marketing"],
  "qualifyThreshold": 65,
  "minScoreToInclude": 50,
  "csvExport": true
}
```

#### Dify if/else routing

A single Dify if/else node branches on `decision` and routes to the right downstream actor:

| Branch condition | Action |
|---|---|
| `decision == "qualify"` AND `shouldAct == true` | Run [AI Outreach Personalizer](https://apify.com/ryanclinton/ai-outreach-personalizer) on the lead. The actor's `recommendedAction.executionHint.targetActorSlug` already names this. |
| `decision == "qualify"` AND `recommendedAction.actionId == "enrich-first"` | Run [Lead Enrichment Pipeline](https://apify.com/ryanclinton/lead-enrichment-pipeline) on the lead first; re-score after enrichment. |
| `decision == "nurture"` | Push to the marketing nurture list (HubSpot, Customer.io, etc.). |
| `decision == "disqualify"` | Archive — skip outreach. |
| Summary record `recordType == "summary"` AND `decisionReadiness == "insufficient-data"` | Stop the workflow — cohort too small for confidence. Re-run with ≥25 leads. |

The full `recommendedAction` object is usable verbatim in a Dify Code node — `recommendedAction.label` becomes a Slack alert message, `recommendedAction.owner` becomes the assignee tag, `recommendedAction.eta` becomes the deadline, and `task.id` becomes a stable idempotency key for ticket creation.
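
The routing table above can be expressed as a small function, sketched here in plain Python under a few assumptions: the branch names returned are hypothetical labels, the `enrich-first` check is evaluated before the `shouldAct` check so that a lead matching both rows is enriched first, and anything unmatched falls through to a manual-review branch that the table does not define.

```python
def route(record):
    """Return a branch label for one output record (labels are hypothetical)."""
    if record.get("recordType") == "summary":
        if record.get("decisionReadiness") == "insufficient-data":
            return "stop-workflow"
        return "ignore-summary"

    decision = record.get("decision")
    action_id = (record.get("recommendedAction") or {}).get("actionId")

    # Assumption: enrichment takes precedence when both qualify rows match.
    if decision == "qualify" and action_id == "enrich-first":
        return "enrich-then-rescore"
    if decision == "qualify" and record.get("shouldAct") is True:
        return "personalised-outreach"
    if decision == "nurture":
        return "nurture-list"
    if decision == "disqualify":
        return "archive"
    return "manual-review"
```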

#### Opt-in modes Dify workflows can leverage

- `mode: "auto"` — Dify workflows usually pass heterogeneous batches; auto-resolve picks the right preset per call.
- `outputProfile: "llm"` — emits agent-optimised records with `summary`, `whyThisMatters`, `whyNow`, `openingAngle`, and `scoringTrace` only. Drops cleanly into a Dify LLM node prompt without preprocessing.
- `outputProfile: "minimal"` — emits decision + action only. Keeps Dify variable size small for high-throughput branching.
- `csvExport: true` — `OUTPUT.csv` lands in the run's Key-Value Store with Apollo / Outreach.io / Salesloft compatible columns. Read via the KV node and push directly to Apollo's CSV import endpoint.

The `recommendedAction` playbook needs no LLM rewriting before use. The `task` object is wire-compatible with the universal task schema (`id`, `kind`, `target`, `payload`, `owner`, `deadline`, `dryRun: true`) consumed by Jira / Linear / GitHub Issues integrations downstream.

### Combine with other Apify actors

| Actor | How to combine |
|---|---|
| [B2B Lead Gen Suite](https://apify.com/ryanclinton/b2b-lead-gen-suite) | Full pipeline: pass the output dataset ID from B2B Lead Gen Suite directly as `datasetId` to score every scraped lead against your ICP in one chained run |
| [Google Maps Email Extractor](https://apify.com/ryanclinton/google-maps-email-extractor) | Extract local business leads with emails from Google Maps, then score the output dataset to identify which local businesses match your agency's ICP |
| [Website Contact Scraper](https://apify.com/ryanclinton/website-contact-scraper) | Scrape contact details from a list of company websites, then score the enriched records to prioritise outreach by ICP fit |
| [Waterfall Contact Enrichment](https://apify.com/ryanclinton/waterfall-contact-enrichment) | Run scoring first; pass only A and B grade leads (score ≥65) into enrichment to reduce enrichment cost by 50–70% |
| [HubSpot Lead Pusher](https://apify.com/ryanclinton/hubspot-lead-pusher) | Push scored leads with `icpScore` and `icpGrade` fields into HubSpot as contact properties, then use HubSpot workflows to route by grade |
| [Bulk Email Verifier](https://apify.com/ryanclinton/bulk-email-verifier) | After scoring, verify emails only on A and B grade leads before handing to SDRs — avoids bounce rates from low-quality contacts |
| [B2B Lead Qualifier](https://apify.com/ryanclinton/b2b-lead-qualifier) | Use alongside this actor for a 30-signal deep-qualification pass on your top-scoring leads; Lead Scoring Engine provides the first filter, B2B Lead Qualifier provides the deep profile |
| [Lead Enrichment Pipeline](https://apify.com/ryanclinton/lead-enrichment-pipeline) | All-in-one Clay alternative: email discovery, verification, company research, and scoring in one run ($0.12/lead) |
| [AI Outreach Personalizer](https://apify.com/ryanclinton/ai-outreach-personalizer) | Generate personalized cold emails using your own OpenAI/Anthropic key — zero AI markup ($0.01/lead) |
| [Intent Signal Tracker](https://apify.com/ryanclinton/intent-signal-tracker) | Track buying signals: hiring, tech changes, funding, content updates. Prioritize outreach by intent score ($0.05/company) |
| [Lead Data Quality Auditor](https://apify.com/ryanclinton/enrichment-quality-auditor) | Audit lead data quality before outreach — email verification, phone validation, domain freshness ($0.005/record) |

### Limitations

- **No live data enrichment** — the actor scores only the fields already present in the lead record. If a lead is missing `industry` or `companySize`, those dimensions return 0 or neutral rather than fetching the data from an external source. Use [Waterfall Contact Enrichment](https://apify.com/ryanclinton/waterfall-contact-enrichment) upstream to add missing fields before scoring.
- **Industry matching is text-based** — the fuzzy match works against the 10 built-in alias groups. Industry terms outside those groups must match verbatim (exact or partial substring). Highly niche verticals (e.g. "maritime logistics", "precision agriculture") may not match aliases and will require exact string configuration.
- **Intent signals require upstream data** — the intent dimension scores neutrally (50) when no signal fields are present in the lead record. Rating, review count, job postings, and chat widget data must be scraped upstream (e.g. by [Google Maps Email Extractor](https://apify.com/ryanclinton/google-maps-email-extractor) or [Website Contact Scraper](https://apify.com/ryanclinton/website-contact-scraper)) and included in the lead object.
- **Company size bands are fixed** — the 8 bands (1–10, 11–50, 51–200, 201–500, 501–1000, 1001–5000, 5001–10000, 10001+) cannot be customised. Very large or very small employee thresholds that fall outside these bands cannot be expressed.
- **No automatic deduplication** — if the input leads array or dataset contains duplicate domain entries, each will be scored and charged separately. `enableDedup` flags duplicates but does not skip them, so deduplication should happen upstream.
- **Maximum 10,000 leads per run by default** — set `maxLeads` up to 100,000 if needed. Very large runs still fit in 256 MB of memory because the dataset is streamed in paginated batches.
- **Tech stack matching is substring-based** — tech stack matching checks for the target term as a substring of the combined lead text. If a lead stores tech as `"shopify-plus-theme"` and you target `"Shopify"`, it will match, but heavily abbreviated or encoded tech strings may not resolve correctly.

### Integrations

- [Zapier](https://apify.com/integrations/zapier) — trigger a lead scoring run when new leads are added to a Google Sheet or CRM, then write scored results back automatically
- [Make](https://apify.com/integrations/make) — build multi-step scenarios that scrape leads, score them, filter by grade, and push A/B leads to HubSpot or ActiveCampaign
- [Google Sheets](https://apify.com/integrations/google-sheets) — export scored leads directly to a Sheet for manual review, sorting by `icpScore` column to surface top prospects
- [Apify API](https://docs.apify.com/api/v2) — chain scoring runs programmatically using the dataset ID from any upstream run; retrieve results in JSON, CSV, or XLSX format
- [Webhooks](https://docs.apify.com/platform/integrations/webhooks) — trigger downstream actions (Slack notification, CRM push, email alert) when a run completes and the summary shows more than N grade-A leads
- [LangChain / LlamaIndex](https://docs.apify.com/platform/integrations) — feed scored lead data into an AI agent that generates personalised outreach copy ranked by `icpScore`, targeting only A and B grade leads

### Troubleshooting

**Most leads scoring 0 on the industry dimension despite having industry data.** Check that your `targetIndustries` values match the format in the lead record. The engine checks `industry`, `vertical`, `category`, `niche`, and `services` fields. If the lead stores industry as "Digital Marketing" and you target "Marketing Agency", the fuzzy alias resolves correctly. But if the lead has no industry field at all, the dimension returns 0. Inspect `icpNotes[0]` on a low-scoring record — it will state exactly which fields were read and what the target was.

**Company size dimension returning 0 despite employee data present.** The actor reads `employeeCount`, `teamSize`, `employees`, `headcount`, and `companySize` fields. Ensure at least one of these is present and contains a number or a parseable string like "45 employees", "11-50", or "mid-market". A field named `staff` or `team_size` (snake\_case) will not be read — map it to a supported field name upstream.
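
A minimal sketch of the upstream mapping step, assuming leads are plain dicts. The alias table is hypothetical and should be adjusted to whatever names your scraper emits; only the target names (`employeeCount`, `teamSize`) come from the supported list above.

```python
# Hypothetical upstream names mapped to field names the actor reads.
FIELD_ALIASES = {
    "staff": "employeeCount",
    "staff_count": "employeeCount",
    "team_size": "teamSize",
}


def normalise_lead(lead, aliases=FIELD_ALIASES):
    """Rename unsupported field names without clobbering supported ones."""
    out = {}
    for key, value in lead.items():
        target = aliases.get(key, key)
        # If the supported field already exists on the lead, keep it and
        # drop the aliased duplicate rather than overwriting.
        if target != key and target in lead:
            continue
        out[target] = value
    return out
```

Run this over each lead before passing the array as `leads`, or apply it in a transformation step between the scraper and the scoring run.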

**`spendingLimitReached: true` in the summary record.** The PPE spending cap was hit before all leads were processed. Leads processed before the limit was reached are in the dataset. Either increase the per-run spending limit in the Apify console, or reduce the number of leads per run by lowering `maxLeads`.

**Run completes but output dataset is empty.** This happens when `minScoreToInclude` is set too high for the data quality of the input leads. Check the summary record (it is always written regardless of the filter) for the grade distribution. If all leads are grade F or D, lower `minScoreToInclude` to 0, inspect the `icpNotes` on a sample of records, and adjust your ICP configuration accordingly.

**Charge count is lower than total input.** Expected behaviour as of v1.1 — `minScoreToInclude` filters BEFORE charging, and the charge fires only after a lead is pushed to the dataset. Total charges therefore equal `totalPushed` in the summary record, not `totalInput`. If you want every input lead scored, pushed, and therefore charged regardless of grade, leave `minScoreToInclude` at `0` (the default).

### Responsible use

- This actor processes only the lead data you supply — it does not scrape any websites or call any external APIs.
- When using scored lead data for outreach, comply with GDPR, CAN-SPAM, CASL, and other applicable data protection laws in your jurisdiction.
- Do not use scored lead data for spam, harassment, or unsolicited contact outside the terms of service of your outreach platform.
- Ensure you have a lawful basis for processing personal data (including email addresses) contained in the lead records you supply.
- For guidance on web scraping legality, see [Apify's guide](https://blog.apify.com/is-web-scraping-legal/).

### FAQ

**How does lead scoring against an ICP work?** The actor evaluates each lead across six dimensions — industry match, company size, services alignment, contact presence, intent signals, and data completeness — and combines weighted dimension scores into a single 0–100 number. Dimension weights default to 25/20/20/15/10/10 and can be customised. The final score determines a letter grade (A through F) using fixed thresholds.
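
The weighted combination can be sketched as a normalised weighted average. The default weights are taken from this README; dividing by the sum of the weights, rounding to an integer, and the dimension key names are assumptions rather than confirmed behaviour.

```python
DEFAULT_WEIGHTS = {
    "industry": 25, "companySize": 20, "services": 20,
    "contactPresence": 15, "intentSignals": 10, "dataCompleteness": 10,
}


def icp_score(factors, weights=DEFAULT_WEIGHTS):
    """Combine per-dimension 0-100 scores into one 0-100 score."""
    total_weight = sum(weights.values())
    weighted_sum = sum(factors[dim] * weight for dim, weight in weights.items())
    # Normalising by the weight total keeps the result on a 0-100 scale
    # even when custom weights do not add up to 100 (an assumption).
    return round(weighted_sum / total_weight)
```

With every dimension at the neutral value of 50, this returns 50, which matches the behaviour described for an unconfigured ICP.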

**How many leads can I score in one run?** Up to 100,000 (set via `maxLeads`). The default cap is 10,000. The actor paginates large datasets in 1,000-item batches to stay within the 256 MB memory allocation.

**Does lead scoring require exact field names?** The actor reads a defined set of field names (the industry and company-size fields are listed in the Troubleshooting section). If your upstream scraper uses different names (e.g. `staff_count` instead of `employeeCount`), map the fields before passing leads to the scoring engine. Renaming can be done in a Make/Zapier step or with a lightweight transformation actor.

**What happens if I don't configure any ICP targets?** Dimensions with no configured targets return a neutral score of 50, so with no ICP defined leads cluster around 50 (grade C) and differ only by contact presence and data completeness. This is useful for testing the pipeline before you have a defined ICP.

**How is Lead Scoring Engine different from B2B Lead Qualifier?** Lead Scoring Engine scores any lead record you supply against a configurable ICP using six dimensions. It is a computation layer in a pipeline, not a data source. [B2B Lead Qualifier](https://apify.com/ryanclinton/b2b-lead-qualifier) fetches and analyses 30+ signals from external sources about a company. The two work best together: score first to identify which leads are worth deeper qualification, then run B2B Lead Qualifier on grade-A leads only.

**Can I use custom ICP dimensions beyond the six built-in ones?** Not currently. The six dimensions (industry, company size, services, contact presence, intent signals, data completeness) cover the most common B2B qualification criteria. If you need additional dimensions — for example, a revenue threshold or geographic filter — pre-filter your leads before passing them to the actor.

**How accurate is the industry matching?** Exact matches (target term equals a field value verbatim or via alias) score 100% accurately. Partial matches (target term appears as a substring of combined field text) score 60 and are correct most of the time but can produce false positives — for example, targeting "PR" would partially match a company description mentioning "enterprise". For high-precision industry filtering, ensure your leads have a dedicated `industry` field from upstream enrichment.

**Is it legal to score leads using this actor?** Yes — the actor processes data you supply and makes no external data requests. The legality of your outreach depends on how you obtained the lead data and how you use it, not on the scoring computation itself. Ensure your lead acquisition and outreach comply with GDPR, CAN-SPAM, and relevant local laws.

**Can I schedule lead scoring to run automatically?** Yes. Use Apify's built-in scheduling to trigger a scoring run daily or weekly. Point `datasetId` at the output of a scheduled upstream actor (e.g. [Google Maps Email Extractor](https://apify.com/ryanclinton/google-maps-email-extractor)) and the scoring run will process new leads automatically as they are collected.

**How long does a typical run take?** Scoring 100 leads takes under 10 seconds. Scoring 1,000 leads takes approximately 30–60 seconds, primarily due to actor startup time. Scoring 10,000 leads takes 2–4 minutes. If `datasetId` points to a large dataset, loading time adds 10–20 seconds per 10,000 records retrieved.

**What is the difference between `icpScore` and `icpFactors`?** `icpScore` is the final weighted score (0–100) used for grading and sorting. `icpFactors` contains the raw 0–100 score for each dimension before weight application — useful for diagnosing which specific dimension is dragging a lead's score down.

**Can I run this actor from a Cursor or Claude workflow via MCP?** Yes. The Apify platform exposes actor runs through the Apify MCP server. You can call this actor from any LLM agent that supports MCP tool calls, passing leads inline and receiving scored results in the same response cycle.

### Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

1. Go to [Account Settings > Privacy](https://console.apify.com/account/privacy)
2. Enable **Share runs with public Actor creators**

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

### Support

Found a bug or have a feature request? Open an issue in the [Issues tab](https://console.apify.com/actors/lead-scoring-engine/issues) on this actor's page. For custom ICP configurations, pipeline integrations, or enterprise scoring volumes, reach out through the Apify platform.

# Actor input Schema

## `leads` (type: `array`):

Array of lead objects to score. Each object should have fields like domain, companyName, industry, services, emails, contacts, etc. Use this OR datasetId — not both.

## `datasetId` (type: `string`):

Apify dataset ID to load leads from. Use this instead of inline leads when chaining with another actor in a pipeline.

## `goal` (type: `string`):

What outcome to optimise for. 'pipeline-growth' accepts more leads (B/C also pass). 'quick-wins' biases hard toward intent + reachable contacts. 'cost-efficiency' penalises enrichment-required leads. 'high-ltv' biases enterprise + strong ICP. 'generic' = no goal bias. Goal layers ON TOP of mode + persona — per-dimension weights still win.

## `mode` (type: `string`):

Preset that shapes HOW scoring runs. 'auto' picks based on cohort size + data richness. 'fast' weights industry + size, ignores completeness. 'balanced' is the default 25/20/20/15/10/10 mix. 'thorough' up-weights intent + completeness for fully-enriched cohorts. Per-dimension weights below override the preset.

## `persona` (type: `string`):

Preset that shapes WHO scoring serves. SDRs need reachable contacts; AEs need fit before reach; Growth marketers prioritise intent + recency. 'generic' defers to mode preset.

## `outputProfile` (type: `string`):

Controls how much detail goes into each pushed record. 'minimal' = decision + action only. 'standard' = decision + factors + notes (drops scoringTrace). 'full' = everything. 'llm' = optimised for AI agents (summary, why, opening angle, scoring trace).

## `qualifyThreshold` (type: `integer`):

Score at or above this becomes decision='qualify' (default 65 = grade B+). Tune for your team's bandwidth.

## `disqualifyThreshold` (type: `integer`):

Score below this becomes decision='disqualify' (default 35 = grade D-). Leads between disqualify and qualify thresholds get decision='nurture'.
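
The two thresholds split the score range into three decisions. A minimal sketch of that mapping, using the documented defaults:

```python
def decide(score, qualify_threshold=65, disqualify_threshold=35):
    """Map a 0-100 ICP score to qualify / nurture / disqualify."""
    if score >= qualify_threshold:   # at or above the qualify threshold
        return "qualify"
    if score < disqualify_threshold:  # strictly below the disqualify threshold
        return "disqualify"
    return "nurture"                  # everything in between
```

Note the asymmetry at the boundaries: a score equal to `qualifyThreshold` qualifies, while a score equal to `disqualifyThreshold` is nurtured rather than disqualified.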

## `csvExport` (type: `boolean`):

When enabled, writes OUTPUT.csv with Apollo / Outreach.io / Salesloft compatible columns to the run's default Key-Value Store. Download from the Storage tab.

## `watchlistName` (type: `string`):

Set to enable cross-run trend tracking. Two runs with the same watchlistName attach `temporalSignals` (trend / momentumScore / scoreDelta / runsSeen / reengage flag) to each lead by canonical eventId. First run shows trend='new' for everything; subsequent runs surface rising/falling/re-engagement leads. Stored in a named KV store, capped at 25k leads FIFO.

## `monitorStateKey` (type: `string`):

Suite-aligned alias for watchlistName. Either input works; if both are set, watchlistName wins. Lets the same upstream orchestrator pass one consistent field name across lead-scoring-engine, waterfall-contact-enrichment, phone-number-finder, bulk-email-verifier, company-deep-research, and lead-enrichment-pipeline.

## `lastAction` (type: `object`):

Optional. Tells the actor what action you took on this watchlist since the last run. On the next scheduled run, the actor compares the current ICP score against the snapshot at action time and emits decisionMemory with an inferred outcome. Honest: only signal-change is observable — direct conversion / closed-deal / off-platform engagement are not. Shape: { type: 'sent-pitch' | 'qualified' | 'disqualified' | string, takenAt: ISO date, note?: string }. Requires watchlistName / monitorStateKey.

## `enableIcpInsights` (type: `boolean`):

Detect ICP-drift from the current run's top performers (grade A leads) and surface industries/sizes that aren't in your declared ICP. Adds an `icpInsights` block to the summary record with `topIndustries`, `topCompanySizes`, `topServices`, and an `icpVsTopPerformerDrift.suggestion` string. Pure compute on this run — no cross-run state needed.

## `enableDedup` (type: `boolean`):

Detect duplicate leads (by canonical domain) within this run's input. Each duplicate gets an `identity` block with `canonicalDomain`, `duplicateCount`, `duplicateRunIndices`, and `isCanonical`. The first occurrence is treated as canonical; later ones are flagged. Does not skip duplicates — flags them so you can choose how to merge upstream.
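
A sketch of what flag-only deduplication could look like. The `identity` field names come from the description above; the canonicalisation rules (lowercase, strip scheme, path, and a leading `www.`) and the reading of `duplicateCount` as the total number of occurrences are assumptions.

```python
from collections import defaultdict


def canonical_domain(domain):
    """Assumed canonical form: lowercase host without scheme, path or 'www.'."""
    d = domain.strip().lower()
    for prefix in ("https://", "http://"):
        if d.startswith(prefix):
            d = d[len(prefix):]
    d = d.split("/")[0]
    if d.startswith("www."):
        d = d[4:]
    return d


def flag_duplicates(leads):
    """Attach an identity block to every lead; nothing is dropped."""
    groups = defaultdict(list)
    for index, lead in enumerate(leads):
        groups[canonical_domain(lead["domain"])].append(index)

    flagged = []
    for index, lead in enumerate(leads):
        canon = canonical_domain(lead["domain"])
        indices = groups[canon]
        flagged.append({**lead, "identity": {
            "canonicalDomain": canon,
            "duplicateCount": len(indices),
            "duplicateRunIndices": indices,
            # The first occurrence in input order is treated as canonical.
            "isCanonical": index == indices[0],
        }})
    return flagged
```

To actually drop duplicates before a run, filter the result on `identity.isCanonical` and pass only those leads on.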

## `enableEconomics` (type: `boolean`):

Compute expectedValue per lead: conversion-probability proxy (from icpScore × intent × contact richness) × estimated deal size (industry × size proxy table) ÷ cost-to-act (enrichment + verification + SDR labour). Output: `expectedValue.expectedRoi`, `expectedRevenueUsd`, `costToActUsd`, plus `actionDecision: act|delay|ignore` driven by ROI. Industry deal-size proxies are conservative midpoints from public B2B benchmarks — override with `industryDealSizeOverrides` for accuracy.

## `industryDealSizeOverrides` (type: `object`):

Per-industry estimated deal size in USD. User-supplied overrides win against the proxy table. Example: { "SaaS": 25000, "Marketing Agency": 12000, "Real Estate": 8000 }. Applies size multiplier (small × 0.4, mid × 1.0, enterprise × 5.0) on top.
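
A small sketch of how an override and the size multiplier combine. The multipliers are the ones stated above; the function name, the `proxy_table` argument, and the way a lead is assigned to a `small` / `mid` / `enterprise` tier are hypothetical and left to the caller.

```python
SIZE_MULTIPLIERS = {"small": 0.4, "mid": 1.0, "enterprise": 5.0}


def estimated_deal_size(industry, size_tier, overrides, proxy_table=None):
    """Return the estimated deal size in USD, or None if no base is known."""
    # User-supplied overrides win against the built-in proxy table.
    base = overrides.get(industry)
    if base is None:
        base = (proxy_table or {}).get(industry)
    if base is None:
        return None
    return base * SIZE_MULTIPLIERS[size_tier]
```

With the example overrides, an enterprise SaaS lead comes out at 25000 × 5.0 = 125000 USD, and a small Marketing Agency lead at 12000 × 0.4 = 4800 USD.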

## `sdrCostPerTouch` (type: `integer`):

Override the default SDR labour cost per outreach touch. Default: $5 (12 minutes at $25/hr fully-loaded). Used in cost-to-act computation.

## `constraints` (type: `object`):

Resource limits for this run. When set, the actor sorts leads by ROI and selects the top set within constraints. Each lead gets `allocationDecision: { selected, reason, excludedDueTo, rankInAllocation }`. Example: { "maxOutreachPerRun": 50, "maxEnrichmentPerRun": 100, "budgetUsd": 200 }.
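
One way to picture the allocation is a greedy pass over leads sorted by ROI. This is a sketch under assumptions, not the actor's algorithm: it handles only `maxOutreachPerRun` and `budgetUsd`, reads hypothetical flat `expectedRoi` and `costToActUsd` keys from each lead, and omits the `reason` field.

```python
def allocate(leads, max_outreach=None, budget_usd=None):
    """Greedily select the highest-ROI leads that fit within the constraints."""
    ranked = sorted(leads, key=lambda lead: lead["expectedRoi"], reverse=True)
    selected_count = 0
    spent = 0.0
    for rank, lead in enumerate(ranked, start=1):
        excluded_due_to = None
        if max_outreach is not None and selected_count >= max_outreach:
            excluded_due_to = "maxOutreachPerRun"
        elif budget_usd is not None and spent + lead["costToActUsd"] > budget_usd:
            excluded_due_to = "budgetUsd"
        if excluded_due_to is None:
            selected_count += 1
            spent += lead["costToActUsd"]
        lead["allocationDecision"] = {
            "selected": excluded_due_to is None,
            "excludedDueTo": excluded_due_to,
            "rankInAllocation": rank,
        }
    return ranked
```

A greedy pass is simple and predictable, but it can skip a cheap lower-ROI lead that would still have fit the budget only if a pricier one had been passed over; whether the actor does anything smarter is not stated here.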

## `simulate` (type: `object`):

When set, the actor scores every lead twice — once with current weights, once with override weights — and emits a `simulation` block per lead showing the score delta and decision change. Use this to test ICP hypotheses without re-running. Example: { "weightIndustry": 30, "weightIntentSignals": 25 }. Doubles compute time but does NOT double PPE charges (you only pay for the primary score).

## `scorecardTemplate` (type: `string`):

Pre-built configuration bundle for a common GTM motion. 'local-agency-outbound' (SMB/mid-market agencies, balanced fit). 'b2b-saas-abm' (enterprise SaaS, AE-led, high-LTV bias, personal-email penalties). 'ecommerce-services' (DTC brands, growth-marketer persona). 'recruiter-sourcing' (intent-heavy for actively-hiring companies). 'custom' (no template — your inputs win). Template fields fill in only where you haven't set them.

## `outcomeDatasetId` (type: `string`):

Apify dataset ID containing past lead outcomes (won deals, meetings booked, revenue). The actor joins your scored leads against this dataset on `outcomeJoinKey` (default 'domain') and computes win rate per grade — proves whether the score is actually predictive. Without this, calibration uses benchmark priors only.

## `outcomeJoinKey` (type: `string`):

Field on outcome dataset records used to match against scored leads. Defaults to 'domain'. Both sides are normalised to canonical lowercase domain.

## `outcomeFields` (type: `object`):

Map outcome dataset field names to the actor's expectations. Example: { "won": "dealWon", "revenue": "closedRevenueUsd", "meetingBooked": "meetingBooked" }. Win-flag values like true/won/closed-won/y/yes/1 all match.

## `negativeRules` (type: `array`):

Array of penalty rules. Each rule deducts `penalty` (0-100) from the final score when the rule matches. Match types: `contains` (substring), `equals` (exact), `matches` (regex). Total penalty per lead is capped at 50 to prevent over-correction. Example: `[{"field":"email","contains":"gmail.com","penalty":15,"reason":"personal-email"}]`.
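
A sketch of how such rules could be evaluated. The three match types and the cap of 50 come from the description above; case-insensitive `contains`, flooring the result at 0, and treating a missing field as an empty string are assumptions.

```python
import re


def apply_negative_rules(lead, score, rules, cap=50):
    """Return (penalised_score, reasons) after applying matching rules."""
    total_penalty = 0
    reasons = []
    for rule in rules:
        value = str(lead.get(rule["field"], ""))
        if "contains" in rule:
            hit = rule["contains"].lower() in value.lower()
        elif "equals" in rule:
            hit = value == rule["equals"]
        elif "matches" in rule:
            hit = re.search(rule["matches"], value) is not None
        else:
            hit = False
        if hit:
            total_penalty += rule["penalty"]
            reasons.append(rule.get("reason", rule["field"]))
    # The combined penalty is capped so stacked rules cannot wipe out a lead.
    total_penalty = min(total_penalty, cap)
    return max(0, score - total_penalty), reasons
```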

## `freshnessConfig` (type: `object`):

Penalise stale records. `dateField` (auto-detected from common date fields like lastVerifiedAt / scrapedAt if blank), `decayAfterDays` (default 90 — penalty starts beyond this), `maxPenalty` (default 25 — cap on penalty). Output: `freshness: { status, ageDays, scorePenalty, recommendedAction }`.

## `enableAccountRollup` (type: `boolean`):

When enabled, the actor groups leads by canonical domain and emits `accountReadiness[]` in the run summary — per-account: contacts found, decision-makers, champions, coverage (single-thread / multi-threaded / no-coverage), readiness (sales-ready / developing / cold). Useful for ABM / buying-committee workflows where account-level signal matters more than per-lead.

## `enableSavingsReport` (type: `boolean`):

When enabled (auto-on when `constraints` is set), the run summary includes a `savings` block reporting avoided cost from excluded leads — leads skipped, enrichment-cost avoided, SDR touches avoided, total spend avoided. Proves the actor's value as a resource-allocator.

## `targetIndustries` (type: `array`):

Industries that match your Ideal Customer Profile. Examples: 'Marketing Agency', 'SEO', 'Web Design', 'SaaS', 'Ecommerce'. Fuzzy matching and synonyms are applied automatically.

## `targetCompanySizes` (type: `array`):

Employee count bands that match your ICP. Use standard bands: '1-10', '11-50', '51-200', '201-500', '501-1000', '1001-5000'. Aliases like 'small', 'mid-market', 'enterprise' are also accepted.

## `targetServices` (type: `array`):

Services your ideal clients offer or use. Examples: 'SEO', 'PPC', 'Web Design', 'Content Marketing'. Used to check lead.services field.

## `targetTechStack` (type: `array`):

Technologies your ideal clients use. Examples: 'HubSpot', 'Shopify', 'WordPress', 'Salesforce'. Used to check lead.techStack field.

## `weightIndustry` (type: `integer`):

Override the preset's industry weight. Leave blank to use the resolved preset.

## `weightCompanySize` (type: `integer`):

Override the preset's company-size weight.

## `weightServices` (type: `integer`):

Override the preset's services weight.

## `weightContactPresence` (type: `integer`):

Override the preset's contact-presence weight.

## `weightIntentSignals` (type: `integer`):

Override the preset's intent-signals weight.

## `weightDataCompleteness` (type: `integer`):

Override the preset's data-completeness weight.

## `minScoreToInclude` (type: `integer`):

Leads with an ICP score below this threshold are excluded from the output dataset entirely. This filter runs BEFORE charging — filtered leads are never pushed and never charged.

## `outputSortedByScore` (type: `boolean`):

When enabled, the output dataset is sorted by icpScore descending so the best leads appear first.

## `maxLeads` (type: `integer`):

Safety cap on the total number of leads processed. Prevents runaway costs when datasetId points to a very large dataset. Default: 10000.

## Actor input object example

```json
{
  "leads": [
    {
      "domain": "brightedge.com",
      "companyName": "BrightEdge",
      "industry": "Marketing Agency",
      "services": [
        "SEO",
        "Content Marketing",
        "Analytics"
      ],
      "companySize": "51-200",
      "emails": [
        "hello@brightedge.com"
      ],
      "contacts": [
        {
          "name": "Sarah Chen",
          "title": "Head of SEO",
          "email": "s.chen@brightedge.com"
        }
      ],
      "phones": [
        "+1 415-555-0182"
      ],
      "address": "1 Market St, San Francisco, CA",
      "rating": 4.7,
      "reviewCount": 143,
      "hasChatWidget": true,
      "hasContactForm": true,
      "techStack": [
        "HubSpot",
        "Google Analytics",
        "Salesforce"
      ],
      "foundedYear": 2011,
      "description": "Enterprise SEO and content performance platform for B2B companies."
    }
  ],
  "goal": "generic",
  "mode": "auto",
  "persona": "generic",
  "outputProfile": "standard",
  "qualifyThreshold": 65,
  "disqualifyThreshold": 35,
  "csvExport": true,
  "watchlistName": "q4-pipeline",
  "enableIcpInsights": false,
  "enableDedup": false,
  "enableEconomics": false,
  "industryDealSizeOverrides": {
    "SaaS": 25000,
    "Marketing Agency": 12000
  },
  "constraints": {
    "maxOutreachPerRun": 50,
    "budgetUsd": 200
  },
  "simulate": {
    "weightIndustry": 30,
    "weightIntentSignals": 25
  },
  "scorecardTemplate": "custom",
  "outcomeJoinKey": "domain",
  "outcomeFields": {
    "won": "dealWon",
    "revenue": "closedRevenueUsd"
  },
  "negativeRules": [
    {
      "field": "email",
      "contains": "gmail.com",
      "penalty": 15,
      "reason": "personal-email-domain"
    },
    {
      "field": "industry",
      "contains": "student",
      "penalty": 30,
      "reason": "student-org"
    }
  ],
  "freshnessConfig": {
    "dateField": "lastVerifiedAt",
    "decayAfterDays": 90,
    "maxPenalty": 25
  },
  "enableAccountRollup": false,
  "targetIndustries": [
    "Marketing Agency",
    "Digital Agency"
  ],
  "targetCompanySizes": [
    "11-50",
    "51-200"
  ],
  "targetServices": [
    "SEO",
    "Content Marketing"
  ],
  "targetTechStack": [
    "HubSpot",
    "Google Analytics"
  ],
  "minScoreToInclude": 0,
  "outputSortedByScore": true,
  "maxLeads": 10000
}
```
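
The `qualifyThreshold` and `disqualifyThreshold` values in the input above split the 0-100 score into three decisions: a score at or above the qualify threshold is `qualify`, a score below the disqualify threshold is `disqualify`, and anything in between is `nurture`. The sketch below only illustrates that documented rule; `decide` is a hypothetical helper name and is not taken from the Actor's source code.

```python
def decide(score: float, qualify_threshold: int = 65, disqualify_threshold: int = 35) -> str:
    """Map a 0-100 score to a decision using the documented threshold rule."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if disqualify_threshold > qualify_threshold:
        raise ValueError("disqualify_threshold must not exceed qualify_threshold")
    if score >= qualify_threshold:
        # At or above the qualify threshold
        return "qualify"
    if score < disqualify_threshold:
        # Strictly below the disqualify threshold
        return "disqualify"
    # Everything between the two thresholds
    return "nurture"


for sample in (80, 65, 50, 35, 20):
    print(sample, decide(sample))
```

Note that both boundaries fall on the more favourable side: a score equal to `qualifyThreshold` qualifies, and a score equal to `disqualifyThreshold` is nurtured rather than disqualified.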

# API

You can run this Actor programmatically through the Apify API. Below are code examples for JavaScript, Python, and the CLI, followed by the MCP server setup and the OpenAPI specification.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "leads": [
        {
            "domain": "brightedge.com",
            "companyName": "BrightEdge",
            "industry": "Marketing Agency",
            "services": [
                "SEO",
                "Content Marketing",
                "Analytics"
            ],
            "companySize": "51-200",
            "emails": [
                "hello@brightedge.com"
            ],
            "contacts": [
                {
                    "name": "Sarah Chen",
                    "title": "Head of SEO",
                    "email": "s.chen@brightedge.com"
                }
            ],
            "phones": [
                "+1 415-555-0182"
            ],
            "address": "1 Market St, San Francisco, CA",
            "rating": 4.7,
            "reviewCount": 143,
            "hasChatWidget": true,
            "hasContactForm": true,
            "techStack": [
                "HubSpot",
                "Google Analytics",
                "Salesforce"
            ],
            "foundedYear": 2011,
            "description": "Enterprise SEO and content performance platform for B2B companies."
        }
    ],
    "goal": "generic",
    "mode": "auto",
    "persona": "generic",
    "outputProfile": "standard",
    "scorecardTemplate": "custom",
    "targetIndustries": [
        "Marketing Agency",
        "Digital Agency"
    ],
    "targetCompanySizes": [
        "11-50",
        "51-200"
    ],
    "targetServices": [
        "SEO",
        "Content Marketing"
    ],
    "targetTechStack": [
        "HubSpot",
        "Google Analytics"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("ryanclinton/lead-scoring-engine").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "leads": [
        {
            "domain": "brightedge.com",
            "companyName": "BrightEdge",
            "industry": "Marketing Agency",
            "services": [
                "SEO",
                "Content Marketing",
                "Analytics",
            ],
            "companySize": "51-200",
            "emails": ["hello@brightedge.com"],
            "contacts": [
                {
                    "name": "Sarah Chen",
                    "title": "Head of SEO",
                    "email": "s.chen@brightedge.com",
                },
            ],
            "phones": ["+1 415-555-0182"],
            "address": "1 Market St, San Francisco, CA",
            "rating": 4.7,
            "reviewCount": 143,
            "hasChatWidget": True,
            "hasContactForm": True,
            "techStack": [
                "HubSpot",
                "Google Analytics",
                "Salesforce",
            ],
            "foundedYear": 2011,
            "description": "Enterprise SEO and content performance platform for B2B companies.",
        },
    ],
    "goal": "generic",
    "mode": "auto",
    "persona": "generic",
    "outputProfile": "standard",
    "scorecardTemplate": "custom",
    "targetIndustries": [
        "Marketing Agency",
        "Digital Agency",
    ],
    "targetCompanySizes": [
        "11-50",
        "51-200",
    ],
    "targetServices": [
        "SEO",
        "Content Marketing",
    ],
    "targetTechStack": [
        "HubSpot",
        "Google Analytics",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("ryanclinton/lead-scoring-engine").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
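
The input schema accepts leads either inline (`leads`) or from an upstream dataset (`datasetId`), and asks for one or the other, not both. A minimal sketch of a guard for that rule follows; `build_run_input` is a hypothetical helper, and the remaining keyword arguments are passed through unchanged as ICP settings such as `targetIndustries`.

```python
from __future__ import annotations

from typing import Any, Optional


def build_run_input(
    leads: Optional[list[dict[str, Any]]] = None,
    dataset_id: Optional[str] = None,
    **icp_settings: Any,
) -> dict[str, Any]:
    """Build an Actor input that carries exactly one lead source."""
    # Reject both "neither set" and "both set"
    if (leads is None) == (dataset_id is None):
        raise ValueError("Provide exactly one of 'leads' or 'dataset_id'")

    run_input: dict[str, Any] = dict(icp_settings)
    if leads is not None:
        run_input["leads"] = leads
    else:
        run_input["datasetId"] = dataset_id
    return run_input


# Chaining case: score leads that an upstream Actor wrote to a dataset.
# "<UPSTREAM_DATASET_ID>" is a placeholder, not a real dataset ID.
print(build_run_input(dataset_id="<UPSTREAM_DATASET_ID>", targetIndustries=["Marketing Agency"]))
```

The returned dictionary can be passed as `run_input` to `client.actor("ryanclinton/lead-scoring-engine").call(...)` exactly as in the example above.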

## CLI example

```bash
echo '{
  "leads": [
    {
      "domain": "brightedge.com",
      "companyName": "BrightEdge",
      "industry": "Marketing Agency",
      "services": [
        "SEO",
        "Content Marketing",
        "Analytics"
      ],
      "companySize": "51-200",
      "emails": [
        "hello@brightedge.com"
      ],
      "contacts": [
        {
          "name": "Sarah Chen",
          "title": "Head of SEO",
          "email": "s.chen@brightedge.com"
        }
      ],
      "phones": [
        "+1 415-555-0182"
      ],
      "address": "1 Market St, San Francisco, CA",
      "rating": 4.7,
      "reviewCount": 143,
      "hasChatWidget": true,
      "hasContactForm": true,
      "techStack": [
        "HubSpot",
        "Google Analytics",
        "Salesforce"
      ],
      "foundedYear": 2011,
      "description": "Enterprise SEO and content performance platform for B2B companies."
    }
  ],
  "goal": "generic",
  "mode": "auto",
  "persona": "generic",
  "outputProfile": "standard",
  "scorecardTemplate": "custom",
  "targetIndustries": [
    "Marketing Agency",
    "Digital Agency"
  ],
  "targetCompanySizes": [
    "11-50",
    "51-200"
  ],
  "targetServices": [
    "SEO",
    "Content Marketing"
  ],
  "targetTechStack": [
    "HubSpot",
    "Google Analytics"
  ]
}' |
apify call ryanclinton/lead-scoring-engine --silent --output-dataset

```
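
If neither a client library nor the CLI is available, the same run can be started over plain HTTP. The sketch below builds a `POST` request for the `run-sync-get-dataset-items` path listed in the OpenAPI specification in this document, with the token passed as the `token` query parameter. It uses only the Python standard library; `build_sync_request` is a hypothetical helper name, and the request is constructed but deliberately not sent.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "ryanclinton~lead-scoring-engine"  # "/" in the Actor name becomes "~" in API paths


def build_sync_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a run-sync-get-dataset-items request."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"leads": [{"domain": "brightedge.com"}]})
print(request.get_method(), request.full_url)

# To actually run the Actor, send the request with a real token:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

This endpoint waits for the run to finish before responding, so it suits small batches; for long runs, the `/runs` path in the specification starts the run and returns run information instead.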

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ryanclinton/lead-scoring-engine",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Lead Scoring Engine — ICP Score Leads 0-100",
        "description": "Score leads 0-100 against your Ideal Customer Profile across 6 weighted dimensions: industry, company size, services, contact presence, intent signals, and data completeness. Returns A-F grades + per-dimension notes. No API calls. $0.03/lead.",
        "version": "1.4",
        "x-build-id": "3g6l4FIqf4FcHoLgn"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ryanclinton~lead-scoring-engine/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ryanclinton-lead-scoring-engine",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns the Actor's dataset items in the response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ryanclinton~lead-scoring-engine/runs": {
            "post": {
                "operationId": "runs-sync-ryanclinton-lead-scoring-engine",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ryanclinton~lead-scoring-engine/run-sync": {
            "post": {
                "operationId": "run-sync-ryanclinton-lead-scoring-engine",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "leads": {
                        "title": "Leads (inline)",
                        "maxItems": 100000,
                        "type": "array",
                        "description": "Array of lead objects to score. Each object should have fields like domain, companyName, industry, services, emails, contacts, etc. Use this OR datasetId — not both.",
                        "default": [
                            {
                                "domain": "brightedge.com",
                                "companyName": "BrightEdge",
                                "industry": "Marketing Agency",
                                "services": [
                                    "SEO",
                                    "Content Marketing",
                                    "Analytics"
                                ],
                                "companySize": "51-200",
                                "emails": [
                                    "hello@brightedge.com"
                                ],
                                "contacts": [
                                    {
                                        "name": "Sarah Chen",
                                        "title": "Head of SEO",
                                        "email": "s.chen@brightedge.com"
                                    }
                                ],
                                "phones": [
                                    "+1 415-555-0182"
                                ],
                                "address": "1 Market St, San Francisco, CA",
                                "rating": 4.7,
                                "reviewCount": 143,
                                "hasChatWidget": true,
                                "hasContactForm": true,
                                "techStack": [
                                    "HubSpot",
                                    "Google Analytics",
                                    "Salesforce"
                                ],
                                "foundedYear": 2011,
                                "description": "Enterprise SEO and content performance platform for B2B companies."
                            }
                        ]
                    },
                    "datasetId": {
                        "title": "Dataset ID (from upstream actor)",
                        "type": "string",
                        "description": "Apify dataset ID to load leads from. Use this instead of inline leads when chaining with another actor in a pipeline."
                    },
                    "goal": {
                        "title": "Goal",
                        "enum": [
                            "generic",
                            "pipeline-growth",
                            "quick-wins",
                            "cost-efficiency",
                            "high-ltv"
                        ],
                        "type": "string",
                        "description": "What outcome to optimise for. 'pipeline-growth' accepts more leads (B/C also pass). 'quick-wins' biases hard toward intent + reachable contacts. 'cost-efficiency' penalises enrichment-required leads. 'high-ltv' biases enterprise + strong ICP. 'generic' = no goal bias. Goal layers ON TOP of mode + persona — per-dimension weights still win.",
                        "default": "generic"
                    },
                    "mode": {
                        "title": "Scoring Mode",
                        "enum": [
                            "auto",
                            "fast",
                            "balanced",
                            "thorough"
                        ],
                        "type": "string",
                        "description": "Preset that shapes HOW scoring runs. 'auto' picks based on cohort size + data richness. 'fast' weights industry + size, ignores completeness. 'balanced' is the default 25/20/20/15/10/10 mix. 'thorough' up-weights intent + completeness for fully-enriched cohorts. Per-dimension weights below override the preset.",
                        "default": "auto"
                    },
                    "persona": {
                        "title": "Persona",
                        "enum": [
                            "generic",
                            "outbound-sdr",
                            "account-exec",
                            "growth-marketer"
                        ],
                        "type": "string",
                        "description": "Preset that shapes WHO scoring serves. SDRs need reachable contacts; AEs need fit before reach; Growth marketers prioritise intent + recency. 'generic' defers to mode preset.",
                        "default": "generic"
                    },
                    "outputProfile": {
                        "title": "Output Profile",
                        "enum": [
                            "minimal",
                            "standard",
                            "full",
                            "llm"
                        ],
                        "type": "string",
                        "description": "Controls how much detail goes into each pushed record. 'minimal' = decision + action only. 'standard' = decision + factors + notes (drops scoringTrace). 'full' = everything. 'llm' = optimised for AI agents (summary, why, opening angle, scoring trace).",
                        "default": "standard"
                    },
                    "qualifyThreshold": {
                        "title": "Qualify Threshold",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Score at or above this becomes decision='qualify' (default 65 = grade B+). Tune for your team's bandwidth.",
                        "default": 65
                    },
                    "disqualifyThreshold": {
                        "title": "Disqualify Threshold",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Score below this becomes decision='disqualify' (default 35 = grade D-). Leads between disqualify and qualify thresholds get decision='nurture'.",
                        "default": 35
                    },
                    "csvExport": {
                        "title": "Write CSV to Key-Value Store",
                        "type": "boolean",
                        "description": "When enabled, writes OUTPUT.csv with Apollo / Outreach.io / Salesloft compatible columns to the run's default Key-Value Store. Download from the Storage tab.",
                        "default": true
                    },
                    "watchlistName": {
                        "title": "Watchlist Name (Temporal Intelligence)",
                        "type": "string",
                        "description": "Set to enable cross-run trend tracking. Two runs with the same watchlistName attach `temporalSignals` (trend / momentumScore / scoreDelta / runsSeen / reengage flag) to each lead by canonical eventId. First run shows trend='new' for everything; subsequent runs surface rising/falling/re-engagement leads. Stored in a named KV store, capped at 25k leads FIFO."
                    },
                    "monitorStateKey": {
                        "title": "Monitor State Key (alias for watchlistName)",
                        "type": "string",
                        "description": "Suite-aligned alias for watchlistName. Either input works; if both are set, watchlistName wins. Lets the same upstream orchestrator pass one consistent field name across lead-scoring-engine, waterfall-contact-enrichment, phone-number-finder, bulk-email-verifier, company-deep-research, and lead-enrichment-pipeline."
                    },
                    "lastAction": {
                        "title": "Last Action (closes the feedback loop)",
                        "type": "object",
                        "description": "Optional. Tells the Actor what action you took on this watchlist since the last run. On the next scheduled run, the Actor compares the current ICP score against the snapshot taken at action time and emits decisionMemory with an inferred outcome. Caveat: only a change in signals is observable; direct conversion, closed deals, and off-platform engagement are not. Shape: { type: 'sent-pitch' | 'qualified' | 'disqualified' | string, takenAt: ISO date, note?: string }. Requires watchlistName or monitorStateKey."
                    },
                    "enableIcpInsights": {
                        "title": "Enable Passive ICP Insights",
                        "type": "boolean",
                        "description": "Detect ICP-drift from the current run's top performers (grade A leads) and surface industries/sizes that aren't in your declared ICP. Adds an `icpInsights` block to the summary record with `topIndustries`, `topCompanySizes`, `topServices`, and an `icpVsTopPerformerDrift.suggestion` string. Pure compute on this run — no cross-run state needed.",
                        "default": false
                    },
                    "enableDedup": {
                        "title": "Enable Same-Run Deduplication",
                        "type": "boolean",
                        "description": "Detect duplicate leads (by canonical domain) within this run's input. Each duplicate gets an `identity` block with `canonicalDomain`, `duplicateCount`, `duplicateRunIndices`, and `isCanonical`. The first occurrence is treated as canonical; later ones are flagged. Does not skip duplicates — flags them so you can choose how to merge upstream.",
                        "default": false
                    },
                    "enableEconomics": {
                        "title": "Enable ROI / Expected-Value Engine",
                        "type": "boolean",
                        "description": "Compute expectedValue per lead: conversion-probability proxy (from icpScore × intent × contact richness) × estimated deal size (industry × size proxy table) ÷ cost-to-act (enrichment + verification + SDR labour). Output: `expectedValue.expectedRoi`, `expectedRevenueUsd`, `costToActUsd`, plus `actionDecision: act|delay|ignore` driven by ROI. Industry deal-size proxies are conservative midpoints from public B2B benchmarks — override with `industryDealSizeOverrides` for accuracy.",
                        "default": false
                    },
                    "industryDealSizeOverrides": {
                        "title": "Industry Deal-Size Overrides (USD)",
                        "type": "object",
                        "description": "Per-industry estimated deal size in USD. User-supplied overrides win against the proxy table. Example: { \"SaaS\": 25000, \"Marketing Agency\": 12000, \"Real Estate\": 8000 }. Applies size multiplier (small × 0.4, mid × 1.0, enterprise × 5.0) on top."
                    },
                    "sdrCostPerTouch": {
                        "title": "SDR Cost Per Touchpoint (USD)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the default SDR labour cost per outreach touch. Default: $5 (12 minutes at $25/hr fully-loaded). Used in cost-to-act computation."
                    },
                    "constraints": {
                        "title": "Run-Level Constraints (Allocation)",
                        "type": "object",
                        "description": "Resource limits for this run. When set, the actor sorts leads by ROI and selects the top set within constraints. Each lead gets `allocationDecision: { selected, reason, excludedDueTo, rankInAllocation }`. Example: { \"maxOutreachPerRun\": 50, \"maxEnrichmentPerRun\": 100, \"budgetUsd\": 200 }."
                    },
                    "simulate": {
                        "title": "Simulation Mode (Override Weights)",
                        "type": "object",
                        "description": "When set, the actor scores every lead twice — once with current weights, once with override weights — and emits a `simulation` block per lead showing the score delta and decision change. Use this to test ICP hypotheses without re-running. Example: { \"weightIndustry\": 30, \"weightIntentSignals\": 25 }. Doubles compute time but does NOT double PPE charges (you only pay for the primary score)."
                    },
                    "scorecardTemplate": {
                        "title": "Scorecard Template (GTM Motion)",
                        "enum": [
                            "custom",
                            "local-agency-outbound",
                            "b2b-saas-abm",
                            "ecommerce-services",
                            "recruiter-sourcing"
                        ],
                        "type": "string",
                        "description": "Pre-built configuration bundle for a common GTM motion. 'local-agency-outbound' (SMB/mid-market agencies, balanced fit). 'b2b-saas-abm' (enterprise SaaS, AE-led, high-LTV bias, personal-email penalties). 'ecommerce-services' (DTC brands, growth-marketer persona). 'recruiter-sourcing' (intent-heavy for actively-hiring companies). 'custom' (no template — your inputs win). Template fields fill in only where you haven't set them.",
                        "default": "custom"
                    },
                    "outcomeDatasetId": {
                        "title": "Outcome Dataset ID (Validate Scoring)",
                        "type": "string",
                        "description": "Apify dataset ID containing past lead outcomes (won deals, meetings booked, revenue). The actor joins your scored leads against this dataset on `outcomeJoinKey` (default 'domain') and computes win rate per grade — proves whether the score is actually predictive. Without this, calibration uses benchmark priors only."
                    },
                    "outcomeJoinKey": {
                        "title": "Outcome Join Key",
                        "type": "string",
                        "description": "Field on outcome dataset records used to match against scored leads. Defaults to 'domain'. Both sides are normalised to canonical lowercase domain.",
                        "default": "domain"
                    },
                    "outcomeFields": {
                        "title": "Outcome Field Names",
                        "type": "object",
                        "description": "Map outcome dataset field names to the actor's expectations. Example: { \"won\": \"dealWon\", \"revenue\": \"closedRevenueUsd\", \"meetingBooked\": \"meetingBooked\" }. Win-flag values like true/won/closed-won/y/yes/1 all match."
                    },
                    "negativeRules": {
                        "title": "Negative Scoring Rules",
                        "type": "array",
                        "description": "Array of penalty rules. Each rule deducts `penalty` (0-100) from the final score when the rule matches. Match types: `contains` (substring), `equals` (exact), `matches` (regex). Total penalty per lead is capped at 50 to prevent over-correction. Example: [{\"field\":\"email\",\"contains\":\"gmail.com\",\"penalty\":15,\"reason\":\"personal-email\"}]."
                    },
                    "freshnessConfig": {
                        "title": "Freshness Decay Config",
                        "type": "object",
                        "description": "Penalise stale records. `dateField` (auto-detected from common date fields like lastVerifiedAt / scrapedAt if blank), `decayAfterDays` (default 90 — penalty starts beyond this), `maxPenalty` (default 25 — cap on penalty). Output: `freshness: { status, ageDays, scorePenalty, recommendedAction }`."
                    },
                    "enableAccountRollup": {
                        "title": "Enable Account-Level Rollup",
                        "type": "boolean",
                        "description": "When enabled, the actor groups leads by canonical domain and emits `accountReadiness[]` in the run summary — per-account: contacts found, decision-makers, champions, coverage (single-thread / multi-threaded / no-coverage), readiness (sales-ready / developing / cold). Useful for ABM / buying-committee workflows where account-level signal matters more than per-lead.",
                        "default": false
                    },
                    "enableSavingsReport": {
                        "title": "Enable Savings Report",
                        "type": "boolean",
                        "description": "When enabled (auto-on when `constraints` is set), the run summary includes a `savings` block reporting avoided cost from excluded leads — leads skipped, enrichment-cost avoided, SDR touches avoided, total spend avoided. Proves the actor's value as a resource-allocator."
                    },
                    "targetIndustries": {
                        "title": "Target Industries",
                        "type": "array",
                        "description": "Industries that match your Ideal Customer Profile. Examples: 'Marketing Agency', 'SEO', 'Web Design', 'SaaS', 'Ecommerce'. Fuzzy matching and synonyms are applied automatically.",
                        "default": [
                            "Marketing Agency",
                            "Digital Agency"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "targetCompanySizes": {
                        "title": "Target Company Sizes",
                        "type": "array",
                        "description": "Employee count bands that match your ICP. Use standard bands: '1-10', '11-50', '51-200', '201-500', '501-1000', '1001-5000'. Aliases like 'small', 'mid-market', 'enterprise' are also accepted.",
                        "default": [
                            "11-50",
                            "51-200"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "targetServices": {
                        "title": "Target Services",
                        "type": "array",
                        "description": "Services your ideal clients offer or use. Examples: 'SEO', 'PPC', 'Web Design', 'Content Marketing'. Used to check lead.services field.",
                        "default": [
                            "SEO",
                            "Content Marketing"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "targetTechStack": {
                        "title": "Target Tech Stack",
                        "type": "array",
                        "description": "Technologies your ideal clients use. Examples: 'HubSpot', 'Shopify', 'WordPress', 'Salesforce'. Used to check lead.techStack field.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "weightIndustry": {
                        "title": "Weight: Industry Match (override)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the preset's industry weight. Leave blank to use the resolved preset."
                    },
                    "weightCompanySize": {
                        "title": "Weight: Company Size Match (override)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the preset's company-size weight. Leave blank to use the resolved preset."
                    },
                    "weightServices": {
                        "title": "Weight: Services Match (override)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the preset's services weight. Leave blank to use the resolved preset."
                    },
                    "weightContactPresence": {
                        "title": "Weight: Contact Presence (override)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the preset's contact-presence weight. Leave blank to use the resolved preset."
                    },
                    "weightIntentSignals": {
                        "title": "Weight: Intent Signals (override)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the preset's intent-signals weight. Leave blank to use the resolved preset."
                    },
                    "weightDataCompleteness": {
                        "title": "Weight: Data Completeness (override)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Override the preset's data-completeness weight. Leave blank to use the resolved preset."
                    },
                    "minScoreToInclude": {
                        "title": "Minimum Score to Include",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Leads with an ICP score below this threshold are excluded from the output dataset entirely. This filter runs before charging, so filtered leads are never pushed and never charged.",
                        "default": 0
                    },
                    "outputSortedByScore": {
                        "title": "Sort Output by Score (Highest First)",
                        "type": "boolean",
                        "description": "When enabled, the output dataset is sorted by icpScore descending so the best leads appear first.",
                        "default": true
                    },
                    "maxLeads": {
                        "title": "Maximum Leads to Score",
                        "minimum": 1,
                        "maximum": 100000,
                        "type": "integer",
                        "description": "Safety cap on the total number of leads processed. Prevents runaway costs when datasetId points to a very large dataset. Default: 10000.",
                        "default": 10000
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
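
The input schema above puts numeric bounds on the weight overrides (0-100), `minScoreToInclude` (0-100), and `maxLeads` (1-100000), and types `targetServices` and `targetTechStack` as arrays of strings. A minimal sketch of checking an input object against those bounds locally, before starting a run, might look like the following. The helper name `validate_run_input` and the error message wording are hypothetical and not part of the Actor or the Apify client. Only the fields visible in this part of the schema are covered, and the Actor may apply its own validation on top.

```python
from typing import Any

# (minimum, maximum) pairs copied from the input schema above.
INTEGER_BOUNDS: dict[str, tuple[int, int]] = {
    "weightIndustry": (0, 100),
    "weightCompanySize": (0, 100),
    "weightServices": (0, 100),
    "weightContactPresence": (0, 100),
    "weightIntentSignals": (0, 100),
    "weightDataCompleteness": (0, 100),
    "minScoreToInclude": (0, 100),
    "maxLeads": (1, 100000),
}

# Fields the schema types as arrays of strings.
STRING_ARRAY_FIELDS = ("targetServices", "targetTechStack")


def validate_run_input(run_input: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if none were found."""
    problems: list[str] = []

    for name, (low, high) in INTEGER_BOUNDS.items():
        if name not in run_input:
            continue  # every field checked here is optional
        value = run_input[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name}: expected integer, got {type(value).__name__}")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low}-{high}")

    for name in STRING_ARRAY_FIELDS:
        value = run_input.get(name, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{name}: expected an array of strings")

    sort_flag = run_input.get("outputSortedByScore", True)
    if not isinstance(sort_flag, bool):
        problems.append("outputSortedByScore: expected boolean")

    return problems


example_input = {
    "targetServices": ["SEO", "Content Marketing"],
    "weightIndustry": 40,
    "minScoreToInclude": 60,
    "maxLeads": 500,
}
print(validate_run_input(example_input))
print(validate_run_input({"weightServices": 150, "maxLeads": 0}))
```

The first call prints an empty list. The second reports both out-of-range values, which is cheaper to catch locally than after a run has started.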