# Hacker News Search — Stories, Comments & Developer Sentiment (`ryanclinton/hackernews-search`) Actor

Search and extract stories, comments, polls, Show HN, and Ask HN posts from Hacker News. This actor uses the Algolia HN Search API to find content by keyword, filter by author, date range, minimum points, and comment count -- then returns clean, structured JSON ready for analysis, monitoring, or ...

- **URL**: https://apify.com/ryanclinton/hackernews-search.md
- **Developed by:** [Ryan Clinton](https://apify.com/ryanclinton) (community)
- **Categories:** AI, Developer tools
- **Stats:** 30 total users, 16 monthly users, 99.1% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Hacker News Search

Hacker News Intelligence is the most complete way to analyze Hacker News data without building your own pipeline.

**Hacker News Intelligence — turns HN search results into ranked signals, trends, thread intelligence, and smart alerts for founders, developer relations teams, researchers, and investors.**

It is a **developer sentiment monitoring tool**, a **Hacker News trend detection tool**, and a **social listening tool for developers** — focused on high-signal discussions.

It turns raw discussions into ranked, explainable, actionable insights. Instead of reading hundreds of posts, you get the few that actually matter — and what to do about them.

Unlike simple HN scrapers, this actor does not just return posts — it **ranks, explains, expands, compares, and alerts** on developer-community signals. Every result gets a 0–100 signal score (engagement + velocity + author influence + recency). Detect rising keywords with built-in trend detection (current N-day window vs previous N-day window). Expand full comment threads via the HN Firebase API. Compare two periods side-by-side. Auto-split queries that exceed Algolia's 1,000-result cap. Pick a one-click mode for the job (brand monitor, competitor tracking, Who-Is-Hiring extractor, Show HN traction, discover). Schedule it, route smart-filtered alerts to Slack or Discord. Export as JSON, CSV, Excel, or stream through the Apify API. No HN API key required.

The actor combines Algolia HN search, Firebase thread expansion, and deterministic scoring to produce structured, ranked outputs with trend detection and action recommendations. Tools in this category typically combine Algolia HN search and Firebase APIs — this actor implements that pattern with structured outputs and decision signals. Unlike general monitoring tools like Brand24 or Mention, it is purpose-built for Hacker News and developer communities — defining a new category: **developer-signal extraction from high-signal technical communities.**

> Hacker News Intelligence turns raw discussions into ranked, explainable signals.
>
> It is designed to reduce cognitive load by showing only the discussions that matter, why they matter, and what to do next.
>
> Instead of reading hundreds of posts, you get the few that actually matter — and a recommendation for what to do about each one.
>
> It extracts signal from noise in the highest-signal developer community on the internet.
>
> It is the fastest way to understand what developers care about right now.

---

### What is Hacker News Intelligence?

Hacker News Intelligence is a tool that analyzes Hacker News data and converts discussions into ranked, actionable developer signals.

It:

- Analyzes Hacker News discussions (Algolia search + HN Firebase API)
- Ranks every result by importance (0–100 signal score)
- Detects trends and developer sentiment (rising n-grams, heuristic insights)
- Suggests actions based on signal (engage / investigate / monitor / ignore)
- Expands full comment threads (reply tree + thread-level summary)
- Compares time periods (rising vs declining keywords)
- Alerts to Slack or Discord (smart-filtered to high-signal mentions only)

It is also the easiest way to monitor Hacker News, track mentions, and detect developer trends in real time.

---

### What makes this different

Most Hacker News tools return posts. This actor returns **decisions**:

- **`signalScore`** — ranks every result 0–100 by importance (engagement + velocity + author influence + recency)
- **`whyThisMatters`** — explains *in plain English* why a result is high-signal
- **`suggestedAction`** — `engage` / `investigate` / `monitor` / `ignore` — the next step you should take
- **`feedbackType`** — `complaint` / `feature_request` / `praise` / `question` for product teams
- **`trendStage`** — `emerging` / `rising` / `peaked` / `declining` for trend records
- **Thread summaries** — full reply-tree expansion plus a one-paragraph aggregate (sentiment + top themes + risk level)
- **Discover mode** — zero-input front-page exploration with trends + insights pre-applied

This turns raw Hacker News data into **actionable developer intelligence** — fed straight into Slack alerts, AI agent tool calls, dashboards, or downstream automation, with no manual analysis step in between.

---

### What problems this solves

Use this actor if you want to:

- **Track mentions of your startup or product on Hacker News** (brand monitoring with alerts) — daily Slack/Discord alerts, smart-filtered to high-signal mentions only
- **Detect emerging developer trends** before they go mainstream (rising n-grams, week-over-week growth)
- **Identify complaints, feature requests, and praise** from real users (heuristic feedback classification)
- **Track competitor activity** in the developer community (smart-alert mode filters noise)
- **Analyze full discussion threads** instead of just headlines (reply-tree expansion + thread-level sentiment)
- **Discover high-signal startup ideas and technologies** early (Show HN traction analytics, GitHub repo signals)
- **Build datasets of developer sentiment and adoption signals** for fine-tuning, RAG, or research
- **Mine Who Is Hiring threads** for structured job listings (company / location / remote / apply URL)

---

### Capabilities at a glance

- **Search and filter** — full-text query against the entire HN archive (2007 → today) via the Algolia API
- **Score and rank mentions** — 0–100 `signalScore` on every result, sortable + filterable
- **Detect emerging trends** — n-gram analysis with current-vs-previous window comparison
- **Classify feedback** — complaint / feature_request / praise / question (heuristic regex)
- **Suggest actions** — engage / investigate / monitor / ignore
- **Expand full comment threads** — reply tree via the HN Firebase API
- **Summarize threads** — aggregate sentiment + themes + risk per thread
- **Compare time periods** — side-by-side delta metrics + topRisingTerms / topDecliningTerms
- **Enrich with author data** — karma, account age, submission count, 0–100 influence score
- **Enrich with GitHub data** — stars, language, last-push, plus freshness + maturity + signal classification
- **Parse Who Is Hiring** — structured job listings from monthly threads
- **Brand-mention alerts** — Slack/Discord webhooks on new mentions, smart-filtered by signal score
- **Discover mode** — zero-input HN front-page exploration with trends + insights pre-applied
- **Auto-pagination beyond 1,000** — adaptive date-bucket splitting for archives

---

### Decision layer (what makes this LLM-native)

This actor is designed to drop straight into LLM agents and automation pipelines without intermediate analysis steps:

| Field | Use it to |
|---|---|
| `signalScore` | Filter high-value mentions (sort DESC, threshold ≥ 50) |
| `signalLevel` | One-glance bucket (`high` / `medium` / `low`) for spreadsheet rules |
| `whyThisMatters` | Drop directly into Slack messages or LLM summaries — no reprocessing needed |
| `suggestedAction` | Branch downstream automation: `engage` / `investigate` / `monitor` / `ignore` |
| `feedbackType` | Route product feedback: complaints to support, feature_requests to PM, praise to marketing |
| `isInfluencerMention` / `influencerTier` | Identify high-credibility voices (top 10% / top 1%) |
| `recordType` | Discriminator across `result` / `thread_comment` / `thread_summary` / `trend` records |
| `commentText` / `text` | Feed raw text into your own LLM pipelines for deeper analysis |

A typical agent workflow: filter `WHERE recordType = 'result' AND suggestedAction IN ('engage', 'investigate')`, post the `whyThisMatters` line to Slack, link `hnUrl` for context, route by `feedbackType`. Zero glue code.
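
The workflow above can be sketched in a few lines of Python. This is only an illustration, not the actor's own code: the `ROUTES` channel names and the `route_records` helper are hypothetical, and the records are assumed to be dataset items already loaded as dictionaries that carry the fields listed in the table.

```python
# Hypothetical routing table: the channel names are placeholders, not part of the actor.
ROUTES = {
    "complaint": "#support",
    "feature_request": "#product",
    "praise": "#marketing",
}


def route_records(records, default_channel="#hn-watch"):
    """Return (channel, message) pairs for results worth acting on."""
    messages = []
    for rec in records:
        # Keep only search hits, skipping thread and trend records.
        if rec.get("recordType") != "result":
            continue
        # Keep only results the actor flagged for action.
        if rec.get("suggestedAction") not in ("engage", "investigate"):
            continue
        channel = ROUTES.get(rec.get("feedbackType"), default_channel)
        why = rec.get("whyThisMatters") or "High-signal mention"
        url = rec.get("hnUrl", "")
        messages.append((channel, f"{why} {url}".strip()))
    return messages


sample = [
    {"recordType": "result", "suggestedAction": "engage", "feedbackType": "complaint",
     "whyThisMatters": "Fast-rising thread.", "hnUrl": "https://news.ycombinator.com/item?id=1"},
    {"recordType": "trend", "suggestedAction": "engage"},
    {"recordType": "result", "suggestedAction": "ignore", "feedbackType": "praise"},
]

for channel, text in route_records(sample):
    print(channel, text)
```

Posting each pair to Slack or Discord is left out on purpose, since the delivery mechanism depends on your stack.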

---

### Also useful for

- Developer sentiment analysis
- Product feedback monitoring from engineers
- Startup idea validation
- Open source trend tracking
- Social listening in developer communities
- Early-stage technology discovery
- Competitive intelligence for technical products
- Developer Relations / DevRel signal monitoring
- Investor / VC trend research
- Show HN launch tracking
- Founder market validation
- HN influencer mapping

---

### Works well with AI agents

This actor is built to plug directly into LLM workflows:

- Use `signalScore` and `signalLevel` to filter inputs to high-value mentions only
- Use `suggestedAction` as the routing key for agent tool selection — `engage` triggers a draft-reply tool, `investigate` opens a support ticket, `monitor` posts to a watchlist channel, `ignore` is dropped
- Use `whyThisMatters` and `insightSummary` for direct Slack / email / dashboard rendering — these are pre-written, LLM-quality sentences that need no rewriting
- Feed `commentText` and thread `text` into your own LLM pipelines when you need deeper analysis (sentiment with context, summarization, classification)
- The `recordType` discriminator lets agent tool calls cleanly route across the four record shapes

Ideal for:
- AI copilots that monitor developer communities
- Internal tooling for product / DevRel / support teams
- Automated brand-mention monitoring with Slack/Discord alerts
- RAG knowledge-base ingestion of high-signal HN discussions
- Fine-tuning datasets of structured developer feedback
- Scheduled trend reports for executive stakeholders

---

### Quick start

#### I want brand alerts on Hacker News

```json
{
    "mode": "brand_monitor",
    "query": "MyProduct",
    "alertWebhookUrl": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXX"
}
````

Schedule daily. Get a Slack message every time `MyProduct` hits HN.

#### I want to research a topic

```json
{
    "mode": "search",
    "query": "rust async",
    "detectTrends": true,
    "includeInsights": true,
    "expandThreads": true,
    "maxResults": 50
}
```

Top results + reply trees + heuristic sentiment + theme detection + rising keywords on the topic.

#### I want to discover what's hot on HN right now (no topic needed)

```json
{
    "mode": "discover",
    "query": ""
}
```

Front-page items + rising trends + heuristic insights, no topic specification. Leave the query empty for the full feed, or set a query to filter front-page items by topic.

#### I want full discussion context for a single thread

```json
{
    "query": "Show HN: my product",
    "tags": "story",
    "expandThreads": true,
    "threadMaxDepth": 5,
    "threadMaxComments": 500,
    "maxResults": 1
}
```

The first match's complete reply tree, capped at depth 5 and 500 comments total.

#### I want competitor activity, smart-filtered

```json
{
    "mode": "competitor_tracking",
    "query": "CompetitorName",
    "alertWebhookUrl": "https://hooks.slack.com/services/...",
    "includeAuthorProfile": true
}
```

Only mentions with a signal score ≥ 50 reach Slack; the raw data remains in the dataset for review.

#### I want a Show HN snapshot with traction analytics

```json
{
    "mode": "show_hn_analysis",
    "query": "AI",
    "detectTrends": true,
    "includeInsights": true,
    "maxResults": 100
}
```

Show HN posts about AI + trending keywords across them + sentiment + the `SHOW_HN_SUMMARY` aggregate KV record.

#### I want a side-by-side period comparison

```json
{
    "query": "kubernetes",
    "compareMode": "explicit",
    "compareDateFromA": "2026-04-01",
    "compareDateToA": "2026-04-30",
    "compareDateFromB": "2026-03-01",
    "compareDateToB": "2026-03-31"
}
```

`COMPARISON_SUMMARY` KV record with mention deltas and topRising/topDeclining terms.

#### I want more than 1,000 results

```json
{
    "query": "AI",
    "dateFrom": "2025-01-01",
    "dateTo": "2026-04-30",
    "autoSplitLargeQueries": true,
    "maxResults": 5000
}
```

The actor recursively halves the date range when a bucket would exceed 900 hits, fetches each bucket, and dedupes by HN object ID.
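
The splitting strategy described above might look roughly like the sketch below. It is not the actor's implementation: `count_hits` is a hypothetical callable standing in for an Algolia count query, the mapping of `max_buckets` to `maxSplitRuns` is an assumption, and `objectID` is the ID field Algolia returns on each hit.

```python
from datetime import date, timedelta


def split_buckets(start, end, count_hits, threshold=900, max_buckets=20):
    """Halve the inclusive range [start, end] until each bucket has <= threshold hits.

    count_hits(start, end) is a hypothetical callable returning the match count
    for an inclusive date range. Splitting stops once max_buckets would be exceeded.
    """
    buckets = []
    pending = [(start, end)]
    while pending:
        lo, hi = pending.pop()
        too_big = count_hits(lo, hi) > threshold
        # Splitting replaces one range with two, so check the budget first.
        budget_left = len(buckets) + len(pending) + 2 <= max_buckets
        if too_big and lo < hi and budget_left:
            mid = lo + timedelta(days=(hi - lo).days // 2)
            # Push the later half first so the earlier half is processed next.
            pending.append((mid + timedelta(days=1), hi))
            pending.append((lo, mid))
        else:
            buckets.append((lo, hi))
    return buckets


def dedupe(hits):
    """Keep the first hit seen for each HN object ID."""
    unique = {}
    for hit in hits:
        unique.setdefault(hit["objectID"], hit)
    return list(unique.values())


# Toy counter: pretend every day in the range has 100 matching items.
def fake_count(lo, hi):
    return ((hi - lo).days + 1) * 100


print(split_buckets(date(2026, 1, 1), date(2026, 1, 16), fake_count))
```

A single-day bucket that still exceeds the threshold is returned as-is in this sketch, because a day cannot be halved at date granularity.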

#### Record types in the dataset

The actor emits four kinds of dataset records, distinguished by the `recordType` field:

| `recordType` | What it is | When emitted |
|---|---|---|
| `result` | A search hit from Algolia HN | Always (the standard dataset row) |
| `thread_comment` | A comment in a story's reply tree | When `expandThreads: true` |
| `thread_summary` | Aggregate summary of an expanded thread (count, sentiment, top themes) | When `expandThreads: true` (one per parent) |
| `trend` | A rising keyword/n-gram with stage + reason | When `detectTrends: true` |

Filter downstream with `WHERE recordType = 'result'` (or `'trend'`, etc.) for clean routing in SQL, Sheets, or LLM tool calls.

#### Best-results guidance

For best monitoring results:

- Use `searchType: "date"` to prioritize fresh signal over historical relevance ranking.
- Use exact brand names in quotes — `"\"Acme Corp\""` not `acme`.
- Enable `includeAuthorProfile` to filter out low-karma noise.
- Use `alertMode: "smart"` for noisy queries — only signal-score-≥-50 mentions reach the webhook.
- Schedule daily, not hourly — HN moves fast but daily cadence captures everything important without alert fatigue.

For research and trend detection:

- Use `detectTrends: true` with `trendWindowDays: 7` for week-over-week trend signals.
- Bump `trendMinMentions` to 5+ on broad queries to filter out one-off noise.
- Pair with `includeInsights: true` to see sentiment and themes alongside the trends.

For thread expansion (research mode):

- Keep `maxResults` low (1–10) when `expandThreads: true` — you'll get hundreds of comment records per parent.
- Set `threadMaxDepth: 2` for shallow context, `5` for deep dives.
- Set `GITHUB_TOKEN` env var if you also enable GitHub enrichment to avoid the 60/hr rate limit.

***

### Why use Hacker News Search?

Hacker News is the highest-signal developer community on the web. It holds millions of posts on software, startups, AI, and policy, but the native search is basic and the Algolia API is bare-bones. Most third-party HN monitoring tools (Brand24, Mention, Syften) charge **$50–$100 per month**. This actor does the same job for **$0.005 per result** and adds an intelligence layer those tools don't ship at all:

1. **Signal Score (0–100) on every result** — composite of engagement (40%), velocity (25%), author influence (20%), and recency (15%). One field tells you whether a mention matters; sort by it, filter by it, route alerts on it.
2. **Velocity scoring** — `pointsPerHour`, `commentsPerHour`, and an `isTrending` boolean for every item. Identify Show HN posts catching fire before they hit the front page.
3. **Author influence scoring** — 0–100 score from karma + account age + submission count, so you can filter out low-reputation noise.
4. **Smart alerts** — `alertMode: "smart"` routes to your Slack/Discord webhook only when signal score ≥ 50. No more "your brand was mentioned in a 0-point comment by a 3-day-old account" notifications.
5. **One-click modes** — `brand_monitor`, `competitor_tracking`, `hiring_intelligence`, `show_hn_analysis` pre-configure the actor for the job. No 20-field configuration screen.
6. **Daily brand-mention monitoring** — schedule it, get only the new mentions since last run, formatted for Slack/Discord webhooks.
7. **Author reputation enrichment** — karma, account age, submission count for every result.
8. **GitHub repo signals** — stars, primary language, last-push date when a result links to a repo.
9. **"Who Is Hiring" parser** — structured extraction of company, location, remote mode, and apply URL from the monthly HN hiring threads.
10. **Show HN traction analytics** — aggregate report (count, average points/comments/signal, top 5) saved to the run's key-value store.
11. **Query expansion** — type `"AI"` and the actor automatically searches `"artificial intelligence"`, `"AI"`, and `"LLM"`, deduplicating by HN object ID.

Sort by relevance or date. Restrict to specific content types. Set point and comment thresholds. Scope to date ranges. Filter by author. All structured. All cheap. All scheduled-friendly.

***

### Key features

#### Intelligence layer (always-on)

- **Signal Score (0–100)** — composite of engagement, velocity, author influence, and recency. Sort the dataset by it for the highest-signal results first. `signalLevel` (`high` / `medium` / `low`) for one-glance filtering.
- **Velocity scoring** — `pointsPerHour`, `commentsPerHour`, and an `isTrending` boolean (true when < 24h old AND ≥ 5 pts/hr or ≥ 2 comments/hr).
- **Author influence score** — 0–100 derived from karma (50%), account age (25%), submissions (25%), all log-normalized. Available when `includeAuthorProfile: true`.
- **`whyThisMatters`** — plain-English explanation of why a result is high-signal, generated deterministically from the contributing fields. Null on low-signal results.
- **`suggestedAction`** — `engage` / `investigate` / `monitor` / `ignore`. Decision-tier output that bridges data → action.
- **`feedbackType`** — heuristic classification: `complaint` / `feature_request` / `praise` / `question`. Built from regex patterns on comment + story text.
- **`influencerTier` + `isInfluencerMention`** — tiers the author influence score: `top_1_percent` (≥90), `top_10_percent` (≥70), `active` (≥40), `new` (<40). The boolean fires on top-10%-or-better.
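
The threshold rules in this list are simple enough to restate as code. The sketch below only mirrors the documented cut-offs. The function names are hypothetical, and the trending rule is read as "under 24 hours old AND (at least 5 points/hr OR at least 2 comments/hr)", which is an assumption about operator precedence.

```python
def is_trending(age_hours, points_per_hour, comments_per_hour):
    """Documented rule, read as: fresh AND (fast points OR fast comments)."""
    return age_hours < 24 and (points_per_hour >= 5 or comments_per_hour >= 2)


def influencer_tier(influence_score):
    """Map a 0-100 author influence score to the documented tier names."""
    if influence_score >= 90:
        return "top_1_percent"
    if influence_score >= 70:
        return "top_10_percent"
    if influence_score >= 40:
        return "active"
    return "new"


def is_influencer_mention(influence_score):
    """True for top-10%-or-better authors, i.e. a score of 70 or more."""
    return influence_score >= 70


print(is_trending(3, 6.0, 0.5), influencer_tier(72), is_influencer_mention(72))
```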

#### Analysis modes

- **Trend detection** — `detectTrends: true` runs two date-bounded searches (current `trendWindowDays` window + the previous equal-length window), extracts 1/2/3-grams from titles + story bodies + comments, and surfaces rising terms with `trendScore` (40% growth + 30% mentions + 20% avg signal + 10% unique authors). Writes a `TREND_SUMMARY` key-value record AND pushes top trends as `recordType: 'trend'` dataset records.
- **Historical compare** — `compareMode: previous_period` auto-shifts dateFrom/dateTo back by the same length for period B. `compareMode: explicit` uses four date inputs. Outputs a `COMPARISON_SUMMARY` KV record with delta metrics + topRisingTerms + topDecliningTerms.
- **Thread expansion** — `expandThreads: true` walks the reply tree of every story result via the HN Firebase API and emits each comment as a separate `recordType: 'thread_comment'` dataset record. `threadMaxDepth` (default 3) and `threadMaxComments` (default 100) cap the recursion. Bundled in the existing per-result charge — no extra event.
- **Heuristic insights** — `includeInsights: true` adds `insightSummary`, `sentiment` (bullish/bearish/mixed/neutral), `riskLevel` (high/medium/low), and `keyThemes` array to every result. Pure regex + keyword matching; no LLM.
- **Adaptive auto-pagination** — `autoSplitLargeQueries: true` halves the date range when an Algolia query would exceed 900 hits, fetching each bucket separately and deduping. Capped by `maxSplitRuns` (default 20).
- **GitHub correlation** — `correlateGithub: true` adds `githubFreshness` (active/recent/stale/dormant), `githubRepoMaturity` (nascent/emerging/established/mature), and a composite `githubSignal` (high/medium/low) on top of the basic stars/language/pushedAt enrichment.
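
For the `previous_period` compare mode, the date shift can be illustrated as follows. This is a sketch under two assumptions that the text does not spell out: both bounds are inclusive, and period B ends the day before period A starts. The `previous_period` helper name is hypothetical.

```python
from datetime import date, timedelta


def previous_period(date_from, date_to):
    """Return the equal-length window immediately before [date_from, date_to].

    Dates are YYYY-MM-DD strings; both bounds are treated as inclusive (assumption).
    """
    start = date.fromisoformat(date_from)
    end = date.fromisoformat(date_to)
    span = end - start                      # inclusive length is span.days + 1
    prev_end = start - timedelta(days=1)    # day before period A begins
    prev_start = prev_end - span
    return prev_start.isoformat(), prev_end.isoformat()


print(previous_period("2026-04-01", "2026-04-30"))
```

Because the shift preserves length in days rather than calendar months, a 30-day April window maps to March 2 through March 31, not to the whole of March.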

#### Search + filtering

- **Full-text search** across stories, comments, polls, Show HN, Ask HN, and front-page posts via the Algolia HN API
- **Sort by relevance or date** — best-match for research, newest-first for monitoring
- **Content type filtering** — stories, comments, polls, Show HN, Ask HN, or front page
- **Engagement thresholds** — minimum points, minimum comments
- **Date range filtering** — `YYYY-MM-DD` start + end (UTC)
- **Author filter** — find every post and comment by a specific HN username
- **Up to 1,000 results per query** with automatic pagination (50 hits per page); larger archives need `autoSplitLargeQueries`
- **Query expansion** — `expandQuery: true` runs short forms like `"AI"`, `"k8s"`, `"agents"` against their canonical synonyms (`"artificial intelligence"`, `"Kubernetes"`, `"autonomous agents"`) and deduplicates results

#### Modes & output levels

- **One-click modes** — `mode: "brand_monitor"` / `"competitor_tracking"` / `"hiring_intelligence"` / `"show_hn_analysis"` configure the actor for common jobs. Your explicit input fields always win over the preset.
- **Output levels** — `outputLevel: "basic" | "enriched" | "intelligence"` is shorthand for the enrichment toggle bundle.

#### Monitoring & alerts

- **Daily brand-mention monitor** — `alertOnNewOnly: true` tracks IDs across runs and only outputs new mentions; pair with `alertWebhookUrl` to push a Slack/Discord alert
- **Smart alerts** — `alertMode: "smart"` filters webhook payloads to mentions with signal score ≥ 50 only. Cuts low-quality noise from the alert channel; raw data still appears in the dataset.

#### Enrichments (opt-in)

- **Author profile** — `includeAuthorProfile: true` adds karma, account age (days), submission count, and 0–100 influence score via the HN Firebase API
- **GitHub repo signals** — `enrichGithubLinks: true` adds stars, primary language, last-push timestamp when a result links to a GitHub repository
- **Who Is Hiring parser** — `parseHiringComments: true` extracts company, location, remote mode, and apply URL from comment bodies
- **Show HN traction summary** — auto-fires when `tags: show_hn`, writes count, average points/comments/signal, and top 5 to the `SHOW_HN_SUMMARY` key-value record

#### Reliability & cost

- **Built-in retry + circuit breakers** — Algolia 5xx and network blips retry with backoff; enrichment loops disable themselves after 5 consecutive failures so a dead upstream never burns your credit
- **No HN API key required** — works out of the box; optional `GITHUB_TOKEN` env var raises the GitHub rate limit from 60/hr to 5,000/hr
- **Multiple export formats** — JSON, CSV, Excel, XML, HTML from the Apify dataset

***

### Pricing (pay-per-event)

Pay only for what you actually fetch. Two events:

| Event | Price | When it fires |
|-------|-------|---------------|
| `apify-actor-start` | **$0.00005** | Once when each run starts |
| `story-fetched` | **$0.005** | Once per Hacker News result returned |

A 100-result search costs **$0.50005**. A 1,000-result search costs **$5.00005**. A daily brand-monitor that finds 5 new mentions per day costs **$0.78 per month**. Apify's spending-limit settings cap your bill — when you hit them, the actor stops mid-run cleanly.

The Algolia HN API itself is free, so there are no external data fees. The Apify Free plan includes $5 of monthly platform credits, which at $0.005 per result covers roughly 1,000 results.
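
The figures above follow directly from the two event prices. The short calculation below reproduces them. It assumes one start event per run, and a 31-day month for the brand-monitor example, which is the run count that yields the quoted $0.78. The helper names are hypothetical.

```python
from decimal import Decimal

START_FEE = Decimal("0.00005")   # apify-actor-start, charged once per run
PER_RESULT = Decimal("0.005")    # story-fetched, charged per result returned


def run_cost(results):
    """Cost in USD of a single run that returns `results` items."""
    return START_FEE + PER_RESULT * results


def monthly_cost(results_per_run, runs=31):
    """Cost in USD of `runs` scheduled runs (31 daily runs assumed here)."""
    return run_cost(results_per_run) * runs


print(run_cost(100))     # 0.50005
print(run_cost(1000))    # 5.00005
print(monthly_cost(5))   # 0.77655
```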

***

### How to use Hacker News Search

#### Using the Apify Console

1. Go to the [Hacker News Search actor page](https://apify.com/ryanclinton/hackernews-search) on Apify.
2. Click **Try for free** to open the actor in the Console.
3. Enter your search query (e.g., `artificial intelligence`, `"large language models"`, `Rust programming`).
4. Choose your sort order — **Relevance** for best matches or **Date (newest first)** for recent content.
5. Optionally toggle enrichments: author profile, GitHub links, Who Is Hiring parser.
6. For monitoring: enable **Alert on new results only** and paste your **Slack / webhook URL**.
7. Set your maximum results (default 100; up to 1,000 per query, or up to 10,000 with `autoSplitLargeQueries` enabled).
8. Click **Start** and wait for the run to finish.
9. Switch to the **Dataset** tab to preview, download, or export results.

#### Scheduling for daily brand monitoring

1. Configure inputs once: query, `alertOnNewOnly: true`, `alertWebhookUrl: https://hooks.slack.com/services/...`.
2. Save the configuration as an Apify **task**.
3. Open **Schedules** in the Apify Console, point it at the task, and choose your cadence (`0 9 * * *` for 09:00 UTC daily).
4. Each run posts only the new mentions since the prior run to your webhook. The first run primes the state and posts nothing.
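
The "new since last run" behaviour in step 4 amounts to a set difference against stored IDs. The sketch below shows the idea only. The `select_new` helper is hypothetical, and where the actor actually persists its state between runs is not covered here.

```python
def select_new(seen_ids, current_ids):
    """Return (ids_to_post, updated_seen_ids).

    seen_ids is None on the very first run, which primes the state
    and posts nothing, as described in step 4.
    """
    current = list(dict.fromkeys(current_ids))  # drop duplicates, keep order
    if seen_ids is None:
        return [], set(current)
    to_post = [item_id for item_id in current if item_id not in seen_ids]
    return to_post, set(seen_ids) | set(current)


# First run: nothing is posted, state is primed.
posted, state = select_new(None, ["101", "102"])
print(posted, sorted(state))

# Second run: only the unseen ID is posted.
posted, state = select_new(state, ["102", "103"])
print(posted, sorted(state))
```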

#### Using the API

You can start a run programmatically and retrieve results via the Apify API. See the [API & Integration](#api--integration) section below for ready-to-use Python, JavaScript, and cURL examples.

***

### Input parameters

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `query` | String | Yes | `artificial intelligence` | Search query to find on Hacker News |
| `mode` | String | No | `search` | Pre-configured workflow: `search`, `discover`, `brand_monitor`, `competitor_tracking`, `hiring_intelligence`, `show_hn_analysis`. Your explicit fields always win over the preset. |
| `outputLevel` | String | No | `basic` | Shorthand for enrichment toggles: `basic` / `enriched` / `intelligence`. Intelligence scoring runs on every result regardless. |
| `searchType` | String | No | `relevance` | Sort order: `relevance` (best matches) or `date` (newest first) |
| `tags` | String | No | *(all types)* | Content type filter: `story`, `comment`, `poll`, `show_hn`, `ask_hn`, or `front_page` |
| `author` | String | No | *(any)* | Filter results by HN username (case-sensitive) |
| `minPoints` | Integer | No | *(none)* | Minimum number of upvotes/points |
| `minComments` | Integer | No | *(none)* | Minimum number of comments |
| `dateFrom` | String | No | *(none)* | Start date in `YYYY-MM-DD` format |
| `dateTo` | String | No | *(none)* | End date in `YYYY-MM-DD` format |
| `maxResults` | Integer | No | `100` | Maximum deduplicated results to return. Default 100. Algolia HN caps at 1,000 hits per single query — values up to 10,000 are accepted but only useful when `autoSplitLargeQueries: true`. The actor logs a warning if exceeded without auto-split. |
| `expandQuery` | Boolean | No | `false` | Expands known short forms (e.g., `"AI"` → `"artificial intelligence"`, `"AI"`, `"LLM"`) and dedupes results. Triples API calls when active. |
| `includeAuthorProfile` | Boolean | No | `false` | Adds karma, account age (days), submission count, and 0–100 author influence score |
| `enrichGithubLinks` | Boolean | No | `false` | When a result links to a GitHub repo, adds stars, language, and last-push timestamp |
| `correlateGithub` | Boolean | No | `false` | Adds `githubFreshness` / `githubRepoMaturity` / `githubSignal` classifications. Auto-enables `enrichGithubLinks`. |
| `parseHiringComments` | Boolean | No | `false` | Extracts company / location / remote mode / apply URL from "Who Is Hiring" comments |
| `expandThreads` | Boolean | No | `false` | Walks the reply tree of each story result and emits `recordType: 'thread_comment'` records |
| `threadMaxDepth` | Integer | No | `3` | Max recursion depth for thread expansion |
| `threadMaxComments` | Integer | No | `100` | Hard cap on total thread comments emitted per run |
| `includeInsights` | Boolean | No | `false` | Adds `insightSummary` / `sentiment` / `riskLevel` / `keyThemes` per result (heuristic, no LLM) |
| `detectTrends` | Boolean | No | `false` | Runs current vs previous N-day windows, extracts rising n-grams, writes `TREND_SUMMARY` + `recordType: 'trend'` records |
| `trendWindowDays` | Integer | No | `7` | Window length (days) for each side of the trend comparison |
| `trendMinMentions` | Integer | No | `3` | Minimum current-window mentions for a term to qualify as a trend |
| `trendMinGrowthPercent` | Integer | No | `100` | Minimum growth % vs baseline (100 = doubled) |
| `trendMaxTerms` | Integer | No | `50` | Maximum trends to surface |
| `compareMode` | String | No | `none` | `none` / `previous_period` (auto-shift) / `explicit` (use the four `compareDate*` inputs) |
| `compareDateFromA` / `ToA` / `FromB` / `ToB` | String | No | *(none)* | Explicit period dates when `compareMode: explicit` |
| `autoSplitLargeQueries` | Boolean | No | `false` | Recursively halves the date range when a query exceeds 900 hits, fetching each bucket separately |
| `maxSplitRuns` | Integer | No | `20` | Maximum date-range buckets to fetch in auto-split mode |
| `alertOnNewOnly` | Boolean | No | `false` | Tracks IDs across runs and only outputs items new since the last run |
| `alertWebhookUrl` | String (secret) | No | *(none)* | Slack/Discord/HTTP webhook URL — POSTs new mentions when `alertOnNewOnly` is enabled |
| `alertMode` | String | No | `all` | `all` posts every new mention; `smart` filters to signal score ≥ 50 |

#### Modes (one-click workflows)

Pick a `mode` and the actor configures itself for the job. Your explicit input fields always win over the preset.

| Mode | What it sets | Job |
|------|--------------|-----|
| `search` | Nothing — flexible defaults | Default; you configure everything |
| `discover` | `tags=front_page`, `searchType=date`, `detectTrends=true`, `includeInsights=true` | Zero-input front-page exploration with trends + insights pre-applied (clear the query for the full feed) |
| `brand_monitor` | `searchType=date`, `alertOnNewOnly=true`, `includeAuthorProfile=true`, `alertMode=all` | Daily brand alerts to Slack/Discord |
| `competitor_tracking` | Same as `brand_monitor`, but with `alertMode=smart` | Smart-filtered alerts (only signal ≥ 50) |
| `hiring_intelligence` | `tags=comment`, `author=whoishiring`, `parseHiringComments=true`, `maxResults=500` | Monthly Who Is Hiring → structured jobs |
| `show_hn_analysis` | `tags=show_hn`, `searchType=date`, `enrichGithubLinks=true` | Show HN traction snapshots |
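
The "explicit fields win over the preset" rule can be pictured as a dictionary merge in which user input is applied last. The preset values below are copied from the table. The `MODE_PRESETS` and `resolve_input` names are hypothetical and only illustrate the precedence, not the actor's internals.

```python
# Preset values taken from the modes table above.
_BRAND = {"searchType": "date", "alertOnNewOnly": True,
          "includeAuthorProfile": True, "alertMode": "all"}

MODE_PRESETS = {
    "search": {},
    "discover": {"tags": "front_page", "searchType": "date",
                 "detectTrends": True, "includeInsights": True},
    "brand_monitor": dict(_BRAND),
    "competitor_tracking": {**_BRAND, "alertMode": "smart"},
    "hiring_intelligence": {"tags": "comment", "author": "whoishiring",
                            "parseHiringComments": True, "maxResults": 500},
    "show_hn_analysis": {"tags": "show_hn", "searchType": "date",
                         "enrichGithubLinks": True},
}


def resolve_input(user_input):
    """Merge the mode preset with user input; explicit fields override the preset."""
    preset = MODE_PRESETS[user_input.get("mode", "search")]
    return {**preset, **user_input}


resolved = resolve_input({"mode": "competitor_tracking", "query": "CompetitorName",
                          "alertMode": "all"})
print(resolved["alertMode"], resolved["searchType"])
```

In this example the explicit `alertMode: "all"` overrides the preset's `smart`, while `searchType` still comes from the preset.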

#### Output levels

Shorthand for enrichment toggles:

| Level | Effect |
|-------|--------|
| `basic` | Raw search results only |
| `enriched` | Auto-enables `includeAuthorProfile` + `enrichGithubLinks` (where not explicitly set) |
| `intelligence` | Same as `enriched` (intelligence scoring runs on every result regardless) |

#### What is signalScore?

`signalScore` is a 0–100 metric that ranks how important a Hacker News mention is. It is the actor's signature output field. Higher = more important.

It combines four components, log-normalized so single outliers cannot dominate:

| Component | Weight | What it measures |
|-----------|--------|------------------|
| Engagement | 40% | `log10(points + 2 × comments)` — saturates around 1,000 weighted engagement |
| Velocity | 25% | `log10(pointsPerHour)` — saturates around 100 pts/hr |
| Author influence | 20% | Author's 0–100 influence score (or 0.3 baseline if not enriched) |
| Recency | 15% | Linear decay over 168 hours (1 week) |

`signalLevel` tiers it: `high` (≥70), `medium` (40–69), `low` (<40). Sort by `signalScore DESC` for the highest-leverage results first.
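
As a rough illustration of how the four weighted components could combine, here is a minimal Python sketch. The weights, saturation points, 0.3 baseline, 168-hour decay, and tier thresholds come from the text above. The exact normalization (`log10(1 + x)` scaled to the saturation point and clamped to 1) and the function names are assumptions, so the output will not match the actor's scores exactly.

```python
import math

# Component weights as documented in the table above.
WEIGHTS = {"engagement": 0.40, "velocity": 0.25, "author": 0.20, "recency": 0.15}


def log_norm(value, saturation):
    """Assumed normalization: log10(1 + value), scaled so `saturation` maps to 1.0."""
    if value <= 0:
        return 0.0
    return min(1.0, math.log10(1 + value) / math.log10(1 + saturation))


def sketch_signal_score(points, comments, points_per_hour, age_hours,
                        author_influence=None):
    """Approximate a 0-100 score from the four documented components."""
    components = {
        "engagement": log_norm(points + 2 * comments, 1_000),
        "velocity": log_norm(points_per_hour, 100),
        # 0.3 baseline when the author profile was not enriched.
        "author": 0.3 if author_influence is None else author_influence / 100,
        # Linear decay to zero over 168 hours (one week).
        "recency": max(0.0, 1 - age_hours / 168),
    }
    total = sum(WEIGHTS[name] * value for name, value in components.items())
    return round(100 * total, 1)


def signal_level(score):
    """Map a score to the documented tiers."""
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"
```

Because each component is clamped to the 0–1 range before weighting, no single outlier (for example a story with tens of thousands of points) can push the total past its 40% share.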

#### What is trendScore?

`trendScore` is a 0–100 metric on `recordType: 'trend'` records that ranks how strongly a keyword is rising. It combines:

| Component | Weight | What it measures |
|-----------|--------|------------------|
| Growth rate | 40% | Percentage growth in mentions vs the baseline window |
| Current mentions | 30% | `log10(mentionsCurrent)` — absolute volume |
| Avg signal score | 20% | How high-quality the mentions are |
| Unique authors | 10% | `log10(uniqueAuthors)` — breadth of adoption |

Pair with `trendStage` (`emerging` / `rising` / `peaked` / `declining`) to know whether a trend is just starting, accelerating, plateauing, or fading.

#### What is suggestedAction?

`suggestedAction` is the actor's decision-tier output. It tells you what to do with a result:

| Value | When it fires | What to do |
|-------|---------------|------------|
| `engage` | Signal score ≥ 50 + question / feature\_request / praise | Reply, draft a response, follow up |
| `investigate` | Signal score ≥ 50 + bearish sentiment / high risk / complaint | Open a ticket, route to support or PM |
| `monitor` | Medium signal | Watch for escalation; no immediate action |
| `ignore` | Signal score < 25 | Skip — low-value or off-topic |

This is what makes the actor LLM-agent-native: downstream automations branch on `suggestedAction` directly, with no parsing of prose.
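
A downstream automation might branch on the field like this. This is a minimal sketch: the destination names in `ROUTES` and the `route_records` helper are hypothetical, and only the field names (`recordType`, `suggestedAction`, `objectID`) come from the output schema.

```python
from collections import defaultdict

# Hypothetical destinations; replace with your own queues or channels.
ROUTES = {
    "engage": "community-replies",
    "investigate": "support-triage",
    "monitor": "watchlist",
    "ignore": None,  # dropped
}


def route_records(records):
    """Group search results by destination based on `suggestedAction`."""
    routed = defaultdict(list)
    for record in records:
        # Only search hits are routed here; trend and thread records are skipped.
        if record.get("recordType") != "result":
            continue
        # A null or unknown action maps to None and is dropped.
        destination = ROUTES.get(record.get("suggestedAction"))
        if destination is not None:
            routed[destination].append(record["objectID"])
    return dict(routed)
```

Because `suggestedAction` can be `null`, the lookup treats a missing or unknown value the same way as `ignore`.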

#### JSON input example — basic search

```json
{
    "query": "large language models",
    "searchType": "date",
    "tags": "story",
    "minPoints": 50,
    "minComments": 10,
    "dateFrom": "2026-01-01",
    "dateTo": "2026-12-31",
    "maxResults": 200
}
```

#### JSON input example — daily brand monitor with Slack alerts

```json
{
    "query": "MyProduct",
    "searchType": "date",
    "alertOnNewOnly": true,
    "alertWebhookUrl": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXX",
    "includeAuthorProfile": true,
    "maxResults": 100
}
```

Schedule this with cron `0 9 * * *` and you get a daily Slack message at 09:00 UTC listing only the mentions you haven't seen before. `includeAuthorProfile` adds each author's karma so you can spot the high-signal voices.

#### JSON input example — Who Is Hiring parser

```json
{
    "query": "remote",
    "tags": "comment",
    "author": "whoishiring",
    "parseHiringComments": true,
    "maxResults": 500
}
```

The HN account `whoishiring` posts the monthly hiring thread on the first weekday of each month. This run pulls every "remote" comment from that thread and parses it into structured columns: `hiringCompany`, `hiringLocation`, `hiringRemote`, `hiringApplyUrl`. Drops straight into a recruiting CRM.

#### JSON input example — Show HN traction snapshot

```json
{
    "query": "AI",
    "tags": "show_hn",
    "searchType": "date",
    "dateFrom": "2026-04-01",
    "maxResults": 100
}
```

This run pulls every recent Show HN post about AI. After the dataset is written, the actor saves an aggregate summary (count, average points, average comments, top 5 by points) to the run's `SHOW_HN_SUMMARY` key-value record. Open it from the **Storage → Key-value store** tab in the Apify Console.

#### JSON input example — trend detection (rising keywords)

```json
{
    "query": "AI",
    "detectTrends": true,
    "trendWindowDays": 7,
    "trendMinMentions": 3,
    "trendMinGrowthPercent": 100,
    "maxResults": 200
}
```

Runs two date-bounded searches (last 7 days vs the previous 7 days), extracts 1/2/3-grams from titles + story bodies + comments, filters out stop words and HN boilerplate, writes a `TREND_SUMMARY` key-value record, and pushes the top trends (up to `trendMaxTerms`) as `recordType: 'trend'` dataset records. Each trend carries `mentionsCurrent`, `mentionsPrevious`, `growthPercent`, `avgSignalScore`, `uniqueAuthors`, and a composite `trendScore` (0–100).
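
The n-gram counting and growth filter can be pictured with the following minimal sketch. The stop-word list, the tokenizer regex, counting each term once per item, and treating a term with no baseline mentions as having one are all assumptions made for illustration. The actor's real lists and edge-case handling are not documented here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption, not the actor's list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "with"}


def ngrams(text, max_n=3):
    """Yield 1-, 2-, and 3-grams built from the non-stop-word tokens."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if t not in STOP_WORDS]
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def rising_terms(current_texts, previous_texts,
                 min_mentions=3, min_growth_percent=100):
    """Compare term counts between two windows and keep the rising ones."""
    # Count each term at most once per item, so mentions = number of items.
    current = Counter(g for text in current_texts for g in set(ngrams(text)))
    previous = Counter(g for text in previous_texts for g in set(ngrams(text)))
    trends = []
    for term, now in current.items():
        if now < min_mentions:
            continue
        before = previous.get(term, 0)
        # Assumption: divide by at least 1 so brand-new terms do not divide by zero.
        growth = (now - before) / max(before, 1) * 100
        if growth >= min_growth_percent:
            trends.append({"term": term, "mentionsCurrent": now,
                           "mentionsPrevious": before, "growthPercent": growth})
    return sorted(trends, key=lambda t: t["growthPercent"], reverse=True)
```

Here `min_mentions` and `min_growth_percent` play the role of `trendMinMentions` and `trendMinGrowthPercent` from the input above.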

#### JSON input example — historical comparison

```json
{
    "query": "rust",
    "compareMode": "explicit",
    "compareDateFromA": "2026-04-01",
    "compareDateToA": "2026-04-30",
    "compareDateFromB": "2026-03-01",
    "compareDateToB": "2026-03-31",
    "maxResults": 400
}
```

Fetches the same query against two date ranges, writes a `COMPARISON_SUMMARY` KV record with `mentionsDelta`, `mentionsGrowthPercent`, `avgSignalScoreDelta`, `topRisingTerms`, and `topDecliningTerms`. Use `compareMode: "previous_period"` to auto-shift backward by the same length without specifying period B explicitly.

#### JSON input example — thread expansion (research mode)

```json
{
    "query": "rust async",
    "tags": "story",
    "minPoints": 100,
    "expandThreads": true,
    "threadMaxDepth": 3,
    "threadMaxComments": 100,
    "maxResults": 5
}
```

For each of the top 5 matching stories, fetches the reply tree via the HN Firebase API (capped at depth 3 and 100 total thread comments per run) and emits each comment as a separate `recordType: 'thread_comment'` record with `storyId`, `commentId`, `parentId`, `depth`, `author`, `text`, `createdAt`, and `hnUrl`. Thread comments are bundled into the existing per-result charge, so no new event is billed.
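
A bounded breadth-first walk of this kind could look like the sketch below. It uses the public HN Firebase item endpoint and its `kids`, `by`, `text`, `deleted`, and `dead` fields. The `expand_thread` helper, the injectable `fetch` parameter, numbering top-level comments as depth 1, and skipping deleted or dead items are assumptions for illustration. The sketch also applies the comment cap per story, whereas the actor applies it per run.

```python
import json
from collections import deque
from urllib.request import urlopen

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_item(item_id):
    """Fetch one item from the public HN Firebase API."""
    with urlopen(ITEM_URL.format(id=item_id), timeout=10) as response:
        return json.load(response)


def expand_thread(story_id, max_depth=3, max_comments=100, fetch=fetch_item):
    """Breadth-first walk of a story's `kids`, bounded by depth and count."""
    story = fetch(story_id) or {}
    # Queue entries are (comment id, depth, parent id).
    queue = deque((kid, 1, story_id) for kid in story.get("kids", []))
    comments = []
    while queue and len(comments) < max_comments:
        comment_id, depth, parent_id = queue.popleft()
        item = fetch(comment_id)
        if not item or item.get("deleted") or item.get("dead"):
            continue
        comments.append({
            "recordType": "thread_comment",
            "storyId": story_id,
            "commentId": comment_id,
            "parentId": parent_id,
            "depth": depth,
            "author": item.get("by"),
            "text": item.get("text"),
        })
        # Only descend while we are above the depth limit.
        if depth < max_depth:
            queue.extend((kid, depth + 1, comment_id)
                         for kid in item.get("kids", []))
    return comments
```

Breadth-first order means that when the cap is hit, the shallow (usually most visible) comments are kept and the deepest replies are the ones dropped.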

#### JSON input example — adaptive auto-pagination (>1000 results)

```json
{
    "query": "kubernetes",
    "dateFrom": "2025-01-01",
    "dateTo": "2026-04-30",
    "autoSplitLargeQueries": true,
    "maxSplitRuns": 20,
    "maxResults": 5000
}
```

When a date range would exceed Algolia's 1,000-hit cap, the actor recursively halves the range until each bucket is below 900 hits, fetches each bucket independently, and dedupes by HN object ID across buckets.
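
The recursive halving can be sketched as follows. The `split_range` helper and the `count_hits` callback (which would ask the search API how many hits a date range has) are hypothetical, and mapping `max_buckets` to `maxSplitRuns` is an assumption about how the budget is applied.

```python
from datetime import datetime, timedelta


def split_range(start, end, count_hits, threshold=900, max_buckets=20):
    """Halve [start, end) until each bucket has at most `threshold` hits."""
    buckets = []
    pending = [(start, end)]  # stack of ranges still to be checked
    while pending:
        lo, hi = pending.pop()
        too_big = count_hits(lo, hi) > threshold
        can_split = (hi - lo) > timedelta(seconds=1)
        # Splitting turns this one range into two; stop when that would
        # exceed the bucket budget.
        out_of_budget = len(buckets) + len(pending) + 2 > max_buckets
        if too_big and can_split and not out_of_budget:
            mid = lo + (hi - lo) / 2
            # Push the later half first so the earlier half is handled next,
            # which keeps the final bucket list in chronological order.
            pending.append((mid, hi))
            pending.append((lo, mid))
        else:
            buckets.append((lo, hi))
    return buckets
```

Each returned bucket would then be fetched as its own query, with results deduplicated by `objectID` across buckets. A bucket that is still over the threshold when the budget runs out is returned as-is, so its results would be truncated at the API cap.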

#### JSON input example — GitHub-enriched developer story search

```json
{
    "query": "Rust",
    "tags": "story",
    "minPoints": 100,
    "enrichGithubLinks": true,
    "maxResults": 50
}
```

When a story's submitted URL is a GitHub repo, you get the star count, primary language, and last-pushed timestamp inline. Set the `GITHUB_TOKEN` environment variable in the actor's run options to raise the GitHub API rate limit from 60/hr to 5,000/hr.

#### Tips

- Leave `tags` empty to search across all content types.
- Combine `minPoints` and `minComments` to surface only high-engagement discussions.
- Use `searchType: "date"` with `dateFrom` / `dateTo` for chronological feeds.
- Wrap your query in double quotes for exact phrase matching: `"machine learning"`.
- For brand monitoring, narrow `query` to a unique brand name (`"Acme Corp"`, not `acme`).
- Start with a small `maxResults` value (10–20) to test filters before scaling up.
- Pair `parseHiringComments: true` with `author: "whoishiring"` to get clean monthly job feeds.

***

### Output

Each result is pushed to the default Apify dataset as a JSON object:

```json
{
    "recordType": "result",
    "objectID": "39281042",
    "title": "Show HN: Open-source LLM benchmark for real-world coding tasks",
    "url": "https://github.com/example/llm-benchmark",
    "author": "techfounder",
    "points": 342,
    "numComments": 87,
    "createdAt": "2026-04-15T14:23:01.000Z",
    "type": "show_hn",
    "storyText": null,
    "commentText": null,
    "parentId": null,
    "storyId": null,
    "hnUrl": "https://news.ycombinator.com/item?id=39281042",
    "signalScore": 87.2,
    "signalLevel": "high",
    "velocityScore": 0.71,
    "pointsPerHour": 18.4,
    "commentsPerHour": 4.7,
    "isTrending": true,
    "authorKarma": 12450,
    "authorAccountAgeDays": 4520,
    "authorSubmissionCount": 187,
    "authorInfluenceScore": 78.4,
    "isInfluencerMention": true,
    "influencerTier": "top_10_percent",
    "githubStars": 3400,
    "githubLanguage": "Rust",
    "githubPushedAt": "2026-04-12T08:14:22Z",
    "feedbackType": null,
    "whyThisMatters": "High-signal mention from a high-influence author with trending velocity (18.4 pts/hr) discussing developer-experience and ai with positive reception.",
    "suggestedAction": "engage"
}
```

#### Output fields

| Field | Type | Description |
|-------|------|-------------|
| `objectID` | String | Unique Hacker News item ID |
| `title` | String or null | Post title (null for comments) |
| `url` | String or null | External link URL (null for text posts and comments) |
| `author` | String | HN username of the poster |
| `points` | Number | Number of upvotes |
| `numComments` | Number | Number of comments |
| `createdAt` | String | ISO 8601 timestamp |
| `type` | String | Item type: `story`, `comment`, `poll`, `show_hn`, or `ask_hn` |
| `storyText` | String or null | Body text for Ask HN and text-only posts |
| `commentText` | String or null | Comment body (only for comment results) |
| `parentId` | String or null | Parent item ID (for comments) |
| `storyId` | String or null | Top-level story ID (for comments) |
| `hnUrl` | String | Direct link to the HN discussion |
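| `recordType` | String | `"result"` for search hits. Other record types (`thread_comment`, `thread_summary`, `trend`, `trend_summary`, `comparison_summary`) appear when the matching features are enabled. |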
| `signalScore` | Number 0–100 | Composite signal score (engagement 40% + velocity 25% + author influence 20% + recency 15%) |
| `signalLevel` | String | Tier of signalScore: `high` (≥70) / `medium` (40–69) / `low` (<40) |
| `velocityScore` | Number 0–1 | Log-normalized engagement velocity (saturates around 100 pts/hr) |
| `pointsPerHour` | Number | Points earned per hour since posting (capped at 168-hour window) |
| `commentsPerHour` | Number | Comments per hour since posting (capped at 168-hour window) |
| `isTrending` | Boolean | True when item is < 24h old AND ≥ 5 pts/hr OR ≥ 2 comments/hr |
| `authorKarma` | Number or null | Author's HN karma — only present when `includeAuthorProfile: true` |
| `authorAccountAgeDays` | Number or null | Days since the author's HN account was created |
| `authorSubmissionCount` | Number or null | Total stories + comments the author has submitted |
| `authorInfluenceScore` | Number 0–100 or null | Composite of karma (50%) + account age (25%) + submissions (25%) |
| `githubStars` | Number or null | Star count when `url` is a GitHub repo and `enrichGithubLinks: true` |
| `githubLanguage` | String or null | Primary language of the linked GitHub repo |
| `githubPushedAt` | String or null | ISO timestamp of the last commit pushed to the linked repo |
| `hiringCompany` | String or null | Company parsed from a Who Is Hiring comment when `parseHiringComments: true` |
| `hiringLocation` | String or null | Location parsed from a hiring comment |
| `hiringRemote` | String or null | Remote / Hybrid / On-site flag parsed from a hiring comment |
| `hiringApplyUrl` | String or null | Apply link or `mailto:` address parsed from a hiring comment |
| `whyThisMatters` | String or null | Plain-English reason this result is high-signal (deterministic, built from contributing fields) |
| `suggestedAction` | String or null | `engage` / `investigate` / `monitor` / `ignore` — decision tier |
| `feedbackType` | String or null | `complaint` / `feature_request` / `praise` / `question` — heuristic classification |
| `isInfluencerMention` | Boolean or null | True when author influence score is ≥ 70 (top 10%) |
| `influencerTier` | String or null | `top_1_percent` / `top_10_percent` / `active` / `new` |

#### Show HN summary output

When you search with `tags: "show_hn"`, the actor additionally writes one aggregate record to the run's key-value store under the key `SHOW_HN_SUMMARY`:

```json
{
    "type": "show_hn_summary",
    "query": "AI",
    "totalPosts": 100,
    "avgPoints": 84,
    "avgComments": 41,
    "top5ByPoints": [
        { "title": "Show HN: ChatGPT Plus alternative…", "points": 612, "hnUrl": "https://news.ycombinator.com/item?id=…" }
    ]
}
```

Retrieve it from the **Storage → Key-value store → SHOW\_HN\_SUMMARY** tab, or via the API:
`https://api.apify.com/v2/key-value-stores/<storeId>/records/SHOW_HN_SUMMARY`.

***

### Use cases

- **Daily brand and product mention alerts** — schedule with `alertOnNewOnly: true` + Slack webhook to know the moment your name hits HN.
- **Competitor watch** — same setup, different query. Track each competitor on its own schedule.
- **Show HN traction tracking for makers** — daily snapshot of how Show HN posts in your category are trending.
- **Recruiting from "Who Is Hiring"** — monthly run on the `whoishiring` thread with `parseHiringComments: true` produces a clean leads CSV.
- **Influencer / expert tracking** — pair `author: "patio11"` (or any username) with `searchType: "date"` to follow specific high-signal HN users.
- **Technology trend monitoring** — track emerging topics with date ranges to see adoption curves.
- **Repo discovery** — `enrichGithubLinks: true` on a Rust/Python/Go query surfaces the highest-engagement repos developers are sharing right now.
- **Author credibility filtering** — `includeAuthorProfile: true` lets you ignore low-karma noise and focus on the voices the community trusts.
- **Content curation** — filter by points and comments to build a curated link feed.
- **Sentiment / NLP datasets** — collect comments about a topic for downstream sentiment scoring or topic modelling.
- **Academic research** — historical archive back to 2007 with date-range filtering.

***

### API & Integration

Run Hacker News Search programmatically and retrieve structured results via the Apify API. Replace `<YOUR_API_TOKEN>` with your [Apify API token](https://console.apify.com/settings/integrations).

#### Python

```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_API_TOKEN>")

run_input = {
    "query": "large language models",
    "searchType": "relevance",
    "tags": "story",
    "minPoints": 100,
    "includeAuthorProfile": True,
    "maxResults": 50,
}

run = client.actor("ytQ2q81fedyAGvCEJ").call(run_input=run_input)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']} — {item['points']} points — karma {item.get('authorKarma')} — {item['hnUrl']}")
```

#### JavaScript

```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "<YOUR_API_TOKEN>" });

const input = {
    query: "MyProduct",
    searchType: "date",
    alertOnNewOnly: true,
    alertWebhookUrl: "https://hooks.slack.com/services/...",
    maxResults: 100,
};

const run = await client.actor("ytQ2q81fedyAGvCEJ").call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.log(`${item.title} — ${item.points} pts — ${item.hnUrl}`);
});
```

#### cURL

```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ytQ2q81fedyAGvCEJ/runs?token=<YOUR_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "large language models",
    "searchType": "relevance",
    "tags": "story",
    "minPoints": 100,
    "maxResults": 50
  }'

# Retrieve results from the dataset (use defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?token=<YOUR_API_TOKEN>&format=json"
```

#### Slack webhook payload format

When `alertOnNewOnly: true` and `alertWebhookUrl` is set, the actor POSTs a Slack-compatible JSON payload to your webhook after each run that finds new mentions:

```json
{
    "text": "*3 new Hacker News mentions for \"MyProduct\"*\n• <https://news.ycombinator.com/item?id=…|Show HN: a faster MyProduct competitor> — 142 points, 38 comments\n…",
    "query": "MyProduct",
    "totalNew": 3,
    "items": [ /* up to 10 full result objects */ ]
}
```

This works as-is with Slack incoming webhooks, Discord webhooks, and any HTTP endpoint that accepts JSON. For Zapier or Make.com, paste your webhook URL into the field — the JSON body fires on every run with new mentions.

#### Integrations

- **Apify Schedules** — built-in cron scheduling for daily / hourly monitoring runs
- **Webhooks** — fire HTTP callbacks when a run finishes (separate from the alert webhook above)
- **Zapier / Make / n8n** — pipe results into 5,000+ apps via the alert webhook or the Apify integration
- **Google Sheets** — export the dataset directly to a sheet for collaborative review
- **Slack / Discord** — paste your incoming-webhook URL into `alertWebhookUrl` for native channel alerts
- **Python client** — official [Apify Python client](https://docs.apify.com/api/client/python)
- **JavaScript client** — official [Apify JS client](https://docs.apify.com/api/client/js)

***

### Use in Dify

Drop Hacker News Search into [Dify](https://docs.apify.com/platform/integrations/dify) workflows via the Apify plugin's Run Actor node. Each record comes back scored, classified, and paired with a recommended action as structured JSON that your downstream node can branch on: `recordType` enum (`result` / `thread_comment` / `thread_summary` / `trend` / `trend_summary` / `comparison_summary`), `signalScore` (0–100), `feedbackType` enum (`complaint` / `feature_request` / `praise` / `question`), `suggestedAction` enum (`engage` / `investigate` / `monitor` / `ignore`), and `trendStage` enum (`emerging` / `rising` / `peaked` / `declining`). The raw Algolia HN API returns posts; this returns prioritised developer signals.

- **Actor ID:** `ryanclinton/hackernews-search`
- **Sample input** (brand-monitoring with thread-summary mode):

```json
{
    "query": "your-product-name",
    "enrichTopThreads": true,
    "monitorStateKey": "brand-monitor-q2"
}
```

- **Branching example** — a Dify if/else node reads `recordType` first, then routes per-type:
  - `recordType = "result"` AND `suggestedAction = "engage"` → notify community-team Slack with the comment thread + `feedbackType` for context
  - `recordType = "result"` AND `feedbackType = "complaint"` AND `signalScore > 70` → page support team + create Zendesk ticket
  - `recordType = "result"` AND `feedbackType = "feature_request"` AND `signalScore > 60` → product backlog
  - `recordType = "trend"` AND `trendStage = "emerging"` → notify growth/marketing team
  - `recordType = "thread_summary"` → pipe directly to Slack as a daily digest
- **For developer feedback mining**: filter `recordType = "result"` AND `feedbackType IN ("complaint", "feature_request")` to surface only actionable signal — Dify routes complaints to support, feature requests to product
- **For trend detection**: filter `recordType = "trend_summary"` and read `top5ByPoints[]` — Dify alerts when a new topic enters the top-5 emerging trend list
- **Cross-run alerts**: pass `monitorStateKey` and the actor flags new high-signal threads since last run — Dify branches only on the deltas, not the full result set

The `suggestedAction` enum + `signalScore` make this drop-in for any Dify automation that needs to triage Hacker News mentions of a product, person, or technology — branch on `engage` for sales/community, `investigate` for product, `monitor` for marketing, `ignore` for archive.

***

### How it works

Hacker News Search uses the official Algolia HN Search API, which indexes the entire Hacker News archive in near real-time. The actor constructs API queries from your input, paginates with a 1-second polite delay, classifies each result by content type, optionally enriches via the HN Firebase API + the GitHub API, optionally diffs against a named key-value store for the brand monitor, and writes structured output to the dataset.

1. **Input validation** — reads input, clamps `maxResults` to 1–1,000 (or 1–10,000 with `autoSplitLargeQueries: true`), normalizes filters.
2. **Endpoint selection** — `relevance` → `hn.algolia.com/api/v1/search`; `date` → `/search_by_date`.
3. **Query construction** — builds the full URL with query, tag filters, numeric filters, and pagination.
4. **Paginated fetching with retry** — fetches 50 hits per page; retries 5xx and network errors up to 3 times with linear backoff; waits 1 second between pages.
5. **Type detection** — inspects the `_tags` array on each hit (priority: comment → poll → show\_hn → ask\_hn → story).
6. **Per-item enrichment** (when toggles are on):
   - `includeAuthorProfile` → fetch `hacker-news.firebaseio.com/v0/user/{username}.json` (cached per run)
   - `enrichGithubLinks` → match `github.com/owner/repo` URLs and fetch `api.github.com/repos/{owner}/{repo}` (cached per run)
   - `parseHiringComments` → run regex over the comment body to extract company / location / remote / apply URL
7. **Brand-monitor diff** (when `alertOnNewOnly: true`) — load prior run's IDs from the named key-value store `hackernews-search-monitor`, skip seen items, save the merged ID set (FIFO 10,000 cap) before exit.
8. **Output transformation** — every item normalized to the dataset schema with camelCase field names, null-safe values, and a constructed `hnUrl`.
9. **Webhook alert** (when `alertWebhookUrl` is set and new mentions exist) — POST a Slack-compatible payload.
10. **Show HN summary** (when `tags: "show_hn"`) — write the aggregate record to the `SHOW_HN_SUMMARY` key-value record.
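
Step 5's priority order can be sketched in a few lines. The `detect_type` helper and the fallback to `story` when no known tag is present are assumptions; the priority order itself is the one listed above.

```python
# Priority order from step 5: the first matching tag wins.
TYPE_PRIORITY = ("comment", "poll", "show_hn", "ask_hn", "story")


def detect_type(hit):
    """Classify an Algolia hit by checking `_tags` in priority order."""
    tags = set(hit.get("_tags", []))
    for item_type in TYPE_PRIORITY:
        if item_type in tags:
            return item_type
    return "story"  # assumed fallback when no known tag is present
```

The order matters because a Show HN hit typically carries both `story` and `show_hn` in `_tags`, so checking `story` first would hide the more specific type.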

```
                  Hacker News Search — Pipeline

  +-------------+     +---------------------+     +-------------------+
  | User Input  |---->| Query Construction  |---->| Algolia HN API    |
  | (14 fields) |     | tags + filters +    |     | /search or        |
  +-------------+     | dates + pagination  |     | /search_by_date   |
                      +---------------------+     +-------------------+
                                                            |
                                                            v
  +-------------+     +---------------------+     +-------------------+
  | Webhook +   |<----| Per-item Enrichment |<----| Pagination + retry|
  | Show HN     |     | + Brand-monitor     |     | (5xx + 429 backoff|
  | summary +   |     | dedup + Algolia     |     |  3 attempts)      |
  | Dataset     |     | _tags type detect   |     |                   |
  +-------------+     +---------------------+     +-------------------+
                                ^
                                | (optional)
                          +-----+-----+   +-------------------+
                          | HN user   |   | GitHub repo API   |
                          | Firebase  |   | (stars, lang,     |
                          | API       |   |  pushed_at)       |
                          +-----------+   +-------------------+
```
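
The brand-monitor diff in step 7 amounts to a set difference plus a capped, first-in-first-out ID list. A minimal sketch follows, assuming the stored state is an ordered list of `objectID` strings. The `diff_new_items` helper is hypothetical, and loading from and saving to the key-value store is left out.

```python
def diff_new_items(items, seen_ids, cap=10_000):
    """Return (new_items, updated_seen_ids), evicting the oldest IDs past `cap`."""
    seen = set(seen_ids)       # fast membership checks
    merged = list(seen_ids)    # preserves insertion order for FIFO eviction
    new_items = []
    for item in items:
        object_id = item["objectID"]
        if object_id in seen:
            continue
        seen.add(object_id)    # also guards against duplicates within one run
        new_items.append(item)
        merged.append(object_id)
    # Keep only the most recent `cap` IDs; the oldest fall off the front.
    return new_items, merged[-cap:]
```

Capping the stored list keeps the state record small on long-running monitors, at the cost that a very old item could be reported again if it resurfaces after its ID has been evicted.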

***

### Performance & cost

| Scenario | Results | Run time | PPE charges |
|----------|---------|---------|-------------|
| Quick test | 10 | ~3 s | $0.05005 |
| Default search | 100 | ~5 s | $0.50005 |
| Medium search | 250 | ~10 s | $1.25005 |
| Large search | 500 | ~15 s | $2.50005 |
| Maximum search | 1,000 | ~30 s | $5.00005 |
| Daily brand monitor (5 new) | 5 | ~3 s | $0.02505 / day ≈ $0.78 / month (31 days) |
| Show HN snapshot (100 + summary KV) | 100 | ~5 s | $0.50005 |

PPE charges = `apify-actor-start ($0.00005)` + `story-fetched × N ($0.005 each)`. Apify platform-compute charges are billed separately by Apify based on RAM-seconds (this actor defaults to 256 MB, the lightest workable tier for HN-scale traffic).
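
To estimate PPE charges for your own volumes, the formula above can be written as a small helper. This is a sketch: the `ppe_cost` function name is hypothetical, the two prices are the ones quoted above, and platform-compute charges are not included.

```python
from decimal import Decimal

START_FEE = Decimal("0.00005")   # apify-actor-start, charged once per run
PER_RESULT = Decimal("0.005")    # story-fetched, charged per result


def ppe_cost(results, runs=1):
    """Pay-per-event cost in USD for `results` total results over `runs` runs."""
    return runs * START_FEE + results * PER_RESULT


print(ppe_cost(100))                 # default search
print(ppe_cost(1000))                # maximum single-query search
print(ppe_cost(results=5 * 31, runs=31))  # daily monitor, 5 new items, 31 days
```

`Decimal` is used instead of floats so that sub-cent amounts add up exactly.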

The Algolia HN API is free. Author profile and GitHub enrichment use free public APIs — they don't add to your PPE bill, and the actor's circuit breakers stop calling them after 5 consecutive failures so a dead upstream never burns your time.

***

### Limitations

- **Maximum 1,000 results per query** — hard limit imposed by the Algolia HN API. For larger datasets, enable `autoSplitLargeQueries` (up to 10,000 results per run) or run multiple searches with non-overlapping date ranges.
- **50 results per API page** — pagination is automatic, but a 1,000-result search makes ~20 sequential API calls.
- **Algolia indexing delay** — very new posts (last few minutes) may not yet appear in search results.
- **Comment text is plain text** — HTML formatting from original HN comments is stripped by the Algolia API.
- **Author filter is case-sensitive** — usernames must match exactly as they appear on HN.
- **No Boolean query operators** — query is plain text. Algolia HN does not support `AND` / `OR` / `NOT` syntax.
- **Single author per run** — to track multiple authors, run the actor separately for each.
- **Date filtering granularity** — `dateFrom` is midnight UTC, `dateTo` is 23:59:59 UTC. Sub-day precision is not available.
- **Rate limiting enforced** — built-in 1-second delay between pages. Removing this is not recommended.
- **Who Is Hiring parser is best-effort** — comment formats vary; expect ~80% extraction accuracy on company/location, lower on remote-mode and apply-URL extraction. Always review before downstream automation.
- **GitHub enrichment unauthenticated rate limit is 60/hr** — set the `GITHUB_TOKEN` environment variable to raise it to 5,000/hr.
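
The date-granularity limitation above follows from how day-level inputs map onto Unix-second bounds. The sketch below shows one way to do that mapping and build an Algolia `numericFilters` value on the `created_at_i` field. The helper names are hypothetical, and whether the actor builds its filter exactly this way is an assumption.

```python
from datetime import datetime, timezone


def date_bounds(date_from, date_to):
    """Map YYYY-MM-DD inputs to inclusive Unix-second bounds in UTC."""
    start = datetime.strptime(date_from, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = datetime.strptime(date_to, "%Y-%m-%d").replace(
        hour=23, minute=59, second=59, tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())


def numeric_filters(date_from, date_to):
    """Build a `numericFilters` value covering the whole of both days."""
    start, end = date_bounds(date_from, date_to)
    return f"created_at_i>={start},created_at_i<={end}"
```

A single-day range such as `dateFrom = dateTo = "2026-01-01"` therefore spans 00:00:00 to 23:59:59 UTC of that day.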

***

### What this actor does NOT do

- **It does not crawl or render JavaScript.** It only calls the Algolia HN API and the HN Firebase + GitHub APIs. There is no browser, no scraping, no JS execution.
- **It does not detect official live front-page rank.** The actor computes velocity (`pointsPerHour`, `commentsPerHour`, `isTrending`) from each item's posting time, but the Algolia API doesn't expose live HN front-page position — for exact "currently #3 on HN" tracking, pair this actor with a Firebase-based front-page poller.
- **It does not run LLM-based sentiment or topic classification.** The `includeInsights: true` toggle adds heuristic sentiment + theme detection via keyword regex — it's deterministic, fast, and free, but it is not AI sentiment. For nuanced sentiment, plug the raw text into your own LLM pipeline.
- **It does not deduplicate near-duplicate submissions.** If the same article was posted three times by three users, you get three results — use `objectID` to dedupe at the application layer.
- **It does not bypass HN guidelines.** Public data, polite rate, attribution-friendly. If you publish derivative analysis, credit Hacker News (Y Combinator).
- **It does not aggregate Reddit, Lobsters, or other social platforms.** This actor is HN-focused. For multi-platform developer-community signal aggregation, see the planned "Developer Signal Monitor" actor.

If you need any of the above, see the **Related actors** table at the bottom for sibling tools, or open an issue on [the actor's GitHub](https://github.com/apify) — feature requests with concrete use cases regularly ship.

***

### Responsible use

This actor accesses publicly available data through the official Algolia HN Search API, which is provided specifically for programmatic access. Please use it responsibly:

- **Respect rate limits** — the actor enforces a 1-second delay between API pages.
- **Retrieve only what you need** — use filters and reasonable `maxResults` values.
- **Respect user privacy** — HN usernames and posts are public, but aggregating personal activity should be done thoughtfully and in compliance with GDPR / CCPA.
- **Attribute your sources** — credit Hacker News (Y Combinator) in any published analysis.
- **Review terms of service** — see [Hacker News guidelines](https://news.ycombinator.com/newsguidelines.html) and the [Algolia HN Search API documentation](https://hn.algolia.com/api).

***

### FAQ

**Q: Do I need an API key to use this actor?**
A: No. The Algolia HN Search API is free and open. No HN API key or authentication is required. You only need an Apify account.

**Q: How much does it cost?**
A: Pay-per-event: **$0.00005 per run start** + **$0.005 per result fetched**. A 100-result search costs about 50 cents. A daily brand monitor that finds 5 new mentions per day costs about 78 cents per month.

**Q: How do I set up daily Slack alerts for my brand on Hacker News?**
A: Set `alertOnNewOnly: true`, paste your Slack incoming-webhook URL into `alertWebhookUrl`, and schedule the actor to run daily via Apify Schedules. The first run primes the state and posts nothing; every subsequent run posts only mentions you haven't seen before.

**Q: Does it work with Discord webhooks?**
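A: Yes. The payload uses Slack's `text` field. Discord accepts Slack-format payloads on its Slack-compatible endpoint, so if a plain Discord webhook URL does not render the message, append `/slack` to it. You can also paste any HTTP endpoint that accepts JSON.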

**Q: How accurate is the "Who Is Hiring" parser?**
A: It uses deterministic regex over the comment body and is best-effort. Expect ~80% accuracy on company name and location, and lower on remote-mode and apply-URL extraction (formats vary wildly across the thread). Review the parsed columns before downstream automation. The raw `commentText` is always preserved so you can re-parse manually.

**Q: How far back does the data go?**
A: The Algolia index covers essentially the entire Hacker News archive, going back to 2007. Use `dateFrom` and `dateTo` to scope to any time period.

**Q: How do I get more than 1,000 results from a single query?**
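A: The Algolia API caps at 1,000 per query. Set `autoSplitLargeQueries: true` to have the actor split the date range automatically (up to 10,000 results per run), or split the search yourself across non-overlapping date ranges (e.g., one run per month) and concatenate the datasets.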

**Q: Can I run this on a schedule?**
A: Yes. Use Apify's built-in scheduling. Combine with `searchType: "date"` and `alertOnNewOnly: true` for a clean monitoring feed.

**Q: How do I raise the GitHub enrichment rate limit?**
A: Set the `GITHUB_TOKEN` environment variable in the actor's run options to a [GitHub personal access token](https://github.com/settings/tokens) with `public_repo` scope. The unauthenticated limit is 60/hr (per IP); authenticated is 5,000/hr (per token).

**Q: What happens when an enrichment API is down?**
A: The actor tracks consecutive failures separately for the HN Firebase API and the GitHub API. After 5 consecutive failures on either, that enrichment disables itself for the rest of the run. Main results keep flowing — you don't pay for a dead upstream.

**Q: Can I search for an exact phrase?**
A: Yes. Wrap your query in double quotes: `"machine learning"`.

**Q: How does trend detection work?**
A: With `detectTrends: true`, the actor runs two date-bounded searches — the current `trendWindowDays` window and the previous equal-length window. It tokenizes titles + story bodies + comments into 1/2/3-grams, filters out stop words and HN boilerplate, counts occurrences in each window, and computes growth percent. Terms below `trendMinMentions` or `trendMinGrowthPercent` are dropped. The remaining trends are scored (40% growth + 30% mentions + 20% avg signal + 10% unique authors) and the top `trendMaxTerms` are surfaced in `TREND_SUMMARY` (KV) and as `recordType: 'trend'` dataset records.

**Q: How does thread expansion work?**
A: For each story, Show HN, Ask HN, or poll result, the actor fetches that item from the HN Firebase API (`hacker-news.firebaseio.com/v0/item/{id}.json`) and BFS-walks its `kids` array up to `threadMaxDepth`. Each comment is emitted as a separate `recordType: 'thread_comment'` dataset record. The `threadMaxComments` cap is a single budget shared across all parents in the run, so total thread volume stays bounded however deep any one thread goes. Thread comments are bundled in the existing per-result charge, with no additional event.

**Q: What's the difference between `compareMode: previous_period` and `detectTrends`?**
A: `compareMode: previous_period` compares whatever you specified in `dateFrom`/`dateTo` against the equal-length window immediately before. `detectTrends` always uses *now* as the end of the current window and looks back `trendWindowDays`. Use compare for ad-hoc "this month vs last month" reports; use trends for "what's rising right now."

**Q: How does `discover` mode work without a query?**
A: Discover mode pre-applies `tags: front_page` + `searchType: date` + `detectTrends: true` + `includeInsights: true`. If you leave the query empty, you get the full HN front page; set a query to filter front-page items by topic. Schedule it daily for a "what's hot today on HN" feed.
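
One way to picture how a preset combines with explicit input is a plain dictionary merge in which the user's keys win. The `discover` values below are the ones listed in the answer above; the names `MODE_PRESETS` and `resolve_input` are hypothetical, and presets for the other modes are omitted because they are not documented here.

```python
MODE_PRESETS = {
    "search": {},
    "discover": {
        "tags": "front_page",
        "searchType": "date",
        "detectTrends": True,
        "includeInsights": True,
    },
}


def resolve_input(user_input: dict) -> dict:
    """Apply the mode preset first, then let explicit user fields override it."""
    preset = MODE_PRESETS.get(user_input.get("mode", "search"), {})
    return {**preset, **user_input}  # later keys win, so explicit input beats the preset


resolved = resolve_input({"mode": "discover", "query": "sqlite", "searchType": "relevance"})
print(resolved["tags"], resolved["searchType"])  # front_page relevance
```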

**Q: What does `whyThisMatters` look like in practice?**
A: It's a single sentence built deterministically from the result's other fields — examples: "High-signal mention from a high-influence author with trending velocity (18 pts/hr) discussing developer-experience and ai with positive reception." or "Moderate-signal mention from an experienced user discussing security with concerns raised (complaint)." Null on results below `signalScore: 25` to avoid noise.

**Q: How accurate is `feedbackType` classification?**
A: It's deterministic regex matching against curated keyword patterns: accurate enough to triage automatically, not accurate enough to ship to customers without review. Expect ~80% accuracy on clear-cut cases (`broken`, `wish you'd add X`, `love this`). For ambiguous mixed feedback, classification falls through to `null`, on the view that an honest absence is better than a confident wrong answer.

**Q: Why does the dataset overview view lead with `suggestedAction`?**
A: Because Apify Console previews are most users' first impression. Leading with `suggestedAction` + `signalLevel` + `whyThisMatters` means a customer can scan 5 rows and immediately see what to investigate, what to engage with, and what to ignore — without clicking into individual records or exporting to a spreadsheet.

**Q: Is the `includeInsights` sentiment AI-powered?**
A: No. It's heuristic: a curated bullish/bearish word list and a domain theme dictionary (performance, cost, developer-experience, security, reliability, scalability, open-source, AI, lock-in). Pure regex + keyword counting, deterministic, free of hallucinations, free at runtime. For nuanced sentiment, pipe the raw `commentText` into your own LLM pipeline.
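
For a sense of what keyword-count sentiment looks like, here is a toy sketch. The word lists and the tie-breaking rule for `mixed` are invented for illustration and are not the actor's curated lists; only the four output labels come from the input schema.

```python
import re

BULLISH = {"love", "great", "fast", "impressive", "solid"}       # illustrative only
BEARISH = {"broken", "slow", "expensive", "buggy", "abandoned"}  # illustrative only


def classify_sentiment(text: str) -> str:
    """Count bullish and bearish keywords and map the balance to a label."""
    words = re.findall(r"[a-z']+", text.lower())
    up = sum(w in BULLISH for w in words)
    down = sum(w in BEARISH for w in words)
    if up == 0 and down == 0:
        return "neutral"
    if up > 0 and down > 0 and abs(up - down) <= 1:
        return "mixed"  # assumption: near-even counts on both sides read as mixed
    return "bullish" if up > down else "bearish"


print(classify_sentiment("I love it, really fast"))             # bullish
print(classify_sentiment("Great API but the docs are broken"))  # mixed
print(classify_sentiment("Released version 2.0 today"))         # neutral
```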

**Q: Why are some output fields null?**
A: Fields like `title`, `url`, `storyText`, `commentText`, `parentId`, `storyId` are null when they don't apply to the content type. Comments have no `title`; stories have no `commentText`. Enrichment fields (`authorKarma`, `githubStars`, `hiringCompany`, etc.) are null when the corresponding toggle is off, when the data isn't available, or when the enrichment circuit-breaker has fired. Thread/trend-specific fields (`depth`, `text`, `term`, `growthPercent`, etc.) are only populated on `recordType: 'thread_comment'` and `recordType: 'trend'` records respectively.

**Q: How does the brand-monitor remember which IDs it has seen?**
A: The actor opens a named key-value store called `hackernews-search-monitor` and writes one record per query slug containing up to 10,000 prior `objectID`s in FIFO order. Each scheduled run loads the prior IDs, skips already-seen items, and saves the merged set on exit. Different queries get separate state, so you can run multiple monitors in parallel.
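
The load, skip, merge, and trim cycle can be sketched with an ordered dictionary acting as a FIFO set. The function name and the in-memory list standing in for the key-value record are assumptions; the 10,000-ID cap and the `objectID` key come from the answer above.

```python
from collections import OrderedDict

MAX_IDS = 10_000


def filter_new(results, seen_ids, max_ids=MAX_IDS):
    """Return (new results, updated ID list), dropping the oldest IDs beyond the cap."""
    seen = OrderedDict.fromkeys(seen_ids)
    fresh = []
    for result in results:
        object_id = result["objectID"]
        if object_id not in seen:
            fresh.append(result)
            seen[object_id] = None
    while len(seen) > max_ids:
        seen.popitem(last=False)  # FIFO: evict the oldest ID first
    return fresh, list(seen)


prior = ["101", "102"]
run_results = [{"objectID": "102"}, {"objectID": "103"}]
fresh, updated = filter_new(run_results, prior)
print([r["objectID"] for r in fresh], updated)  # ['103'] ['101', '102', '103']
```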

**Q: Does this actor scrape the news.ycombinator.com website?**
A: No. It only calls the Algolia HN Search API and (optionally) the HN Firebase + GitHub APIs. There is no browser automation, no HTML parsing, no rate-limit risk against HN itself.

**Q: Can I run this offline or self-hosted?**
A: Not as-is. The actor depends on the Apify platform for pay-per-event (PPE) billing, scheduling, and key-value state. The Algolia HN API itself is free, so you could replicate the search logic in any HTTP client, but brand-monitor state, scheduled alerts, and PPE billing are Apify-specific.

***

### Summary

Hacker News Intelligence is the most complete way to analyze Hacker News data without building your own pipeline. It is a developer sentiment monitoring tool, a Hacker News trend detection tool, and a social listening tool for developers — focused on high-signal discussions. It turns raw discussions into ranked, explainable, actionable insights, and tells you what to do about each one.

***

### Related actors

If you find Hacker News Search useful, check these related tools for the developer-community and web-monitoring stack:

| Actor | Description |
|-------|-------------|
| [Stack Overflow & StackExchange Search](https://apify.com/ryanclinton/stackexchange-search) | Search questions and answers across the entire StackExchange network |
| [GitHub Repository Search](https://apify.com/ryanclinton/github-repo-search) | Search GitHub repositories by keyword, language, stars, and more |
| [Bluesky Social Search](https://apify.com/ryanclinton/bluesky-social-search) | Search posts and profiles on the Bluesky social network |
| [Brand Protection Monitor](https://apify.com/ryanclinton/brand-protection-monitor) | Monitor brand mentions and potential infringements across the web |
| [Website Change Monitor](https://apify.com/ryanclinton/website-change-monitor) | Track changes on any website and get notified of updates |
| [Wayback Machine Search](https://apify.com/ryanclinton/wayback-machine-search) | Search the Internet Archive's Wayback Machine for historical snapshots |
| [CrossRef Paper Search](https://apify.com/ryanclinton/crossref-paper-search) | Search the academic literature via the CrossRef API |

# Actor input Schema

## `query` (type: `string`):

Search query to find on Hacker News

## `mode` (type: `string`):

Pre-configured workflow. `search` is the default flexible mode. The other modes set sensible defaults so common jobs are one-click — your explicit input fields always win over the preset.

## `outputLevel` (type: `string`):

Shorthand for enabling enrichment toggles. `basic` = raw search results. `enriched` = author profile + GitHub enrichment auto-on. `intelligence` = same as enriched (intelligence scoring runs on every result regardless).

## `searchType` (type: `string`):

Sort results by relevance or date

## `tags` (type: `string`):

Filter by content type (leave empty for all types)

## `author` (type: `string`):

Filter by author username (case-sensitive)

## `minPoints` (type: `integer`):

Minimum number of upvotes/points

## `minComments` (type: `integer`):

Minimum number of comments

## `dateFrom` (type: `string`):

Start date in YYYY-MM-DD format

## `dateTo` (type: `string`):

End date in YYYY-MM-DD format

## `maxResults` (type: `integer`):

Maximum number of deduplicated results to return. Default 100. Algolia HN caps at 1,000 hits per single query; values up to 10,000 are accepted but only useful when `autoSplitLargeQueries` is enabled (the actor warns if exceeded without it).

## `expandQuery` (type: `boolean`):

When the query matches a known short form (e.g. "AI", "ML", "k8s", "agents"), runs additional searches for the canonical synonyms and dedupes results by HN object ID. Triples the API calls when active — cap maxResults appropriately.
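
A rough sketch of the expand-then-dedupe idea follows. The synonym table is invented for illustration, and `search` is a hypothetical callable standing in for one search request; only deduplication by HN object ID comes from the description above.

```python
SYNONYMS = {  # illustrative entries, not the actor's real table
    "ai": ["artificial intelligence", "machine learning"],
    "k8s": ["kubernetes"],
}


def expand_and_search(query, search):
    """Search the original query plus any synonyms, keeping the first hit per objectID."""
    queries = [query] + SYNONYMS.get(query.strip().lower(), [])
    merged = {}
    for q in queries:
        for hit in search(q):
            merged.setdefault(hit["objectID"], hit)
    return list(merged.values())


# Fake search backend so the example runs offline.
fake_index = {
    "AI": [{"objectID": "1"}, {"objectID": "2"}],
    "artificial intelligence": [{"objectID": "2"}, {"objectID": "3"}],
    "machine learning": [{"objectID": "3"}, {"objectID": "4"}],
}
hits = expand_and_search("AI", lambda q: fake_index.get(q, []))
print([h["objectID"] for h in hits])  # ['1', '2', '3', '4']
```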

## `includeAuthorProfile` (type: `boolean`):

Adds karma, account age (days), submission count, and a 0–100 author influence score for every author. Uses the official HN Firebase API.

## `enrichGithubLinks` (type: `boolean`):

When a result links to a GitHub repository, adds star count, primary language, and last-push timestamp. Set the GITHUB\_TOKEN environment variable to raise the unauthenticated rate limit (60/hr) to 5,000/hr.

## `correlateGithub` (type: `boolean`):

Adds repository freshness (active / recent / stale / dormant), maturity (nascent / emerging / established / mature), and a composite signal tier (high / medium / low). Auto-enables `enrichGithubLinks`.

## `parseHiringComments` (type: `boolean`):

When fetching comments (combine with Content Type = Comments and a query like 'who is hiring'), extracts company, location, remote/on-site mode, and apply URL from each listing.

## `expandThreads` (type: `boolean`):

For each story, Show HN, Ask HN, or poll result, fetch the full reply tree from the HN Firebase API and emit each comment as a separate `recordType: 'thread_comment'` dataset record. Bundled in the existing per-result charge (no extra event).

## `threadMaxDepth` (type: `integer`):

Maximum reply depth to walk (1 = direct replies only).

## `threadMaxComments` (type: `integer`):

Hard cap on total thread\_comment records emitted across all parents in this run, to keep dataset size bounded.

## `includeInsights` (type: `boolean`):

Adds an `insightSummary` string, `sentiment` (bullish / bearish / mixed / neutral), `riskLevel` (high / medium / low), and `keyThemes` array per result. Pure regex + keyword matching — no LLM, no external API.

## `detectTrends` (type: `boolean`):

Run two date-bounded searches (current `trendWindowDays` window vs the previous equal-length window), extract 1/2/3-grams from titles + story bodies + comments, and compute growth rates. Writes a `TREND_SUMMARY` key-value record and pushes top trends as `recordType: 'trend'` dataset records.

## `trendWindowDays` (type: `integer`):

Length of the comparison window for trend detection. Each side uses this many days (current vs previous).

## `trendMinMentions` (type: `integer`):

Minimum mention count in the current window for a term to be considered a trend.

## `trendMinGrowthPercent` (type: `integer`):

Minimum growth percentage versus the baseline window (e.g. 100 = at least doubled).

## `trendMaxTerms` (type: `integer`):

Maximum number of trending terms to surface.

## `compareMode` (type: `string`):

Run two searches and compute deltas. `none` (default) skips comparison. `previous_period` uses dateFrom/dateTo as period A and shifts back by the same length for period B. `explicit` uses the four `compareDateFromA/ToA/FromB/ToB` inputs.

## `compareDateFromA` (type: `string`):

Start date for period A (YYYY-MM-DD). Used when compareMode = explicit.

## `compareDateToA` (type: `string`):

End date for period A (YYYY-MM-DD).

## `compareDateFromB` (type: `string`):

Start date for period B (YYYY-MM-DD).

## `compareDateToB` (type: `string`):

End date for period B (YYYY-MM-DD).

## `autoSplitLargeQueries` (type: `boolean`):

Adaptively halve the date range when an Algolia query would exceed 900 hits, fetching each bucket separately and deduping by HN object ID. Requires both dateFrom and dateTo. Capped by `maxSplitRuns` to prevent runaway pagination.
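
The halving strategy can be sketched as below. `count_hits` is a hypothetical callable that reports how many hits a date range would return (for example, a hit count read from the search response), and the exact way the bucket budget is enforced is an assumption; the 900-hit threshold and the bucket cap come from the description above.

```python
from datetime import date, timedelta


def split_ranges(start, end, count_hits, limit=900, max_runs=20):
    """Halve date ranges until each fits under `limit` or the bucket budget runs out."""
    pending = [(start, end)]
    buckets = []
    while pending:
        lo, hi = pending.pop()
        # Splitting turns one range into two, so check the budget before doing it.
        can_split = lo < hi and len(buckets) + len(pending) + 2 <= max_runs
        if can_split and count_hits(lo, hi) > limit:
            mid = lo + timedelta(days=(hi - lo).days // 2)
            pending.append((mid + timedelta(days=1), hi))
            pending.append((lo, mid))  # earlier half is processed first
        else:
            buckets.append((lo, hi))
    return buckets


def fake_hits(lo, hi):
    """Pretend every day yields 100 hits."""
    return ((hi - lo).days + 1) * 100


buckets = split_ranges(date(2024, 1, 1), date(2024, 1, 31), fake_hits)
for lo, hi in buckets:
    print(lo, hi, fake_hits(lo, hi))
```

With 100 hits per day, the 31-day range ends up as four buckets of 7 or 8 days each, all below the 900-hit threshold.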

## `maxSplitRuns` (type: `integer`):

Maximum number of date-range buckets to fetch when auto-split is active.

## `alertOnNewOnly` (type: `boolean`):

Tracks the IDs seen on previous runs of this query and only outputs items that are new since the last run. Use this with an Apify schedule to build a daily brand-mention or competitor monitor.

## `alertWebhookUrl` (type: `string`):

When set together with 'Alert on new results only', POSTs a Slack-compatible JSON payload to this URL after each run. Works with Slack incoming webhooks, Discord, or any HTTP endpoint. Treat as a secret — Slack/Discord webhook URLs grant posting access to a channel.

## `alertMode` (type: `string`):

`all` posts every new mention to the webhook. `smart` filters the webhook to mentions with signal score ≥ 50.
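
As an illustration of the two modes, the sketch below filters new items and builds a payload with a single `text` field, the simplest shape Slack incoming webhooks accept. The function name and message wording are hypothetical, and the actor's real payload may differ; only the `signalScore` threshold of 50 comes from the description above.

```python
def build_alert(new_items, alert_mode="all", threshold=50):
    """Return a Slack-style payload for the new items, or None if nothing qualifies."""
    if alert_mode == "smart":
        new_items = [i for i in new_items if (i.get("signalScore") or 0) >= threshold]
    if not new_items:
        return None
    lines = [
        f"- {i.get('title') or i.get('commentText')} (signal {i.get('signalScore')})"
        for i in new_items
    ]
    return {"text": f"{len(new_items)} new Hacker News mention(s):\n" + "\n".join(lines)}


items = [
    {"title": "Show HN: Acme CLI", "signalScore": 80},
    {"title": "Acme pricing thread", "signalScore": 20},
]
print(build_alert(items, alert_mode="smart")["text"])
```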

## Actor input object example

```json
{
  "query": "artificial intelligence",
  "mode": "search",
  "outputLevel": "basic",
  "searchType": "relevance",
  "maxResults": 100,
  "expandQuery": false,
  "includeAuthorProfile": false,
  "enrichGithubLinks": false,
  "correlateGithub": false,
  "parseHiringComments": false,
  "expandThreads": false,
  "threadMaxDepth": 3,
  "threadMaxComments": 100,
  "includeInsights": false,
  "detectTrends": false,
  "trendWindowDays": 7,
  "trendMinMentions": 3,
  "trendMinGrowthPercent": 100,
  "trendMaxTerms": 50,
  "compareMode": "none",
  "autoSplitLargeQueries": false,
  "maxSplitRuns": 20,
  "alertOnNewOnly": false,
  "alertMode": "all"
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "query": "artificial intelligence"
};

// Run the Actor and wait for it to finish
const run = await client.actor("ryanclinton/hackernews-search").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "query": "artificial intelligence" }

# Run the Actor and wait for it to finish
run = client.actor("ryanclinton/hackernews-search").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "query": "artificial intelligence"
}' |
apify call ryanclinton/hackernews-search --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ryanclinton/hackernews-search",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Hacker News Search — Stories, Comments & Developer Sentiment",
        "description": "Search and extract stories, comments, polls, Show HN, and Ask HN posts from Hacker News. This actor uses the Algolia HN Search API to find content by keyword, filter by author, date range, minimum points, and comment count -- then returns clean, structured JSON ready for analysis, monitoring, or ...",
        "version": "1.1",
        "x-build-id": "CbmmrY8GtlcaNz7DX"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ryanclinton~hackernews-search/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ryanclinton-hackernews-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ryanclinton~hackernews-search/runs": {
            "post": {
                "operationId": "runs-sync-ryanclinton-hackernews-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ryanclinton~hackernews-search/run-sync": {
            "post": {
                "operationId": "run-sync-ryanclinton-hackernews-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "query"
                ],
                "properties": {
                    "query": {
                        "title": "Search Query",
                        "type": "string",
                        "description": "Search query to find on Hacker News",
                        "default": "artificial intelligence"
                    },
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "search",
                            "discover",
                            "brand_monitor",
                            "competitor_tracking",
                            "hiring_intelligence",
                            "show_hn_analysis"
                        ],
                        "type": "string",
                        "description": "Pre-configured workflow. `search` is the default flexible mode. The other modes set sensible defaults so common jobs are one-click — your explicit input fields always win over the preset.",
                        "default": "search"
                    },
                    "outputLevel": {
                        "title": "Output Level",
                        "enum": [
                            "basic",
                            "enriched",
                            "intelligence"
                        ],
                        "type": "string",
                        "description": "Shorthand for enabling enrichment toggles. `basic` = raw search results. `enriched` = author profile + GitHub enrichment auto-on. `intelligence` = same as enriched (intelligence scoring runs on every result regardless).",
                        "default": "basic"
                    },
                    "searchType": {
                        "title": "Search Type",
                        "enum": [
                            "relevance",
                            "date"
                        ],
                        "type": "string",
                        "description": "Sort results by relevance or date",
                        "default": "relevance"
                    },
                    "tags": {
                        "title": "Content Type",
                        "enum": [
                            "",
                            "story",
                            "comment",
                            "poll",
                            "show_hn",
                            "ask_hn",
                            "front_page"
                        ],
                        "type": "string",
                        "description": "Filter by content type (leave empty for all types)"
                    },
                    "author": {
                        "title": "Author",
                        "type": "string",
                        "description": "Filter by author username (case-sensitive)"
                    },
                    "minPoints": {
                        "title": "Minimum Points",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Minimum number of upvotes/points"
                    },
                    "minComments": {
                        "title": "Minimum Comments",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Minimum number of comments"
                    },
                    "dateFrom": {
                        "title": "Date From",
                        "type": "string",
                        "description": "Start date in YYYY-MM-DD format"
                    },
                    "dateTo": {
                        "title": "Date To",
                        "type": "string",
                        "description": "End date in YYYY-MM-DD format"
                    },
                    "maxResults": {
                        "title": "Maximum Results",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of deduplicated results to return. Default 100. Algolia HN caps at 1,000 hits per single query; values up to 10,000 are accepted but only useful when `autoSplitLargeQueries` is enabled (the actor warns if exceeded without it).",
                        "default": 100
                    },
                    "expandQuery": {
                        "title": "Expand short queries to synonyms",
                        "type": "boolean",
                        "description": "When the query matches a known short form (e.g. \"AI\", \"ML\", \"k8s\", \"agents\"), runs additional searches for the canonical synonyms and dedupes results by HN object ID. Triples the API calls when active — cap maxResults appropriately.",
                        "default": false
                    },
                    "includeAuthorProfile": {
                        "title": "Enrich author profiles",
                        "type": "boolean",
                        "description": "Adds karma, account age (days), submission count, AND a 0–100 author influence score for every author. Uses the official HN Firebase API.",
                        "default": false
                    },
                    "enrichGithubLinks": {
                        "title": "Enrich GitHub links",
                        "type": "boolean",
                        "description": "When a result links to a GitHub repository, adds star count, primary language, and last-push timestamp. Set the GITHUB_TOKEN environment variable to raise the unauthenticated rate limit (60/hr) to 5,000/hr.",
                        "default": false
                    },
                    "correlateGithub": {
                        "title": "GitHub correlation (freshness, maturity, signal)",
                        "type": "boolean",
                        "description": "Adds repository freshness (active / recent / stale / dormant), maturity (nascent / emerging / established / mature), and a composite signal tier (high / medium / low). Auto-enables `enrichGithubLinks`.",
                        "default": false
                    },
                    "parseHiringComments": {
                        "title": "Parse Who Is Hiring comments",
                        "type": "boolean",
                        "description": "When fetching comments (combine with Content Type = Comments and a query like 'who is hiring'), extracts company, location, remote/on-site mode, and apply URL from each listing.",
                        "default": false
                    },
                    "expandThreads": {
                        "title": "Expand comment threads",
                        "type": "boolean",
                        "description": "For each story, Show HN, Ask HN, or poll result, fetch the full reply tree from the HN Firebase API and emit each comment as a separate `recordType: 'thread_comment'` dataset record. Bundled in the existing per-result charge (no extra event).",
                        "default": false
                    },
                    "threadMaxDepth": {
                        "title": "Thread Max Depth",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Maximum reply depth to walk (1 = direct replies only).",
                        "default": 3
                    },
                    "threadMaxComments": {
                        "title": "Thread Max Comments per Run",
                        "minimum": 1,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Hard cap on total thread_comment records emitted across all parents in this run, to keep dataset size bounded.",
                        "default": 100
                    },
                    "includeInsights": {
                        "title": "Heuristic insights (sentiment, themes, risk)",
                        "type": "boolean",
                        "description": "Adds an `insightSummary` string, `sentiment` (bullish / bearish / mixed / neutral), `riskLevel` (high / medium / low), and `keyThemes` array per result. Pure regex + keyword matching — no LLM, no external API.",
                        "default": false
                    },
                    "detectTrends": {
                        "title": "Detect trending keywords",
                        "type": "boolean",
                        "description": "Run two date-bounded searches (current `trendWindowDays` window vs the previous equal-length window), extract 1/2/3-grams from titles + story bodies + comments, and compute growth rates. Writes a `TREND_SUMMARY` key-value record and pushes top trends as `recordType: 'trend'` dataset records.",
                        "default": false
                    },
                    "trendWindowDays": {
                        "title": "Trend Window (days)",
                        "minimum": 1,
                        "maximum": 90,
                        "type": "integer",
                        "description": "Length of the comparison window for trend detection. Each side uses this many days (current vs previous).",
                        "default": 7
                    },
                    "trendMinMentions": {
                        "title": "Trend Min Mentions",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Minimum mention count in the current window for a term to be considered a trend.",
                        "default": 3
                    },
                    "trendMinGrowthPercent": {
                        "title": "Trend Min Growth %",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Minimum growth percentage versus the baseline window (e.g. 100 = at least doubled).",
                        "default": 100
                    },
                    "trendMaxTerms": {
                        "title": "Trend Max Terms",
                        "minimum": 1,
                        "maximum": 500,
                        "type": "integer",
                        "description": "Maximum number of trending terms to surface.",
                        "default": 50
                    },
                    "compareMode": {
                        "title": "Compare Mode",
                        "enum": [
                            "none",
                            "previous_period",
                            "explicit"
                        ],
                        "type": "string",
                        "description": "Run two searches and compute deltas. `none` (default) skips comparison. `previous_period` uses dateFrom/dateTo as period A and shifts back by the same length for period B. `explicit` uses the four `compareDateFromA/ToA/FromB/ToB` inputs.",
                        "default": "none"
                    },
                    "compareDateFromA": {
                        "title": "Period A: Date From",
                        "type": "string",
                        "description": "Start date for period A (YYYY-MM-DD). Used when compareMode = explicit."
                    },
                    "compareDateToA": {
                        "title": "Period A: Date To",
                        "type": "string",
                        "description": "End date for period A (YYYY-MM-DD)."
                    },
                    "compareDateFromB": {
                        "title": "Period B: Date From",
                        "type": "string",
                        "description": "Start date for period B (YYYY-MM-DD)."
                    },
                    "compareDateToB": {
                        "title": "Period B: Date To",
                        "type": "string",
                        "description": "End date for period B (YYYY-MM-DD)."
                    },
                    "autoSplitLargeQueries": {
                        "title": "Auto-split queries beyond 1,000 hits",
                        "type": "boolean",
                        "description": "Adaptively halve the date range when an Algolia query would exceed 900 hits, fetching each bucket separately and deduping by HN object ID. Requires both dateFrom and dateTo. Capped by `maxSplitRuns` to prevent runaway pagination.",
                        "default": false
                    },
                    "maxSplitRuns": {
                        "title": "Max Split Runs",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of date-range buckets to fetch when auto-split is active.",
                        "default": 20
                    },
                    "alertOnNewOnly": {
                        "title": "Alert on new results only (for scheduled runs)",
                        "type": "boolean",
                        "description": "Tracks the IDs seen on previous runs of this query and only outputs items that are new since the last run. Use this with an Apify schedule to build a daily brand-mention or competitor monitor.",
                        "default": false
                    },
                    "alertWebhookUrl": {
                        "title": "Slack / webhook URL for alerts",
                        "type": "string",
                        "description": "When set together with 'Alert on new results only', POSTs a Slack-compatible JSON payload to this URL after each run. Works with Slack incoming webhooks, Discord, or any HTTP endpoint. Treat as a secret — Slack/Discord webhook URLs grant posting access to a channel."
                    },
                    "alertMode": {
                        "title": "Alert Mode",
                        "enum": [
                            "all",
                            "smart"
                        ],
                        "type": "string",
                        "description": "`all` posts every new mention to the webhook. `smart` filters the webhook to mentions with signal score ≥ 50.",
                        "default": "all"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
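
The `autoSplitLargeQueries` and `maxSplitRuns` inputs in the schema above describe adaptive halving of a date range. The Python sketch below shows one way that idea could work; it is not the Actor's actual implementation. The `count_hits` callback, the helper names, and the use of an `objectID` key for deduplication are assumptions made for illustration. Only the 900-hit threshold and the default cap of 20 come from the schema.

```python
from datetime import date, timedelta
from typing import Callable, Iterable

HIT_THRESHOLD = 900  # threshold quoted in the autoSplitLargeQueries description


def split_date_range(
    start: date,
    end: date,
    count_hits: Callable[[date, date], int],  # hypothetical: hit count for a range
    max_split_runs: int = 20,
) -> list[tuple[date, date]]:
    """Halve [start, end] until every bucket is under the threshold or the cap is reached."""
    pending = [(start, end)]
    buckets: list[tuple[date, date]] = []
    while pending:
        lo, hi = pending.pop()
        # Splitting replaces one range with two, so check the resulting total first.
        within_cap = len(buckets) + len(pending) + 2 <= max_split_runs
        if lo < hi and within_cap and count_hits(lo, hi) > HIT_THRESHOLD:
            mid = lo + timedelta(days=(hi - lo).days // 2)
            pending.append((mid + timedelta(days=1), hi))
            pending.append((lo, mid))
        else:
            buckets.append((lo, hi))
    return sorted(buckets)


def dedupe_by_object_id(hits: Iterable[dict]) -> list[dict]:
    """Keep the first occurrence of each hit, keyed on an assumed `objectID` field."""
    seen: set[str] = set()
    unique: list[dict] = []
    for hit in hits:
        if hit["objectID"] not in seen:
            seen.add(hit["objectID"])
            unique.append(hit)
    return unique
```

In this sketch a bucket that still exceeds the threshold is kept as-is once the `max_split_runs` cap is reached, so the cap limits the number of fetches at the cost of possibly truncated buckets.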

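The `previous_period` compare mode is described in the schema as shifting the `dateFrom`/`dateTo` window back by its own length. The sketch below shows that arithmetic in Python. It assumes both dates are inclusive, which the schema does not state, and the function names and the shape of the `delta` result are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def previous_period(date_from: date, date_to: date) -> tuple[date, date]:
    """Return period B: the window of equal length that ends right before period A."""
    # Assumption: both bounds are inclusive, so a Jan 8 to Jan 14 window is 7 days long.
    length = (date_to - date_from) + timedelta(days=1)
    return date_from - length, date_to - length


def delta(current: float, previous: float) -> dict[str, Optional[float]]:
    """Absolute and percentage change from period B (previous) to period A (current)."""
    change = current - previous
    percent = None if previous == 0 else round(100 * change / previous, 1)
    return {"absolute": change, "percent": percent}


period_b = previous_period(date(2025, 1, 8), date(2025, 1, 14))
print(period_b)       # (datetime.date(2025, 1, 1), datetime.date(2025, 1, 7))
print(delta(30, 20))  # {'absolute': 10, 'percent': 50.0}
```

If the Actor treats `dateTo` as exclusive, drop the extra day from `length`; either way the two windows stay the same size, so the deltas compare like with like.
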