# Korea’s Open Government Data Portal (data.go.kr) (`parseforge/data-go-kr-scraper`) Actor

Extract comprehensive dataset listings from Korea’s Open Government Data Portal (data.go.kr), including metadata, descriptions, API details, and organization info. Supports filtering by dataset type, organization, keywords, and other parameters, enabling automated access to 50,000+ datasets.

- **URL**: https://apify.com/parseforge/data-go-kr-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Automation, Other
- **Stats:** 2 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price per event drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://github.com/ParseForge/apify-assets/blob/ad35ccc13ddd068b9d6cba33f323962e39aed5b2/banner.jpg?raw=true)

## 🇰🇷 Korean Data Portal Scraper

> 🚀 **Extract dataset metadata from South Korea's official open data portal in minutes.** Search by keyword, filter by type and recommendation status. No coding, no authentication required.

> 🕒 **Last updated:** 2026-04-23 · **📊 20 fields** · **📂 50,000+ public datasets** · **🔍 Keyword and URL-based search**


<table><tr>
<td style="border-left:4px solid #0F766E;padding:12px 16px;font-weight:600">Pull structured records from Korea's Open Government Data Portal: clean fields ready as CSV, JSON, JSONL, Excel, or XML for downstream pipelines.</td>
</tr></table>

<table>
<tr>
<td colspan="3" style="padding:10px 14px;background:#0F766E;border:none;border-radius:4px 4px 0 0">
<span style="color:#FFFFFF;font-size:14px;font-weight:700;letter-spacing:0.5px">Related Scrapers</span>
</td>
</tr>
<tr>
<td style="padding:10px 14px;border:1px solid #E7E5E4;border-top:none;vertical-align:top;width:33%;background:#CCFBF1">
&nbsp;<a href="https://apify.com/parseforge/data-go-kr-scraper" style="color:#0F766E;text-decoration:none;font-weight:700;font-size:13px">Korea data.go.kr</a><br>
<span style="color:#0F766E;font-size:11px;font-weight:600">➸ You are here</span>
</td>
<td style="padding:10px 14px;border:1px solid #E7E5E4;border-top:none;vertical-align:top;width:33%">
&nbsp;<a href="https://apify.com/parseforge/data-gov-scraper" style="color:#1C1917;text-decoration:none;font-weight:700;font-size:13px">USA data.gov</a><br>
<span style="color:#78716C;font-size:11px">US federal open data</span>
</td>
<td style="padding:10px 14px;border:1px solid #E7E5E4;border-top:none;vertical-align:top;width:33%">
&nbsp;<a href="https://apify.com/parseforge/data-gov-uk-scraper" style="color:#1C1917;text-decoration:none;font-weight:700;font-size:13px">UK data.gov.uk</a><br>
<span style="color:#78716C;font-size:11px">UK open gov data</span>
</td>
</tr>
</table>

##### Copy to your AI assistant

Copy this block into ChatGPT, Claude, Cursor, or any LLM to start using this actor.

````

parseforge/data-go-kr-scraper on Apify. Call: ApifyClient("TOKEN").actor("parseforge/data-go-kr-scraper").call(run_input={...}), then client.dataset(run["defaultDatasetId"]).list_items().items for results. Key inputs: startUrl (string), maxItems (integer, default 10), keyword (string, default "weather"), svcType (string), recmSe (string), conditionType (string). Full Actor spec: fetch build via GET https://api.apify.com/v2/acts/parseforge~data-go-kr-scraper (Bearer TOKEN). Get token: https://console.apify.com/account/integrations

````

South Korea's data.go.kr is the country's official government open data portal, hosting over **50,000 public datasets** from hundreds of government agencies. This scraper collects structured dataset metadata including titles, descriptions, organization names, file formats, download URLs, API access links, view counts, download statistics, and contact information. It supports keyword search, direct URL scraping, and filtering by dataset type (FILE, API, STD, LINKED).

Whether you are a researcher analyzing Korean government data availability, a business looking for market data, or a developer building applications with Korean public datasets, this actor delivers structured metadata for up to **1,000,000 datasets per run** for paid users. Every record includes the dataset title, publishing organization, description, creation and update dates, file formats, download links, and usage metrics. The data exports as JSON, CSV, or Excel.

| 🎯 Target Audience | 💡 Use Cases |
|---|---|
| Data researchers | Discover Korean government datasets by topic |
| Business analysts | Find economic and market data from Korean agencies |
| Developers | Locate API-based datasets for application integration |
| Journalists | Access public data for investigative reporting |
| Urban planners | Find geographic and demographic data for Korea |
| Academic researchers | Build bibliographies of Korean government data sources |

---

### 📋 What the Korean Data Portal Scraper does

- 🔍 **Keyword search** across dataset titles and descriptions in Korean and English
- 🔗 **Direct URL scraping** from any data.go.kr search results page
- 📂 **Dataset type filtering** for downloadable files, APIs, standards, and linked data
- ⭐ **Recommendation filtering** to show only portal-recommended datasets
- 📊 **Usage metrics** including view counts and download statistics
- 📥 **Download URLs** and API access links for direct data retrieval

The scraper queries the data.go.kr portal with your search parameters, retrieves matching dataset listings, and extracts full metadata for each entry. Results include dataset IDs, titles, descriptions, publishing organizations, dataset types, creation and update dates, file formats, download links, API documentation, categories, tags, contact information, license terms, and usage statistics. Each record is timestamped.

> 💡 **Why it matters:** Browsing data.go.kr manually is slow, especially if you need to survey datasets across multiple topics or agencies. This scraper automates discovery and delivers structured metadata ready for cataloging, analysis, or application development.

---

### 🎬 Full Demo

_🚧 Coming soon..._

---

### ⚙️ Input

<table>
<tr><th>Field</th><th>Type</th><th>Required</th><th>Description</th></tr>
<tr><td><b>startUrl</b></td><td>string</td><td>No</td><td>Direct URL from data.go.kr search results</td></tr>
<tr><td><b>maxItems</b></td><td>integer</td><td>No</td><td>Max datasets to collect. Free: up to 10. Paid: up to 1,000,000</td></tr>
<tr><td><b>keyword</b></td><td>string</td><td>No</td><td>Search term for datasets (e.g., "weather", "population")</td></tr>
<tr><td><b>svcType</b></td><td>string</td><td>No</td><td>Dataset type: FILE, API, STD, or LINKED</td></tr>
<tr><td><b>recmSe</b></td><td>string</td><td>No</td><td>Recommendation filter: Y (recommended only) or N (all)</td></tr>
<tr><td><b>conditionType</b></td><td>string</td><td>No</td><td>Search condition: init (default) or search</td></tr>
<tr><td><b>kwrdArray</b></td><td>string</td><td>No</td><td>Comma-separated keywords for advanced search</td></tr>
</table>

**Example 1: Basic keyword search**
```json
{
  "keyword": "weather",
  "maxItems": 10
}
````

**Example 2: Filtered search for downloadable file datasets**

```json
{
  "keyword": "population",
  "svcType": "FILE",
  "recmSe": "Y",
  "maxItems": 50
}
```

> ⚠️ **Good to Know:** Use either a Start URL or search filters, not both. If you provide a Start URL, search filters are ignored. The keyword search works with both Korean and English terms, though Korean terms typically return more results.
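
The input rules above can be checked before a run is started. The following is a minimal sketch, not part of the Actor: `build_input` and its parameter names are hypothetical helpers, and the allowed values are taken from the input table in this README. The Actor may apply its own validation on top of this.

```python
from typing import Optional

# Allowed values as listed in the input table of this README.
VALID_SVC_TYPES = {"FILE", "API", "STD", "LINKED"}
VALID_RECM_SE = {"Y", "N"}
VALID_CONDITION_TYPES = {"init", "search"}
MAX_ITEMS_LIMIT = 1_000_000  # documented upper bound for paid users


def build_input(
    start_url: Optional[str] = None,
    keyword: Optional[str] = None,
    svc_type: Optional[str] = None,
    recm_se: Optional[str] = None,
    condition_type: Optional[str] = None,
    keywords: Optional[list[str]] = None,
    max_items: int = 10,
) -> dict:
    """Hypothetical helper: build an Actor input dict and reject invalid combinations."""
    filters = {
        "keyword": keyword,
        "svcType": svc_type,
        "recmSe": recm_se,
        "conditionType": condition_type,
        # kwrdArray is documented as a comma-separated string.
        "kwrdArray": ",".join(keywords) if keywords else None,
    }
    filters = {name: value for name, value in filters.items() if value is not None}

    # The README asks for either a Start URL or search filters, not both.
    if start_url and filters:
        raise ValueError("Use either startUrl or search filters, not both")

    checks = (
        ("svcType", VALID_SVC_TYPES),
        ("recmSe", VALID_RECM_SE),
        ("conditionType", VALID_CONDITION_TYPES),
    )
    for name, allowed in checks:
        if name in filters and filters[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")

    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")

    actor_input = {"maxItems": max_items, **filters}
    if start_url:
        actor_input["startUrl"] = start_url
    return actor_input


print(build_input(keyword="population", svc_type="FILE", recm_se="Y", max_items=50))
```

The returned dict can be passed as `run_input` to the Python client. Failing early on a mixed Start URL and filter input avoids a run whose filters would be silently ignored.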

***

### 📊 Output

#### 🧾 Schema

| Emoji | Field | Type | Description |
|---|---|---|---|
| 🆔 | datasetId | string | Unique dataset identifier |
| 📝 | title | string | Official dataset title |
| 🏢 | organization | string | Publishing government agency |
| 📋 | description | string | Dataset description |
| 📊 | datasetType | string | Dataset type (FILE, API, STD, LINKED) |
| 📅 | createdDate | string | When the dataset was first published |
| 📅 | lastUpdated | string | Most recent update date |
| 📈 | viewCount | number | Total page views |
| 📈 | downloadCount | number | Total downloads |
| 🏷️ | categories | array | Topic classifications |
| 🏷️ | tags | array | Keyword tags |
| 🔗 | datasetUrl | string | Direct link to the dataset page |
| 📥 | downloadUrl | string | File download URL |
| 💾 | fileFormat | string | Available file format |
| 📦 | fileSize | string | File size |
| 🔗 | dataAccessLink | string | API or data access endpoint |
| 📖 | documentation | string | API documentation link |
| 👤 | contactInfo | string | Responsible agency contact |
| ⚖️ | licenseInfo | string | Data usage license terms |
| ⚠️ | error | string | Error message if processing failed |
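
Because a record may carry an `error` field instead of full metadata, it can help to separate failed records before analysis. The sketch below assumes the field names from the schema table; `split_records` and `count_by_type` are hypothetical helpers, and the third item is an invented failed record used only for illustration.

```python
from collections import Counter
from typing import Any

Record = dict[str, Any]


def split_records(items: list[Record]) -> tuple[list[Record], list[Record]]:
    """Separate usable records from records that carry an error message."""
    ok: list[Record] = []
    failed: list[Record] = []
    for item in items:
        (failed if item.get("error") else ok).append(item)
    return ok, failed


def count_by_type(items: list[Record]) -> Counter:
    """Count records per datasetType; records without a type count as UNKNOWN."""
    return Counter(item.get("datasetType", "UNKNOWN") for item in items)


items = [
    {"datasetId": "15000123", "datasetType": "FILE", "viewCount": 45230},
    {"datasetId": "15000456", "datasetType": "API", "viewCount": 28900},
    {"datasetId": "15000999", "error": "Request timed out"},  # hypothetical failed record
]

ok, failed = split_records(items)
print(f"{len(ok)} usable, {len(failed)} failed")
print(count_by_type(ok))
```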

#### 📦 Sample records

<details>
<summary>📄 Weather dataset (FILE type)</summary>

```json
{
  "datasetId": "15000123",
  "title": "Daily Weather Observation Data",
  "organization": "Korea Meteorological Administration",
  "description": "Daily temperature, precipitation, and wind data from weather stations across South Korea",
  "datasetType": "FILE",
  "createdDate": "2019-03-15",
  "lastUpdated": "2026-04-01",
  "viewCount": 45230,
  "downloadCount": 12450,
  "categories": ["Weather", "Environment"],
  "tags": ["temperature", "precipitation", "climate"],
  "downloadUrl": "https://data.go.kr/download/15000123/weather_daily.csv",
  "fileFormat": "CSV",
  "fileSize": "245MB",
  "licenseInfo": "Public Domain",
  "scrapedAt": "2026-04-16T12:00:00.000Z"
}
```

</details>

<details>
<summary>📄 Population dataset (API type)</summary>

```json
{
  "datasetId": "15000456",
  "title": "Population Statistics by Region",
  "organization": "Statistics Korea",
  "description": "Monthly population data by city, province, and district",
  "datasetType": "API",
  "createdDate": "2020-01-10",
  "lastUpdated": "2026-03-30",
  "viewCount": 28900,
  "downloadCount": 8670,
  "categories": ["Demographics", "Society"],
  "dataAccessLink": "https://api.data.go.kr/openapi/population-stats",
  "documentation": "https://data.go.kr/docs/15000456",
  "contactInfo": "data@kostat.go.kr",
  "licenseInfo": "Korea Open Government License",
  "scrapedAt": "2026-04-16T12:00:00.000Z"
}
```

</details>

<details>
<summary>📄 Economic dataset (recommended)</summary>

```json
{
  "datasetId": "15000789",
  "title": "Monthly Trade Statistics",
  "organization": "Korea Customs Service",
  "description": "Import and export data by country, commodity, and port",
  "datasetType": "FILE",
  "createdDate": "2018-06-20",
  "lastUpdated": "2026-04-10",
  "viewCount": 67800,
  "downloadCount": 21340,
  "categories": ["Economy", "Trade"],
  "tags": ["imports", "exports", "trade balance"],
  "downloadUrl": "https://data.go.kr/download/15000789/trade_monthly.xlsx",
  "fileFormat": "XLSX",
  "fileSize": "180MB",
  "licenseInfo": "Korea Open Government License",
  "scrapedAt": "2026-04-16T12:00:00.000Z"
}
```

</details>
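
The usage metrics in these records can be combined into a simple popularity signal. The sketch below ranks the three sample records above by downloads per view; `download_rate` is a hypothetical helper, and the ratio is only one possible way to compare datasets.

```python
def download_rate(record: dict) -> float:
    """Downloads per page view; 0.0 when the view count is missing or zero."""
    views = record.get("viewCount") or 0
    downloads = record.get("downloadCount") or 0
    return downloads / views if views else 0.0


# Trimmed copies of the sample records shown above.
samples = [
    {"datasetId": "15000123", "title": "Daily Weather Observation Data",
     "viewCount": 45230, "downloadCount": 12450},
    {"datasetId": "15000456", "title": "Population Statistics by Region",
     "viewCount": 28900, "downloadCount": 8670},
    {"datasetId": "15000789", "title": "Monthly Trade Statistics",
     "viewCount": 67800, "downloadCount": 21340},
]

ranked = sorted(samples, key=download_rate, reverse=True)
for record in ranked:
    print(f'{record["datasetId"]}  {download_rate(record):.3f}  {record["title"]}')
```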

***

### ✨ Why choose this Actor

| Feature | Details |
|---|---|
| 🔍 Keyword and URL search | Search by topic or scrape any data.go.kr results page |
| 📂 4 dataset types | FILE, API, STD, and LINKED data formats |
| ⭐ Recommendation filter | Show only portal-recommended datasets |
| 📈 Usage metrics | View counts and download statistics for each dataset |
| 📥 Download URLs | Direct links to files and API endpoints |
| 🌐 Korean and English | Search with terms in both languages |
| 📦 Flexible export | JSON, CSV, or Excel output |

> 📊 **Discover and catalog up to 1,000,000 Korean government datasets per run with full metadata and download links.**

***

### 📈 How it compares to alternatives

| Feature | This Actor | Manual Browsing | Generic Scrapers |
|---|---|---|---|
| Keyword search (Korean + English) | ✅ | ✅ | ❌ |
| Dataset type filtering | ✅ | ✅ | ❌ |
| Usage metrics extraction | ✅ | Manual | ❌ |
| Download URL collection | ✅ | Manual | ❌ |
| Bulk collection (up to 1M datasets per run) | ✅ | ❌ | ❌ |
| Structured JSON/CSV output | ✅ | ❌ | Varies |
| Scheduled runs | ✅ | ❌ | ❌ |

Automate your Korean government data discovery instead of browsing and recording datasets manually.

***

### 🚀 How to use

1. **Create an Apify account** - [Sign up free with $5 credit](https://console.apify.com/sign-up?fpr=vmoqkp)
2. **Open the Korean Data Portal Scraper** - Navigate to the actor page on Apify
3. **Enter a search keyword or URL** - Type a topic like "weather" or paste a data.go.kr search URL
4. **Add optional filters** - Set dataset type, recommendation status, or advanced keywords
5. **Click Start** - The actor collects matching datasets and delivers structured metadata

> ⏱️ **A typical run with 10 datasets completes in under 1 minute.**

***

### 💼 Business use cases

<table>
<tr>
<td width="50%"><b>📊 Data Research</b>
<ul>
<li>Discover available government datasets by topic</li>
<li>Catalog API endpoints for application development</li>
<li>Track new dataset publications across agencies</li>
<li>Compare data availability across Korean departments</li>
</ul>
</td>
<td width="50%"><b>💼 Business Intelligence</b>
<ul>
<li>Find economic and trade data for market analysis</li>
<li>Locate demographic data for customer segmentation</li>
<li>Access weather data for logistics planning</li>
<li>Monitor government data releases in your industry</li>
</ul>
</td>
</tr>
<tr>
<td width="50%"><b>🎓 Academic Research</b>
<ul>
<li>Build inventories of available Korean public data</li>
<li>Find datasets for Korean studies research</li>
<li>Track data quality and update frequency</li>
<li>Compile data source bibliographies for papers</li>
</ul>
</td>
<td width="50%"><b>📰 Journalism</b>
<ul>
<li>Find public data sources for investigative stories</li>
<li>Track dataset updates from specific agencies</li>
<li>Access environmental and health data for reporting</li>
<li>Monitor government transparency through data publication</li>
</ul>
</td>
</tr>
</table>

***


### 🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Empirical datasets for papers, thesis work, and coursework
- Longitudinal studies tracking changes across snapshots
- Reproducible research with cited, versioned data pulls
- Classroom exercises on data analysis and ethical scraping

</td>
<td width="50%">

#### 🎨 Personal and creative

- Side projects, portfolio demos, and indie app launches
- Data visualizations, dashboards, and infographics
- Content research for bloggers, YouTubers, and podcasters
- Hobbyist collections and personal trackers

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Transparency reporting and accountability projects
- Advocacy campaigns backed by public-interest data
- Community-run databases for local issues
- Investigative journalism on public records

</td>
<td width="50%">

#### 🧪 Experimentation

- Prototype AI and machine-learning pipelines with real data
- Validate product-market hypotheses before engineering spend
- Train small domain-specific models on niche corpora
- Test dashboard concepts with live input

</td>
</tr>
</table>

### 🤖 Ask an AI assistant about this scraper

Open a ready-to-send prompt about this ParseForge actor in the AI of your choice:

- 💬 [**ChatGPT**](https://chat.openai.com/?q=How%20do%20I%20use%20the%20Korea%E2%80%99s%20Open%20Government%20Data%20Portal%20%28data.go.kr%29%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🧠 [**Claude**](https://claude.ai/new?q=How%20do%20I%20use%20the%20Korea%E2%80%99s%20Open%20Government%20Data%20Portal%20%28data.go.kr%29%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🔍 [**Perplexity**](https://perplexity.ai/search?q=How%20do%20I%20use%20the%20Korea%E2%80%99s%20Open%20Government%20Data%20Portal%20%28data.go.kr%29%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)
- 🅒 [**Copilot**](https://copilot.microsoft.com/?q=How%20do%20I%20use%20the%20Korea%E2%80%99s%20Open%20Government%20Data%20Portal%20%28data.go.kr%29%20by%20ParseForge%20on%20Apify%3F%20Show%20me%20input%20examples%2C%20output%20fields%2C%20common%20use%20cases%2C%20and%20how%20to%20integrate%20it%20into%20a%20workflow.)

### ❓ Frequently Asked Questions

<details>
<summary><b>💳 Do I need a paid Apify plan to run this actor?</b></summary>

No. You can start right now on the free Apify plan, which includes **$5 in free monthly credit**. That is enough to run this actor several times and explore the output before committing to anything. Paid plans unlock higher limits, more concurrent runs, and larger datasets. [Create a free Apify account here](https://console.apify.com/sign-up?fpr=vmoqkp) to get started.

</details>

<details>
<summary><b>🚨 What happens if my run fails or returns no results?</b></summary>

Failed runs are not charged. If the source site changes, proxies get rate-limited, or a specific input matches nothing, re-run the actor or open our [contact form](https://tally.so/r/BzdKgA) and we will investigate. You can also check the run log in the Apify console to see why the run stopped.

</details>

<details>
<summary><b>📏 How many items can I scrape per run?</b></summary>

Free users are limited to **10 items per run** so you can preview the output and confirm the actor works for your use case. Paid users can raise maxItems up to **1,000,000** per run. [Upgrade here](https://console.apify.com/sign-up?fpr=vmoqkp) if you need full scale.

</details>

<details>
<summary><b>🕒 How fresh is the data?</b></summary>

Every run fetches live data at the moment of execution. There is no cache or delay: the records you get reflect what the source returned at that moment. Schedule the actor to maintain a rolling snapshot of the data you need.

</details>

<details>
<summary><b>🧑‍💻 Can I call this actor from my own code?</b></summary>

Yes. Apify exposes every Actor through its REST API and ships official API clients for [Node.js](https://docs.apify.com/api/client/js) and [Python](https://docs.apify.com/api/client/python). You can start a run, read the dataset, and handle webhooks from your own app in a few lines. All you need is your Apify API token.

</details>

<details>
<summary><b>📤 How do I export the data?</b></summary>

Every Apify dataset can be downloaded in one click from the console as CSV, JSON, JSONL, Excel, HTML, XML, or RSS. You can also pull results programmatically via the [Apify API](https://docs.apify.com/api/v2) or stream them into BigQuery, S3, and other destinations through built-in integrations.

</details>

<details>
<summary><b>📅 Can I schedule the actor to run automatically?</b></summary>

Yes. Use the Apify scheduler to run the actor on any cadence, from hourly to monthly. Results are saved to your dataset and can be delivered to webhooks, email, Slack, cloud storage, or automation tools such as Zapier and Make.

</details>

***

### 🔌 Automating Korean Data Portal Scraper

Integrate the Korean Data Portal Scraper into your workflow using the Apify API or client libraries.

**Node.js:**

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor("parseforge/data-go-kr-scraper").call({
  keyword: "weather",
  svcType: "FILE",
  maxItems: 50
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```

**Python:**

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("parseforge/data-go-kr-scraper").call(run_input={
    "keyword": "weather",
    "svcType": "FILE",
    "maxItems": 50
})
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(items)
```

- 📖 [Apify API reference](https://docs.apify.com/api/v2)
- 🐍 [Python client docs](https://docs.apify.com/api/client/python)
- 📦 [Node.js client docs](https://docs.apify.com/api/client/js)
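
The `items` list returned by the Python example above can also be flattened to CSV locally. This is a sketch under stated assumptions: `to_csv` is a hypothetical helper, the column selection is arbitrary, and array fields such as `categories` and `tags` are joined with `"; "` purely as a presentation choice.

```python
import csv
import io

# Columns to export; names follow the output schema in this README.
FIELDS = ("datasetId", "title", "organization", "datasetType",
          "lastUpdated", "categories", "tags")


def to_csv(items: list[dict], fields: tuple = FIELDS) -> str:
    """Serialize records to CSV text, joining list values into one cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", restval="")
    writer.writeheader()
    for item in items:
        row = {
            key: "; ".join(map(str, value)) if isinstance(value, list) else value
            for key, value in item.items()
        }
        writer.writerow(row)
    return buffer.getvalue()


example_items = [
    {
        "datasetId": "15000123",
        "title": "Daily Weather Observation Data",
        "organization": "Korea Meteorological Administration",
        "datasetType": "FILE",
        "lastUpdated": "2026-04-01",
        "categories": ["Weather", "Environment"],
        "viewCount": 45230,  # not in FIELDS, so it is skipped
    }
]
print(to_csv(example_items))
```

Missing fields are written as empty cells, so records of different dataset types can share one file.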

**Schedules:** Set up recurring runs to monitor new dataset publications, track update frequency, or build a growing catalog of Korean government data. Configure daily, weekly, or monthly schedules from the Apify Console.
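
To monitor new publications and updates between scheduled runs, two result sets can be compared by `datasetId` and `lastUpdated`. The sketch below is an illustration only: `diff_snapshots` is a hypothetical helper, and the second and third records in `current` are invented values, not real portal data.

```python
def diff_snapshots(previous: list[dict], current: list[dict]) -> dict:
    """Return dataset IDs that are new or whose lastUpdated value changed."""
    previous_by_id = {
        record["datasetId"]: record for record in previous if record.get("datasetId")
    }
    new, updated = [], []
    for record in current:
        dataset_id = record.get("datasetId")
        if not dataset_id:
            continue  # skip error records without an identifier
        old = previous_by_id.get(dataset_id)
        if old is None:
            new.append(dataset_id)
        elif record.get("lastUpdated") != old.get("lastUpdated"):
            updated.append(dataset_id)
    return {"new": new, "updated": updated}


previous = [
    {"datasetId": "15000123", "lastUpdated": "2026-04-01"},
    {"datasetId": "15000456", "lastUpdated": "2026-03-30"},
]
current = [
    {"datasetId": "15000123", "lastUpdated": "2026-04-01"},
    {"datasetId": "15000456", "lastUpdated": "2026-04-15"},  # illustrative update
    {"datasetId": "15000789", "lastUpdated": "2026-04-10"},  # illustrative new entry
]
changes = diff_snapshots(previous, current)
print(changes)
```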

### 🔌 Integrate with any app

- 🔗 **Make (Integromat)** - Connect dataset metadata to Google Sheets, Notion, or any of 1,500+ apps
- 🔗 **Zapier** - Trigger workflows when new datasets are found
- 🔗 **Slack** - Get notified when new government datasets are published
- 🔗 **Airbyte** - Stream dataset metadata into your data warehouse
- 🔗 **GitHub** - Store dataset catalogs in repositories for version control
- 🔗 **Google Drive** - Automatically save CSV exports to shared folders

***

### 🔗 Recommended Actors

| Actor | Description |
|---|---|
| [US Census Bureau Scraper](https://apify.com/parseforge/us-census-bureau-scraper) | Extract demographic and economic data from the US Census Bureau |
| [NASA Reports Scraper](https://apify.com/parseforge/nasa-reports-scraper) | Collect technical reports from NASA's NTRS database |
| [Crossref Scraper](https://apify.com/parseforge/crossref-scraper) | Extract DOI metadata for 155M+ research publications |
| [Open Library Scraper](https://apify.com/parseforge/open-library-scraper) | Search and download book data from the Internet Archive |
| [ROR Scraper](https://apify.com/parseforge/ror-scraper) | Collect research organization data from ROR |

> 💡 **Pro Tip:** Combine the Korean Data Portal Scraper with the US Census Bureau Scraper to compare public data availability and coverage between Korean and US government portals.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) and we will get back to you within 24 hours. We are happy to help with custom setups, integrations, or feature requests.

***

> **Disclaimer:** This actor is not affiliated with, endorsed by, or connected to the Korean government or data.go.kr. It accesses publicly available data from South Korea's open data portal. Use responsibly and in accordance with applicable terms of service.

# Actor input Schema

## `startUrl` (type: `string`):

Direct URL to a search results page from data.go.kr. Use this OR search filters below, not both. To get a URL: 1) Go to https://www.data.go.kr/en/tcs/dss/selectDataSetList.do, 2) Apply your desired filters, 3) Copy the URL from your browser address bar. This is useful when you want to scrape a specific search result page you've already configured on the website.
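
A copied URL can be sanity-checked before it is used as `startUrl`. The following is a minimal pre-flight sketch; `is_data_go_kr_url` is a hypothetical helper and only checks the scheme and host. Whether the Actor performs similar validation is not stated in this document.

```python
from urllib.parse import urlparse


def is_data_go_kr_url(url: str) -> bool:
    """True when the URL uses http(s) and points at data.go.kr or a subdomain."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    on_portal = host == "data.go.kr" or host.endswith(".data.go.kr")
    return parsed.scheme in {"http", "https"} and on_portal


print(is_data_go_kr_url("https://www.data.go.kr/en/tcs/dss/selectDataSetList.do"))
print(is_data_go_kr_url("https://example.com/data.go.kr"))
```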

## `maxItems` (type: `integer`):

Free users: Limited to 10. Paid users: Optional, max 1,000,000. Leave empty for unlimited (paid users only).

## `keyword` (type: `string`):

Search term to find datasets. This will search across dataset titles, descriptions, and keywords. Only works when using search filters (not with startUrl). Example: 'weather', 'population', 'economic'

## `svcType` (type: `string`):

Filter by dataset type. FILE: Downloadable file-based datasets (CSV, XML, Excel, etc.). API: REST API endpoints that return JSON/XML data. STD: Standard datasets following specific data formats. LINKED: Linked data using semantic web standards. Only works when using search filters (not with startUrl).

## `recmSe` (type: `string`):

Filter by recommendation status. Y: Only show datasets recommended by the portal (typically high-quality, frequently used datasets). N: Show all datasets including non-recommended ones. Leave empty to show all datasets. Only works when using search filters (not with startUrl).

## `conditionType` (type: `string`):

Search condition type. 'init': Initial search state (default, shows all results). 'search': Active search mode (typically used after applying filters). Usually leave as 'init' unless you need specific search behavior. Only works when using search filters (not with startUrl).

## `kwrdArray` (type: `string`):

Comma-separated keywords for advanced search. Example: 'temperature,precipitation,humidity'. Only works when using search filters (not with startUrl).

## Actor input object example

```json
{
  "maxItems": 10,
  "keyword": "weather"
}
```

# Actor output Schema

## `datasets` (type: `string`):

Complete dataset with all scraped dataset information including metadata, descriptions, API information, and organization details

## `overview` (type: `string`):

Overview view of datasets with key fields displayed in a table format

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10,
    "keyword": "weather"
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/data-go-kr-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "maxItems": 10,
    "keyword": "weather",
}

# Run the Actor and wait for it to finish
run = client.actor("parseforge/data-go-kr-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
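
If installing `apify-client` is not an option, the same run can be sketched against the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification in this document, using only the standard library. `build_run_request` is a hypothetical helper. The request is built but not sent here, and the commented-out lines assume the endpoint returns the dataset items as a JSON array.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~data-go-kr-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that runs the Actor synchronously."""
    # The OpenAPI specification declares the token as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("<YOUR_API_TOKEN>", {"maxItems": 10, "keyword": "weather"})
print(request.get_method(), request.full_url)

# To actually run the Actor, uncomment the lines below.
# This needs network access and a valid token, and the run may be billed.
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())
#     print(len(items), "items")
```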

## CLI example

```bash
echo '{
  "maxItems": 10,
  "keyword": "weather"
}' |
apify call parseforge/data-go-kr-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/data-go-kr-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Korea’s Open Government Data Portal (data.go.kr)",
        "description": "Extract comprehensive dataset listings from Korea’s Open Government Data Portal (data.go.kr), including metadata, descriptions, API details, and organization info. Supports filtering by dataset type, organization, keywords, and other parameters, enabling automated access to 50,000+ datasets.",
        "version": "1.0",
        "x-build-id": "3c02Zi3RnONPqoNWh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~data-go-kr-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-data-go-kr-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~data-go-kr-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-data-go-kr-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~data-go-kr-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-data-go-kr-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrl": {
                        "title": "Start URL",
                        "pattern": "^https://www\\.data\\.go\\.kr/.*",
                        "type": "string",
                        "description": "Direct URL to a search results page on data.go.kr. Use either this or the search filters below, not both. To get a URL: 1) Go to https://www.data.go.kr/en/tcs/dss/selectDataSetList.do, 2) Apply your desired filters, 3) Copy the URL from your browser's address bar. Useful when you want to scrape a search results page you have already configured on the website."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 100. Paid users: Optional, max 1,000,000. Leave empty for unlimited (paid users only)."
                    },
                    "keyword": {
                        "title": "Search Keyword",
                        "type": "string",
                        "description": "Search term for finding datasets; matched against dataset titles, descriptions, and keywords. Only works when using search filters (not with startUrl). Examples: 'weather', 'population', 'economic'"
                    },
                    "svcType": {
                        "title": "Dataset Type",
                        "enum": [
                            "FILE",
                            "API",
                            "STD",
                            "LINKED"
                        ],
                        "type": "string",
                        "description": "Filter by dataset type. FILE: Downloadable file-based datasets (CSV, XML, Excel, etc.). API: REST API endpoints that return JSON/XML data. STD: Standard datasets following specific data formats. LINKED: Linked data using semantic web standards. Only works when using search filters (not with startUrl)."
                    },
                    "recmSe": {
                        "title": "Recommendation Type",
                        "enum": [
                            "N",
                            "Y"
                        ],
                        "type": "string",
                        "description": "Filter by recommendation status. Y: Only show datasets recommended by the portal (typically high-quality, frequently used datasets). N: Show all datasets including non-recommended ones. Leave empty to show all datasets. Only works when using search filters (not with startUrl)."
                    },
                    "conditionType": {
                        "title": "Condition Type",
                        "enum": [
                            "init",
                            "search"
                        ],
                        "type": "string",
                        "description": "Search condition type. 'init': Initial search state (default, shows all results). 'search': Active search mode (typically used after applying filters). Usually leave as 'init' unless you need specific search behavior. Only works when using search filters (not with startUrl)."
                    },
                    "kwrdArray": {
                        "title": "Keyword Array",
                        "type": "string",
                        "description": "Comma-separated keywords for advanced search. Example: 'temperature,precipitation,humidity'. Only works when using search filters (not with startUrl)."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
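
The `inputSchema` above carries several constraints (a URL pattern, a numeric range, and three enums) that can be checked locally before a run is started. The following is a minimal sketch of such a check, not part of the Actor itself. The helper names `validate_input` and `build_run_url` are hypothetical, the base URL `https://api.apify.com/v2` is an assumption (it is not shown in this part of the definition), and the rule that `startUrl` excludes the search filters comes from the field descriptions rather than from the schema.

```python
import re
from urllib.parse import urlencode

# Constraints copied from the inputSchema in the OpenAPI definition above.
START_URL_PATTERN = re.compile(r"^https://www\.data\.go\.kr/.*")
ENUM_FIELDS = {
    "svcType": ("FILE", "API", "STD", "LINKED"),
    "recmSe": ("N", "Y"),
    "conditionType": ("init", "search"),
}
SEARCH_FILTERS = ("keyword", "svcType", "recmSe", "conditionType", "kwrdArray")

# Assumption: the standard Apify API base URL; it is not part of this excerpt.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/parseforge~data-go-kr-scraper"


def validate_input(actor_input: dict) -> list:
    """Return a list of problems; an empty list means the input passed every check."""
    problems = []

    start_url = actor_input.get("startUrl")
    if start_url is not None:
        if not isinstance(start_url, str) or not START_URL_PATTERN.search(start_url):
            problems.append("startUrl must start with https://www.data.go.kr/")
        # Taken from the field descriptions: use startUrl OR search filters, not both.
        used = [name for name in SEARCH_FILTERS if name in actor_input]
        if used:
            problems.append("startUrl cannot be combined with: " + ", ".join(used))

    max_items = actor_input.get("maxItems")
    if max_items is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
        if not is_int or not 1 <= max_items <= 1_000_000:
            problems.append("maxItems must be an integer between 1 and 1000000")

    for name, allowed in ENUM_FIELDS.items():
        value = actor_input.get(name)
        if value is not None and value not in allowed:
            problems.append(f"{name} must be one of {list(allowed)}")

    for name in ("keyword", "kwrdArray"):
        value = actor_input.get(name)
        if value is not None and not isinstance(value, str):
            problems.append(f"{name} must be a string")

    return problems


def build_run_url(endpoint: str, token: str) -> str:
    """Build the POST URL for an endpoint such as 'runs' or 'run-sync'.

    The token goes into the query string, as the 'parameters' entries above require.
    """
    return f"{API_BASE}{ACTOR_PATH}/{endpoint}?{urlencode({'token': token})}"


if __name__ == "__main__":
    example = {"keyword": "weather", "svcType": "API", "maxItems": 50}
    print(validate_input(example))
    print(build_run_url("runs", "YOUR_APIFY_TOKEN"))
```

The validated dictionary would then be sent as the JSON request body of the POST call. Passing the token as a query parameter mirrors the definition above; whether to send it another way is left to the caller.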
