Weibo Scraper - Chinese Social Intelligence
Pricing
from $5.00 / 1,000 items scraped
Extract Chinese public opinion, trending topics, brand sentiment, and creator data from Weibo (微博) — China's largest microblog with 580M+ users. Built for AI training corpora, Chinese equity research, and brand monitoring. No login, no browser. Part of the Chinese Digital Intelligence Suite.
Rating: 1.0 (1)
Developer: Sami (maintained by Community)
Actor stats: 2 bookmarked · 167 total users · 89 monthly active users · last modified 5 days ago
Extract Chinese public opinion, trending topics, and real-time consumer sentiment from Weibo (微博) — China's dominant microblog with 580M+ monthly users producing the densest public-opinion signal in China. Built for AI training corpora, Chinese consumer equity research alt-data, brand monitoring agencies, and academic NLP teams. No login, no API key, no VPN. The only quality Weibo scraper on Apify.
How to scrape Weibo in 3 easy steps
- Go to the Weibo Scraper page on Apify and click "Try for free"
- Configure your input: choose a mode (hot_search, hot_search_delta, post_comments, search, or user_posts), enter your keywords or post IDs, and set the number of results
- Click "Run", wait for the scraper to finish, then download your data in JSON, CSV, or Excel format
No coding required. No API key. Works with Apify's free plan.
🏢 Sourcing a Chinese-language LLM training corpus — or running Weibo at production scale?
This Actor pulls Weibo at corpus scale: hundreds of thousands to millions of clean, structured posts, on a schedule — drop-in for AI-training pipelines, quant alt-data signals, and brand-intelligence warehouses. Pay-per-result, no contract.
For high-volume / enterprise I offer bulk & volume pricing, custom output schemas matched to your data warehouse, dedicated proxy throughput for sustained million-row pulls, scheduled managed feeds, and a schema-stability SLA (no breaking changes without 30-day notice).
→ DM me on Apify, open an Issue titled "Enterprise inquiry", or email samimassis2002@gmail.com (subject "Weibo enterprise").
Part of the Chinese Digital Intelligence Suite
The only Apify developer specializing in Chinese-platform intelligence — built specifically for AI training data buyers, equity research analysts covering Chinese consumer stocks, and brand monitoring teams:
- 🆕 Chinese Brand Monitor — Cross-platform brand mention aggregator (Weibo + RedNote + Bilibili + Douban + Xueqiu in one normalized feed, sentiment-tagged, cross-platform deduped — $0.045/mention)
- Weibo Scraper — You are here (microblogging, hot search, real-time public opinion)
- Bilibili Scraper — China's video platform: danmaku, comments, Gen-Z creator sentiment
- RedNote (Xiaohongshu) Scraper — China's Instagram + Pinterest (lifestyle, consumer reviews)
- RedNote Shop Scraper — RedShop e-commerce (products, vendors, prices)
- Douban Scraper — Long-form reviews (movies/books/music), group discussions
- Xueqiu Scraper — Chinese stock-discussion sentiment, cashtag indexing
Together, these cover the five pillars of Chinese consumer signal: microblog opinion, video sentiment, lifestyle reviews, e-commerce, and long-form opinion. Most analysts buy 2-4 of these for cross-platform coverage. Building a cross-platform brand monitoring pipeline? The Chinese Brand Monitor aggregator gives you all 5 platforms in one normalized output — saves 4-6 hours of engineering vs. orchestrating individual scrapers.
Who buys this scraper
| Buyer profile | Use case | Typical spend |
|---|---|---|
| AI / LLM training data teams | Real-time Chinese microblog text for SFT corpora + current-events grounding | $200-1,500/mo |
| Hedge fund / equity research desks | Brand mention velocity, hot-search momentum as alt-data on Chinese consumer stocks (POP MART, BYD, Anta, Yum China, BeiGene) | $100-500/mo |
| Brand monitoring agencies | Real-time tracking of Western brand mentions, crisis detection on China's public square | $200-800/mo |
| Geopolitical / policy analysts | Monitor Chinese public discourse, narrative tracking, policy response signal | $150-600/mo |
| Academic NLP / sentiment researchers | Chinese microblog corpus, labeled sentiment data for classifier training | $50-200/mo |
| Journalists / investigative teams | Source Chinese public opinion data for reporting on consumer brands, viral events | $50-150/mo |
What is Weibo?
Weibo (微博) is China's dominant microblogging platform — think Twitter meets Instagram. With 580M+ monthly active users, it's where Chinese public opinion forms, brands communicate, and news breaks. Government officials, celebrities, and brands all maintain active Weibo accounts. For data buyers, Weibo's hot search ranking is the closest thing China has to a real-time barometer of public attention — a leading indicator that precedes earnings-call surprises and brand events by 1-4 weeks.
Weibo API alternative
There is no official public Weibo API available for international developers. Weibo's developer API requires a Chinese business license, has severe rate limits, and returns limited data. This Weibo Scraper is the best Weibo API alternative in 2026 — it extracts trending topics, posts, comments, and user profiles without any official API access. No Chinese business registration needed.
Use Cases by buyer
| Who | Why they use it |
|---|---|
| AI / LLM training data teams | Real-time Chinese-language microblog text for SFT, RLHF training, and current-events grounding for Chinese LLMs |
| Equity research / hedge funds | Hot-search velocity + brand mention spikes as alt-data leading indicator on Chinese consumer stocks (3-50× cheaper than Bloomberg Chinese consumer feeds) |
| Brand monitoring teams | Real-time tracking of brand mentions, viral content, and crisis detection on China's public square |
| Geopolitical / policy analysts | Monitor public discourse on policy, international topics, and narrative trends |
| PR & communications | Track brand mentions and sentiment shifts in real time |
| Competitive intelligence | Track Chinese competitor announcements, product launches, and audience reception |
| Influencer marketing | Find and evaluate Weibo KOLs by followers, engagement, verification status |
| Journalism | Access Chinese public opinion data for investigative reporting |
| Academic research | Pre-built Chinese microblog corpus with engagement metrics for NLP and sociology studies |
Scrape Weibo with Python, JavaScript, or no code
You can use the Weibo Scraper directly from the Apify Console (no code), or integrate it into your own scripts with Python or JavaScript.
Python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("zhorex/weibo-scraper").call(
    run_input={"mode": "hot_search", "maxResults": 50}
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
JavaScript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('zhorex/weibo-scraper').call({
    mode: 'hot_search',
    maxResults: 50,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => console.log(item));
Using the raw REST API (Postman / curl)
⚠️ The run endpoint is asynchronous: its response is the run object (IDs + status), NOT your scraped data. If you POST to /acts/.../runs you get back something like { "data": { "status": "READY", "defaultDatasetId": "…" } } with no results in it. That's expected, because the run hasn't finished yet. The records land in the run's dataset, not in that response. (The containerUrl link is the live container; once a run finishes it just shows "run has already finished with status SUCCEEDED". That means success; it is not where the data lives.)
Easiest — one call that waits for the run and returns the records directly:
curl -X POST "https://api.apify.com/v2/acts/zhorex~weibo-scraper/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"mode":"hot_search","maxResults":50}'
The response body is the JSON array of records — no second call needed.
Or async — start the run, then fetch the dataset once it finishes:
# 1) start the run; note the "defaultDatasetId" in the response
curl -X POST "https://api.apify.com/v2/acts/zhorex~weibo-scraper/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" -d '{"mode":"hot_search","maxResults":50}'

# 2) when the run status is SUCCEEDED, fetch the records from its dataset
curl "https://api.apify.com/v2/datasets/DEFAULT_DATASET_ID/items?token=YOUR_API_TOKEN"
💡 In the Apify Console you can also open any run and click the Output / Storage → Dataset tab to view and download the same data as JSON / CSV / Excel.
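The async flow above (start the run, wait for SUCCEEDED, then read the dataset) can be sketched in Python with only the standard library. This is a minimal sketch, assuming the run object has the shape shown above (data.status, data.defaultDatasetId). The helper names are hypothetical, and the list of terminal statuses is an assumption about Apify's run lifecycle, since this README names only SUCCEEDED. Neither function makes a network call; they only interpret a response you already have.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"

# Statuses after which a run no longer changes (assumption: only SUCCEEDED
# is named in this README; the others are assumed terminal states).
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def run_is_finished(run_response: dict) -> bool:
    """Return True once the run object reports a terminal status."""
    return run_response["data"]["status"] in TERMINAL_STATUSES


def dataset_items_url(run_response: dict, token: str) -> str:
    """Build the dataset items URL, but only for a run that succeeded."""
    data = run_response["data"]
    if data["status"] != "SUCCEEDED":
        raise ValueError(f"run not ready, status is {data['status']}")
    dataset_id = quote(data["defaultDatasetId"], safe="")
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode({'token': token})}"
```

In a polling loop you would re-fetch the run object until run_is_finished returns True, then request the URL from dataset_items_url.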
Features
| Mode | What it does | Cookies needed? |
|---|---|---|
| Search Posts | Find posts by keyword — returns query-relevant results | No |
| Hot Search / Trending | Real-time trending topics with heat scores and categories | No |
| Hot Search Delta 🆕 | Scheduled trend monitor — what's new / rising / falling / dropped vs the last run (rank velocity, time-on-board, peaks) | No |
| Post Comments | Comments + post detail with engagement metrics | No |
| User Posts | User profile + posts from specific accounts | For posts only |
- No browser needed — Pure HTTP, runs in 256MB RAM
- No VPN needed — Globally accessible endpoints
- Automatic session — Visitor cookies obtained automatically
- Rate-limit handling — Exponential backoff on 418/429 errors
- 🆕 Sentiment scoring (optional): set sentimentAnalysis: true to tag every post & comment with Chinese sentiment: polarity (positive/neutral/negative) + a −1.0…+1.0 score (SnowNLP model for Chinese text, keyword fallback for English). Built for brand-sentiment tracking & alt-data pipelines. Loads a model, so run with memory ≥512 MB when enabled.
- 🆕 Auto-localize brand search: search a Latin brand name (e.g. Nike) and the Actor automatically also searches its Chinese name (耐克), then merges and dedupes the results. You get full native recall even when you search in English, capped at maxResults (no extra cost). On by default; add custom variants via searchAliases.
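The Actor's retry logic is not published here, so the following is only a generic sketch of what exponential backoff on 418/429 responses can look like if you call Weibo-like endpoints yourself. All names, the base delay, and the cap are hypothetical; fetch is an assumed callable that returns a (status_code, body) pair.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {418, 429}  # the status codes named in the feature list


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `fetch` until it returns a non-retryable status or retries run out."""
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_retries:
            sleep(backoff_delay(attempt))  # wait 1s, 2s, 4s, ... up to the cap
    return status, body  # still throttled after the last attempt
```

Passing sleep as a parameter keeps the function testable without real waiting; a production version would usually add random jitter to the delay.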
How to Use
1. Trending Topics (no cookies needed)
Get the current Weibo hot search — the real-time pulse of Chinese internet.
{"mode": "hot_search","maxResults": 50}
2. Post Comments (no cookies needed)
Extract comments from specific posts. Provide post IDs (mid) or detail URLs.
{"mode": "post_comments","postIds": ["5285773987283226"],"maxComments": 50}
3. Search Posts
Search by keyword in Chinese or English. Returns query-relevant results — no cookies needed.
💡 Brand/topic monitoring: autoLocalize does the Chinese↔Latin step for you. Weibo indexes by Chinese text, so 耐克 returns more posts than Nike. With autoLocalize on (the default), searching a common Latin brand name automatically also searches its Chinese name and merges the results, giving full native recall without thinking about it, capped at maxResults so it costs no extra. For brands outside the built-in dictionary, add the Chinese term yourself via searchAliases. (Few results for an English keyword usually means the language, not an error.)
{"mode": "search","searchQuery": "人工智能","maxResults": 50}
Brand search with auto-localize — searches Nike and 耐克 (automatic), plus any aliases you add, merged and deduped and capped at maxResults:
{"mode": "search","searchQuery": "Nike","searchAliases": ["AJ", "Air Jordan"],"maxResults": 100}
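To make "merged and deduped and capped at maxResults" concrete, here is a minimal sketch of that step as you might reproduce it client-side, for example when combining several separate runs. It assumes posts are identified by their postId field (as in the Post output example); the function name is hypothetical and this is not the Actor's internal code.

```python
from typing import Dict, Iterable, List


def merge_search_results(
    result_lists: Iterable[List[Dict]], max_results: int
) -> List[Dict]:
    """Merge per-term result lists, drop repeated postIds, cap the total."""
    if max_results <= 0:
        return []
    seen = set()
    merged = []
    for results in result_lists:
        for post in results:
            post_id = post["postId"]
            if post_id in seen:
                continue  # same post matched more than one search term
            seen.add(post_id)
            merged.append(post)
            if len(merged) >= max_results:
                return merged
    return merged
```

Earlier lists take priority, so placing the primary query first keeps its results when the cap is reached.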
4. User Posts
Get profile info (always works) + posts (requires cookies). Provide numeric user IDs or profile URLs.
{"mode": "user_posts","userIds": ["1642634100"],"maxResults": 50,"cookieString": "SUB=your_sub_cookie_value"}
5. Hot Search Delta — scheduled trend monitor (no cookies needed)
Run this on a schedule (hourly or daily) and each run reports what changed on the trending board since the previous run, instead of a flat snapshot. Every topic is tagged new, rising, falling, steady, or dropped, with rank movement, hot-value change, how long it has been trending, and its running peak.
{"mode": "hot_search_delta","deltaStateKey": "default"}
State persists across runs in a named store, so the first run sets a baseline and every run after it shows the deltas. Use different deltaStateKey values to track independent streams (e.g. hourly vs daily).
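The comparison between two snapshots can be sketched as follows. This is an illustration under stated assumptions, not the Actor's implementation: topics are matched by title, and the sign convention (rankDelta = previous rank minus current rank, so a positive value means the topic moved up) is inferred from the delta record in the Output Examples section, where rank 3 with previousRank 8 gives rankDelta 5.

```python
from typing import Dict, List


def compute_deltas(previous: List[Dict], current: List[Dict]) -> List[Dict]:
    """Tag each topic as new / rising / falling / steady / dropped."""
    prev_by_title = {t["title"]: t for t in previous}
    records = []
    for topic in current:
        old = prev_by_title.pop(topic["title"], None)
        if old is None:
            records.append({**topic, "status": "new", "isNew": True})
            continue
        rank_delta = old["rank"] - topic["rank"]  # positive = moved up the board
        if rank_delta > 0:
            status = "rising"
        elif rank_delta < 0:
            status = "falling"
        else:
            status = "steady"
        records.append({
            **topic,
            "status": status,
            "isNew": False,
            "previousRank": old["rank"],
            "rankDelta": rank_delta,
            "hotValueDelta": topic["hotValue"] - old["hotValue"],
        })
    # Anything left in prev_by_title is absent from the current snapshot.
    for old in prev_by_title.values():
        records.append({**old, "status": "dropped", "isNew": False})
    return records
```

On the first run there is no previous snapshot, so passing an empty list tags every topic as new, which matches the baseline behavior described above.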
⏰ Set up daily monitoring in 2 minutes
Most of this Actor's value is in recurring runs. A single pull is a one-off snapshot — but a daily or hourly schedule turns it into a continuously-updated Chinese brand / public-opinion feed. That's where pay-per-result compounds: instead of paying once for a static dump, you build a living dataset that tracks how the conversation moves week over week.
- Run the Actor once with your input (a brand keyword in search mode, or hot_search to capture the trending board) and check the output looks right.
- Apify Console → Schedules → Create → pick this Actor and your saved input. (Even faster: open any finished run and click Schedule to reuse its exact input.)
- Set a cron expression and save, e.g. 0 8 * * * = daily at 8am, or 0 * * * * = hourly. While you're there, enable the email notification on failed runs so you hear about a hiccup without checking manually.
Each scheduled run appends fresh results to the same dataset, so you accumulate a continuously-updated history with zero manual work — perfect for sentiment trend lines, brand-mention velocity, and time-series alt-data.
🔁 Recommended recurring mode: hot_search_delta is purpose-built for scheduled trend-velocity tracking. Each run reports what's new / rising / falling versus the last run instead of a flat snapshot. If you're going to run on a schedule, this is the mode to point it at.
🧠 Need this at AI-training-corpus scale?
If you're pulling Weibo's short-form posts, trending-topic chatter, and comment threads to train or fine-tune language models, the Chinese AI Training Corpus Engine assembles all 5 Chinese platforms — Weibo, RedNote, Bilibili, Douban, and Xueqiu — into AI-ready documents in one run: deduplicated, quality-scored, PII-scrubbed, and provenance-stamped for EU AI Act documentation, from $0.025/doc, with rejects and duplicates never charged.
How to Get Cookies (for User Posts)
User posts mode returns profiles without cookies. To also get a user's actual posts, provide a login cookie:
- Open weibo.com in your browser and log in
- Open DevTools (F12) → Application → Cookies → weibo.com
- Copy the value of the SUB cookie
- Paste it in the cookieString field as: SUB=your_value_here
The cookie typically lasts several days before expiring.
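If you build the input in code, a small helper can apply the SUB= prefix for you. This is a minimal sketch using the input fields shown in the User Posts example (mode, userIds, maxResults, cookieString); the function name is hypothetical.

```python
from typing import Dict, List, Optional


def build_user_posts_input(
    user_ids: List[str], sub_cookie: Optional[str] = None, max_results: int = 50
) -> Dict:
    """Build a user_posts input; the cookie is optional (profiles work without it)."""
    run_input = {
        "mode": "user_posts",
        "userIds": list(user_ids),
        "maxResults": max_results,
    }
    if sub_cookie:
        value = sub_cookie.strip()
        # Accept either the bare cookie value or an already-prefixed "SUB=..." string.
        if not value.startswith("SUB="):
            value = f"SUB={value}"
        run_input["cookieString"] = value
    return run_input
```

Because the cookie expires after a few days, it is worth reading it from a secret store or environment variable rather than hard-coding it.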
Output Examples
Trending Topic
{
  "rank": 1,
  "title": "人工智能最新突破",
  "category": "科技",
  "hotValue": 2847562,
  "labelName": "热",
  "isHot": true,
  "url": "https://s.weibo.com/weibo?q=...",
  "scrapedAt": "2026-04-10T12:00:00Z"
}
Hot Search Delta record
Each record shows how a topic moved since the previous run (status ∈ new / rising / falling / steady / dropped):
{
  "rank": 3,
  "title": "某品牌新品发布",
  "category": "科技",
  "hotValue": 1820000,
  "status": "rising",
  "rankDelta": 5,
  "hotValueDelta": 640000,
  "previousRank": 8,
  "firstSeenAt": "2026-06-04T08:00:00+00:00",
  "minutesOnBoard": 120,
  "peakRank": 3,
  "peakHotValue": 1820000,
  "isHot": true,
  "isNew": false,
  "url": "https://s.weibo.com/weibo?q=...",
  "snapshotAt": "2026-06-04T10:00:00+00:00"
}
Post
{
  "postId": "5285773987283226",
  "text": "介绍一下我的老婆!@金莎",
  "createdAt": "Wed Apr 09 12:49:23 +0800 2026",
  "repostsCount": 493,
  "commentsCount": 4549,
  "attitudesCount": 97438,
  "authorName": "孙丞潇",
  "authorId": "7511222755",
  "authorFollowers": 0,
  "authorVerified": false,
  "images": ["https://wx1.sinaimg.cn/large/..."],
  "videoUrl": "",
  "isRepost": false,
  "postUrl": "https://weibo.com/7511222755/5285773987283226",
  "scrapedAt": "2026-04-10T12:00:00Z"
}
Comment
{
  "commentId": "5285813927600208",
  "text": "恭喜恭喜!神仙眷侣,一定要狠狠幸福哦~",
  "createdAt": "Thu Apr 09 12:51:31 +0800 2026",
  "likeCount": 1268,
  "authorName": "吃瓜罗伯特",
  "authorId": "6108685154",
  "postId": "5285773987283226",
  "postUrl": "https://weibo.com/detail/5285773987283226",
  "scrapedAt": "2026-04-10T12:00:00Z"
}
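The createdAt values in posts and comments use a different timestamp layout from the ISO-style scrapedAt, so time-series work usually starts by normalizing them. A minimal sketch, assuming the layout seen in the examples above (weekday, month, day, time, UTC offset, year) and an English/C locale for the day and month abbreviations; the function names are hypothetical.

```python
from datetime import datetime, timezone

# Layout of createdAt as seen in the examples, e.g. "Thu Apr 09 12:51:31 +0800 2026"
WEIBO_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value: str) -> datetime:
    """Parse a createdAt string into a timezone-aware datetime."""
    return datetime.strptime(value, WEIBO_TIME_FORMAT)


def to_utc_iso(value: str) -> str:
    """Convert a createdAt string to a UTC ISO 8601 string."""
    return parse_created_at(value).astimezone(timezone.utc).isoformat()
```

Converting to UTC puts createdAt on the same footing as scrapedAt, which makes it straightforward to compute how old a post was when it was collected.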
Sentiment field (optional)
With sentimentAnalysis: true, every post and comment gains a sentiment object:
{"polarity": "positive","score": 0.42,"method": "snownlp"}
polarity ∈ positive / neutral / negative · score ∈ −1.0…+1.0 (higher = more positive) · method is snownlp for Chinese text, or keyword for the English-only fallback.
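For brand-sentiment tracking you typically reduce these per-item objects to a summary per run. The sketch below assumes the object is attached to each item under a key named sentiment (the key name is an assumption based on the description above) and skips items that carry no sentiment; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_sentiment(items: Iterable[Dict]) -> Dict:
    """Count polarities and average the score over items that carry sentiment."""
    counts = Counter()
    total = 0.0
    scored = 0
    for item in items:
        sentiment = item.get("sentiment")
        if not sentiment:
            continue  # sentimentAnalysis was off, or the item was not tagged
        counts[sentiment["polarity"]] += 1
        total += sentiment["score"]
        scored += 1
    mean = total / scored if scored else None
    return {"counts": dict(counts), "meanScore": mean, "scored": scored}
```

Running this once per scheduled run and storing the result gives a simple sentiment trend line over time.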
User Profile
{
  "userId": "1642634100",
  "screenName": "新浪科技",
  "description": "新浪科技是中国最有影响力的TMT产业资讯及数码产品服务平台",
  "followersCount": 23785876,
  "friendsCount": 3875,
  "statusesCount": 213546,
  "verified": true,
  "verifiedReason": "新浪网技术(中国)有限公司官方微博",
  "profileUrl": "https://weibo.com/u/1642634100",
  "scrapedAt": "2026-04-10T12:00:00Z"
}
Content is in Chinese
All content is returned in the original Simplified Chinese. Weibo is a Chinese-language platform — posts, comments, trending topics, and user bios are in Chinese.
If you need English translations, pipe the output through a translation API (Google Translate, DeepL, or Claude).
Technical Details
- No browser: pure HTTP — fast and lightweight, runs in 256MB RAM
- No authentication required: works against publicly accessible content only
- Built-in rate limiting: automatic retry with exponential backoff to handle peak-hour throttling
- Globally accessible: no VPN or proxy required
- Clean structured JSON output: ready for analysis or downstream pipelines
Pricing
$20 per 1,000 results (pay-per-event)
Each scraped item (post, comment, trending topic, or profile) counts as one result.
Typical costs (small-scale):
- Top 50 trending topics snapshot: ~$1.00
- 100 posts on a brand keyword: ~$2.00
- 200 comments on a viral post: ~$4.00
- User profile + 50 posts: ~$1.02
B2B / bulk-scale examples:
- AI training corpus seed (10,000 posts on a topic): ~$200
- Daily brand sentiment monitor (500 posts/day for a month): ~$300/month
- Equity research signal (10 tickers × 200 posts daily): ~$1,200/month
- Multi-source academic dataset (50,000 posts across 30 keywords): ~$1,000
Volume pricing available above 50K items/month (see Enterprise section above).
Platform compute costs (Apify usage) are charged separately.
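The figures above all follow from one rule: results × $20 / 1,000. A minimal sketch for budgeting your own workloads, assuming the $20 per 1,000 results rate stated in this section and excluding platform compute; the function name is hypothetical.

```python
from decimal import Decimal

PRICE_PER_1000_RESULTS = Decimal("20.00")  # USD, as stated in this section


def estimate_cost(result_count: int) -> Decimal:
    """Estimated Actor charge in USD, excluding Apify platform compute."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS * result_count / 1000
    return cost.quantize(Decimal("0.01"))
```

For example, estimate_cost(51) covers a user profile plus 50 posts, and estimate_cost(10 * 200 * 30) covers the 10-ticker equity research signal over a 30-day month.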
Limitations
- User posts mode returns profile data without authentication; fetching the posts themselves requires a login cookie, and full post history may be limited for some accounts
- Search, hot search, and post comments work fully without authentication
- Only public data is accessible — private/locked accounts are not available
- Weibo may rate-limit requests during peak hours — handled automatically with backoff
- Very old posts may not be available
FAQ
Is there a Weibo API?
There is no official public Weibo API available for international developers. Weibo's developer platform requires a Chinese business license and imposes strict rate limits. This Weibo Scraper is the best alternative — extract trending topics, posts, comments, and profiles without any official API access.
How much does it cost to scrape Weibo?
The Weibo Scraper costs $20 per 1,000 results (pay-per-event). Each scraped item (post, comment, trending topic, or profile) counts as one result. You can start with Apify's free plan, which includes $5 of monthly credits — enough for 250 data points.
Can I scrape Weibo in Python?
Yes. Install the Apify Python client (pip install apify-client), then use the ApifyClient to call the zhorex/weibo-scraper actor. See the Python code example above.
Is scraping Weibo legal?
This scraper only accesses publicly available data through Weibo's public web endpoints. It does not bypass authentication or access private/locked accounts. Always review your local laws and Weibo's terms of service before scraping.
What is the best Weibo scraper in 2026?
The Weibo Scraper by Zhorex is the only quality Weibo scraper on Apify in 2026. It supports 5 modes (hot search, hot-search delta, post comments, search, and user posts), handles rate limits automatically, and runs without a browser or VPN.
Integrations & data export
The Weibo Scraper integrates with your existing workflow tools:
- Google Sheets — Send scraped Weibo data directly to a spreadsheet
- Zapier / Make / n8n — Automate workflows triggered by new Weibo data
- REST API — Call the actor programmatically and retrieve results via Apify's REST API
- Webhooks — Get notified when a scraping run finishes and process data in real time
- Data formats — Download results in JSON, CSV, Excel, XML, or RSS
More scrapers by Zhorex
Chinese Digital Intelligence Suite
- 🆕 Chinese AI Training Corpus Engine — Weibo + RedNote + Bilibili + Douban + Xueqiu into AI-training-ready documents (MinHash dedup, quality scoring, PII scrub, EU AI Act provenance)
- 🆕 Chinese Brand Monitor — Cross-platform brand mention aggregator (Weibo + RedNote + Bilibili + Douban + Xueqiu, sentiment + dedup)
- Bilibili Scraper — China's video platform: danmaku, comments, Gen-Z creator analytics
- RedNote (Xiaohongshu) Scraper — China's Instagram + Pinterest (lifestyle, consumer reviews)
- RedNote Shop Scraper — RedShop e-commerce (products, vendors, prices)
- Douban Scraper — Long-form reviews, ratings, group discussions (movies/books/music)
- Xueqiu Scraper — Chinese stock-discussion sentiment, cashtag indexing (SH/SZ/HK/US-listed Chinese)
Streaming & Video
- Twitch Streamer & Channel Analytics — Twitch profiles, live streams, clips, and VODs
- Kick.com Streamer & Channel Analytics — Kick.com profiles, live streams, clips, and categories
- YouTube Shorts Scraper Pro — YouTube Shorts videos, creators, trends
- Letterboxd Scraper — Western film reviews and ratings
Markets & Alt-Data
- TradingView Multi-Market Scraper — Stocks, crypto, forex, indices
- Hyperliquid Pro Scraper — DeFi top traders, vaults, perpetual markets
- Booking.com Reviews Scraper — Hotel reviews and ratings
B2B Reviews
- G2 Reviews Scraper — B2B software reviews and ratings
- Capterra Reviews Scraper — Software product reviews and ratings
Other Tools
- Perplexity AI Scraper — AI-powered search results
- Tech Stack Detector — Detect technologies used by websites
- Telegram Channel Scraper — Public Telegram channel messages
- Phone Number Validator — Validate and format phone numbers
- Sneaker Price Tracker — Track sneaker prices across platforms
Support
Having issues? Open an issue on the Actor page — typically fixed within 48 hours.
Your Review Matters ⭐
Maintaining a working Weibo scraper is real, ongoing effort — most Weibo scrapers on Apify are broken or abandoned. If this one delivered the data you needed, a 30-second review makes a real difference:
- Go to the Weibo Scraper page
- Click the star rating near the top
- Optionally leave a one-line note about your use case (e.g. "pulled 5,000 posts for brand sentiment in minutes")
Why it matters: reviews are the #1 signal Apify users check before trying a scraper — a higher rating means more teams find this Actor instead of abandoned alternatives, which funds faster updates and new features for everyone.
Found a bug or missing field? Open an issue — typically fixed within 48 hours.
Last updated: June 2026 · Actively maintained · Trusted by AI training data teams, equity research desks, brand monitoring agencies, and academic NLP researchers.