Bluesky Starter Pack Scraper

Pricing

Pay per event

Export full member lists from any Bluesky Starter Pack via the AT Protocol API — pack metadata and member profiles with follower counts — to JSON or CSV. No login needed; we page through every member so none are missed.

Developer

DevilScrapes

Maintained by Community

15 days ago

Last modified

🎯 What this scrapes

Bluesky Starter Packs are curated community lists that drove 43% of all follows during Bluesky's 2024 growth surge (EurekAlert, 2024). This Bluesky starter pack scraper reaches the AT Protocol AppView API (https://public.api.bsky.app/xrpc/) to export every member of any starter pack — by URI or by creator handle. We do the heavy lifting against three chained endpoints; you get one clean denormalized dataset.

Target endpoints:

  • app.bsky.graph.getStarterPack — single pack metadata and embedded list URI
  • app.bsky.graph.getActorStarterPacks — all packs published by a creator, with cursor pagination
  • app.bsky.graph.getList — member profiles with follower and post counts
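The cursor pagination these endpoints rely on can be sketched as below. This is a minimal illustration, not the Actor's actual code: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the page shape (an `items` array plus an optional `cursor`) is assumed from the public lexicon for app.bsky.graph.getList. The fake two-page API exists only to exercise the loop.

```python
from typing import Callable, Iterator, Optional

# Assumed page shape: {"items": [...], "cursor": "..."}; no cursor on the last page.
Page = dict


def iter_members(
    fetch_page: Callable[[Optional[str]], Page], max_members: int
) -> Iterator[dict]:
    """Yield list items page by page until the cursor runs out or the cap is hit."""
    emitted = 0
    cursor: Optional[str] = None
    while emitted < max_members:
        page = fetch_page(cursor)
        for item in page.get("items", []):
            if emitted >= max_members:
                return
            yield item
            emitted += 1
        cursor = page.get("cursor")
        if not cursor:
            return  # last page reached


# Hypothetical stand-in for the API: two pages, three members in total.
_FAKE_PAGES = {
    None: {"items": [{"n": 1}, {"n": 2}], "cursor": "p2"},
    "p2": {"items": [{"n": 3}]},
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return _FAKE_PAGES[cursor]
```

Passing the fetcher in as a callable keeps the loop independent of the HTTP client, so the stop conditions (cap reached, cursor exhausted) can be checked without touching the network.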

🔥 Features

  • Two collection modes — scrape a single pack by URI/URL, or bulk-export every pack owned by a creator handle up to 100 packs per run.
  • Denormalized output — every row carries pack metadata and the member profile in one flat record. No joins needed when you drop it into a spreadsheet or CRM.
  • Bluesky web URLs accepted — paste https://bsky.app/starter-pack/handle/rkey directly; we normalize it to AT URI form.
  • Follower counts on every row: member_followers_count, member_following_count, and member_posts_count are denormalized onto each member row.
  • Configurable member cap — 1–5,000 members per pack; keep test runs cheap.
  • PPE pricing — you pay only for rows emitted, not for idle compute.
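The web-URL normalization mentioned in the feature list could look roughly like this. It is a sketch under assumptions: the function name and regex are hypothetical, and the handle is left in place as the AT URI authority. A real client may additionally need to resolve that handle to a DID before calling the API, which is not shown here.

```python
import re

STARTER_PACK_COLLECTION = "app.bsky.graph.starterpack"

# Matches https://bsky.app/starter-pack/<handle-or-did>/<rkey>
_WEB_URL = re.compile(r"^https://bsky\.app/starter-pack/([^/?#]+)/([^/?#]+)/?$")


def normalize_pack_ref(ref: str) -> str:
    """Return an AT URI for either an at:// URI or a bsky.app starter-pack URL."""
    ref = ref.strip()
    if ref.startswith("at://"):
        return ref  # already in AT URI form
    match = _WEB_URL.match(ref)
    if match is None:
        raise ValueError(f"not a starter pack URI or URL: {ref!r}")
    authority, rkey = match.groups()
    return f"at://{authority}/{STARTER_PACK_COLLECTION}/{rkey}"
```

Rejecting anything that is neither form up front means a malformed reference fails before any request is attempted.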

💡 Use cases

  • B2B growth marketing — identify niche audiences (e.g. "AI researchers on Bluesky") and build targeted outreach lists from curated community packs.
  • Bluesky audience research — map which accounts are recommended by influential pack curators in your industry before you invest in the platform.
  • Bluesky followers list export — enumerate the full member graph of a peer group and compare follower counts, post activity, and engagement ratios.
  • Academic research — map community topology and influencer networks; starter packs are the AT Protocol's primary community-formation mechanism.
  • Competitive intelligence — track which handles appear across multiple packs in your niche.
  • Journalism and OSINT — document community formation around a topic or event.

⚙️ How to use it

Single-pack mode — paste the pack URI or web URL:

{
    "starterPackUri": "at://did:plc:z72i7hdynmk6r22z3wouymf/app.bsky.graph.starterpack/ohX7HZkOlFj",
    "maxMembersPerPack": 500,
    "proxyConfiguration": {"useApifyProxy": true}
}

Or use a Bluesky web URL directly:

{
    "starterPackUri": "https://bsky.app/starter-pack/alice.bsky.social/abc123rkey",
    "maxMembersPerPack": 500,
    "proxyConfiguration": {"useApifyProxy": true}
}

Creator mode — scrape every pack owned by a Bluesky handle:

{
    "creatorHandle": "pfrazee.com",
    "maxPacks": 10,
    "maxMembersPerPack": 200,
    "proxyConfiguration": {"useApifyProxy": true}
}

Exactly one of starterPackUri or creatorHandle must be set — providing both or neither returns a validation error before any network call is made.
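The exactly-one rule can be expressed as a small pre-flight check. The sketch below is illustrative only: `validate_input` is a hypothetical helper, not the Actor's real validator, and the defaults and ranges it checks are the ones documented for maxPacks and maxMembersPerPack in the Input section.

```python
from typing import Any, Mapping


def validate_input(actor_input: Mapping[str, Any]) -> str:
    """Return "single" or "creator"; raise ValueError on an invalid combination."""
    has_pack = bool(actor_input.get("starterPackUri"))
    has_creator = bool(actor_input.get("creatorHandle"))
    # Both set or both missing are equally invalid.
    if has_pack == has_creator:
        raise ValueError("Set exactly one of starterPackUri or creatorHandle.")

    max_members = actor_input.get("maxMembersPerPack", 500)
    if not 1 <= max_members <= 5000:
        raise ValueError("maxMembersPerPack must be between 1 and 5000.")

    # maxPacks only matters in creator mode.
    if has_creator and not 1 <= actor_input.get("maxPacks", 10) <= 100:
        raise ValueError("maxPacks must be between 1 and 100.")

    return "single" if has_pack else "creator"
```

Running a check like this before the first request is what allows a bad input to fail without spending any network calls.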

📥 Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| starterPackUri | string | one-of | | AT URI (at://...) or Bluesky web URL (https://bsky.app/starter-pack/...) of a single pack to scrape. Mutually exclusive with creatorHandle. |
| creatorHandle | string | one-of | | Bluesky handle (e.g. pfrazee.com) or DID of a creator. Exports every pack owned by that user. Mutually exclusive with starterPackUri. |
| maxPacks | integer | no | 10 | Max packs to process in creator mode (1–100). Ignored in single-pack mode. |
| maxMembersPerPack | integer | no | 500 | Max member rows emitted per pack (1–5,000). Pagination stops when the limit is reached. |
| proxyConfiguration | object | no | Apify Proxy | Apify Proxy configuration. We route requests through Apify Proxy to sustain throughput and rotate exit IPs when the API rate-limits us. |

📤 Output

One row per pack member. Pack metadata is denormalized into every row so a single CSV or JSON export is self-contained.

| Field | Type | Nullable | Description |
| --- | --- | --- | --- |
| pack_uri | string | no | AT URI of the starter pack |
| pack_name | string | no | Display title of the pack |
| pack_description | string | yes | Pack description, if set |
| pack_creator_handle | string | no | Bluesky handle of the pack creator |
| member_did | string | no | Decentralized identifier of the member |
| member_handle | string | no | Bluesky handle of the member |
| member_display_name | string | yes | Display name (may be blank) |
| member_followers_count | integer | yes | Follower count |
| member_following_count | integer | yes | Following count |
| member_posts_count | integer | yes | Post count |
| member_indexed_at | string | yes | ISO 8601 datetime the member was indexed |
| scraped_at | string | no | ISO 8601 UTC datetime this row was written |

Example record:

{
    "pack_uri": "at://did:plc:abc123/app.bsky.graph.starterpack/xyz789",
    "pack_name": "AI Researchers on Bluesky",
    "pack_description": "Curated list of ML/AI researchers who migrated from Twitter.",
    "pack_creator_handle": "alice.bsky.social",
    "member_did": "did:plc:def456",
    "member_handle": "bob.bsky.social",
    "member_display_name": "Bob Smith",
    "member_followers_count": 1204,
    "member_following_count": 380,
    "member_posts_count": 841,
    "member_indexed_at": "2024-11-14T09:22:01.000Z",
    "scraped_at": "2026-05-16T12:00:00.000Z"
}
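To make the denormalization concrete, here is one way a flat row like the one above might be assembled from a pack object and a member profile. Treat it as a sketch: `build_row` is a hypothetical helper, and the camelCase input keys (`record.name`, `displayName`, `followersCount`, `followsCount`, `postsCount`, `indexedAt`) are assumptions about the API response shape rather than something this page documents. Missing optional fields become null instead of raising.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def build_row(
    pack: Mapping[str, Any],
    member: Mapping[str, Any],
    now: Optional[datetime] = None,
) -> dict:
    """Merge pack metadata and one member profile into a single flat record."""
    # Expects a timezone-aware datetime; defaults to the current UTC time.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = pack.get("record", {})
    return {
        "pack_uri": pack["uri"],
        "pack_name": record["name"],
        "pack_description": record.get("description"),
        "pack_creator_handle": pack["creator"]["handle"],
        "member_did": member["did"],
        "member_handle": member["handle"],
        "member_display_name": member.get("displayName"),
        # Assumed count keys; absent values are emitted as None (null).
        "member_followers_count": member.get("followersCount"),
        "member_following_count": member.get("followsCount"),
        "member_posts_count": member.get("postsCount"),
        "member_indexed_at": member.get("indexedAt"),
        "scraped_at": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
    }
```

Required fields are read with plain indexing so a malformed response fails loudly, while nullable columns use `.get()` so a sparse profile still produces a row.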

💰 Pricing

This Actor uses Pay-Per-Event (PPE) pricing — you pay only for what you use, not for idle compute time.

| Event | Price | When |
| --- | --- | --- |
| Actor start | $0.05 | Once per run at boot |
| Member row emitted | $0.002 | Per row written to the dataset |

Effective cost: ~$2.05 per 1,000 members scraped ($0.05 start + $0.002 × 1,000 rows).

At 5,000 members (maximum per pack): ~$10.05.
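The same arithmetic can be scripted to budget a run in advance. A minimal sketch using only the two prices listed above; `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.05")  # charged once per run
PRICE_PER_ROW = Decimal("0.002")   # charged per member row emitted


def estimate_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated USD cost for `rows` member rows spread over `runs` runs."""
    if rows < 0 or runs < 1:
        raise ValueError("rows must be >= 0 and runs must be >= 1")
    return ACTOR_START_FEE * runs + PRICE_PER_ROW * rows
```

For instance, a creator-mode run over 10 packs of 200 members each emits 2,000 rows, which this estimate prices at $4.05 because the start fee is charged once per run rather than once per pack.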

🚧 What we handle for you

The AT Protocol AppView is a live, rate-limited, paginated API — not a static file. Here is what this Actor absorbs so you don't have to:

  • Fingerprint rotation — we cycle browser TLS profiles (Chrome/Firefox/Safari impersonation via curl-cffi) so every request presents a real-browser handshake, not a plain Python script.
  • Proxy rotation — we route through Apify Proxy and request a fresh session on every block or rate-limit response, rotating the exit IP automatically.
  • Retries with exponential backoff — on 408 / 429 / 503 and transient network errors we back off (2 s → 4 s → 8 s → 16 s → 30 s cap, 5 attempts) and honour Retry-After headers when present.
  • Rate-limit pacing — cursor pagination across getActorStarterPacks + getList is throttled to stay within the API's sustained throughput; we never hammer it.
  • Clean typed rows — every row is Pydantic-validated before it lands in the dataset. ISO 8601 timestamps, stable DID fields, null-safe nullable columns. Dirty rows never reach your export.
  • Pay-per-result — the actor-start warmup fee is $0.05; beyond that you pay only for rows that successfully land.
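The backoff schedule in the list above (2 s doubling to a 30 s cap, with Retry-After taking precedence) can be written as a small pure function. This is a sketch, not the Actor's implementation: the names are hypothetical, the mapping of attempt numbers to delays is an assumption, and only the numeric (seconds) form of Retry-After is handled.

```python
from typing import Optional

BASE_DELAY_S = 2.0
MAX_DELAY_S = 30.0
RETRYABLE_STATUS = {408, 429, 503}


def is_retryable(status: int) -> bool:
    """True for the status codes listed above as retry triggers."""
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, retry_after: Optional[str] = None) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Server-provided delay wins over the computed schedule.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch
    return min(BASE_DELAY_S * 2 ** (attempt - 1), MAX_DELAY_S)
```

Keeping the delay calculation separate from the sleep and the HTTP call makes the schedule easy to verify on its own.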

❓ FAQ

Do I need a Bluesky account to use this? No. The AT Protocol public AppView API (https://public.api.bsky.app/xrpc/) supports unauthenticated access, so no login, API key, or OAuth setup is required from you.

Can I scrape multiple packs in one run? Yes — use creator mode (creatorHandle) to export every pack published by a Bluesky user, up to 100 packs per run.

How do I find a pack's AT URI? Paste the Bluesky web URL (https://bsky.app/starter-pack/<handle>/<rkey>) directly into starterPackUri. We normalize it to AT URI form automatically.

Why are some follower counts null? The AT Protocol API returns profile viewer stats inline in list member objects. If a profile is new or those fields are absent from the API response, we emit the row with those fields set to null rather than dropping the row — so your row count stays accurate.

Is this a Bluesky followers list for any user? No. This Actor is scoped to starter pack membership: it exports who is in a pack, not the full follower graph of an arbitrary user. For Bluesky feed and post data, see our companion Bluesky Feed Posts Actor.

Is scraping Bluesky starter packs compliant with the Terms of Service? Yes. The AT Protocol is an open protocol explicitly designed for interoperability and data portability. Bluesky has publicly proposed a scraping standard for AI training datasets (Slashdot, 2025). The public AppView API is the official mechanism for unauthenticated access to pack data.

What happens if I provide both starterPackUri and creatorHandle? The Actor raises a validation error before making any network call and exits with a non-zero status and a clear error message.

Can I use this for Bluesky audience research before investing in the platform? Yes, that is a core use case. Export a pack curated by someone active in your niche, analyse follower counts and post cadence, and decide whether the audience is worth engaging before you build a presence.

💬 Your feedback

Found a bug, a broken pack URI, or a missing field? Open an issue or contact DevilScrapes at https://apify.com/DevilScrapes. Feature requests and star ratings on the Apify Store help us prioritise the roadmap.