👾 Lemmy Scraper - Federated Reddit Posts & Comments

Pricing

Pay per usage

Scrape Lemmy (the federated Reddit alternative) from any instance via the public API — no login needed. Get front-page or per-community posts, comments, keyword search, and community data. Clean JSON with scores, upvotes & comment counts.

Rating: 0.0 (0)

Developer: ben (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago


👾 Lemmy Scraper — Posts, Comments & Communities (Federated Reddit)

Extract Lemmy data — the open, federated Reddit alternative — from any instance (lemmy.world, lemmy.ml, sh.itjust.works, beehaw.org and more) through the public API. Pull front-page or per-community posts, comments, keyword search results, or community data as clean, structured JSON with Reddit-style scores, upvotes/downvotes and comment counts — no login required. Perfect for communities that left Reddit and for open-social research. Export to JSON/CSV/Excel, run on a schedule, call via API, or connect to Make, Zapier or n8n.

👾 What is the Lemmy Scraper?

It turns any Lemmy instance into a structured dataset. Point it at a server, pick a mode (front-page posts, posts from specific communities, comments, a keyword search, or a list of communities), set a sort order, and it returns the matching records, up to your maxItems cap, straight from Lemmy's public REST API. Query the whole federated network or just one instance, and reach cross-instance communities like technology@lemmy.world. Because it reads a JSON API rather than driving a headless browser, it's fast and cheap.
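To make the "public REST API" part concrete, here is a minimal sketch of how a post-list request URL could be assembled. It assumes Lemmy's v3 HTTP API (the `/api/v3/post/list` path and the `sort`, `type_`, `page`, `limit` and `community_name` parameters); the helper name `post_list_url` is hypothetical, and the Actor's own request logic is not published in this README, so it may differ. Newer Lemmy releases may expose a different API version.

```python
from urllib.parse import urlencode


def post_list_url(instance, sort="Hot", listing_type="All",
                  community=None, page=1, limit=50):
    """Build a Lemmy v3 post-list URL. No request is sent here."""
    # Assumption: v3 parameter names (type_, sort, page, limit, community_name).
    params = {"sort": sort, "type_": listing_type, "page": page, "limit": limit}
    if community:
        # Also accepts the cross-instance form, e.g. "technology@lemmy.world".
        params["community_name"] = community
    return f"https://{instance}/api/v3/post/list?{urlencode(params)}"


url = post_list_url("lemmy.world", sort="TopWeek", community="technology")
print(url)
```

Fetching that URL with any HTTP client should return JSON without credentials for public content, which is why no login is needed.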

What data does it extract?

  • Post title, body and the link URL it points to
  • Reddit-style metrics — score, upvotes, downvotes and comment count
  • Community info — name, title and community URL
  • Creator (author) name and profile URL
  • Publish date, NSFW flag and thumbnail image
  • Comments — content, score, parent post title and author (comments mode)
  • Community listings — description, subscribers, post/comment totals and monthly active users
  • Canonical post/comment URLs (ap_id), plus a scraped_at timestamp
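The fields listed above line up closely with the nested post objects that Lemmy's v3 API returns. The sketch below shows one plausible way to flatten such an object into a single row. The nested key names (`post`, `counts`, `community`, `creator`, `name`, `ap_id`, `actor_id`) are assumptions based on the v3 response shape, `flatten_post` is a hypothetical helper rather than the Actor's actual code, and the sample values are made up for illustration.

```python
from datetime import datetime, timezone


def flatten_post(view):
    """Map one v3-style post view (nested dicts) onto a flat output row."""
    post, counts = view["post"], view["counts"]
    community, creator = view["community"], view["creator"]
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "type": "post",
        "id": post["id"],
        "title": post["name"],              # Lemmy stores the title as "name"
        "body": post.get("body"),           # may be missing on link-only posts
        "link_url": post.get("url"),
        "post_url": post["ap_id"],          # canonical URL
        "published": post["published"],
        "nsfw": post["nsfw"],
        "score": counts["score"],
        "upvotes": counts["upvotes"],
        "downvotes": counts["downvotes"],
        "comments_count": counts["comments"],
        "community_name": community["name"],
        "community_title": community["title"],
        "community_url": community["actor_id"],
        "creator_name": creator["name"],
        "creator_url": creator["actor_id"],
        "thumbnail_url": post.get("thumbnail_url"),
        "scraped_at": now.replace("+00:00", "Z"),
    }


# Illustrative input only; not real data.
sample_view = {
    "post": {"id": 1, "name": "Hello fediverse", "ap_id": "https://lemmy.world/post/1",
             "published": "2026-01-01T00:00:00Z", "nsfw": False},
    "counts": {"score": 8, "upvotes": 10, "downvotes": 2, "comments": 3},
    "community": {"name": "technology", "title": "Technology",
                  "actor_id": "https://lemmy.world/c/technology"},
    "creator": {"name": "alice", "actor_id": "https://lemmy.world/u/alice"},
}
row = flatten_post(sample_view)
print(row["title"], row["score"], row["comments_count"])
```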

⬇️ Input

Choose an instance and a mode, then add communities, a query or a sort as needed:

Field               Description
mode                posts, community, search, comments or communities
instance            Lemmy server to query, e.g. lemmy.world, lemmy.ml, beehaw.org
communities         Community names, e.g. technology, asklemmy, or cross-instance technology@lemmy.world
query               Keyword to match (search mode searches posts; communities mode searches community names)
sort                Hot, Active, New, TopDay, TopWeek, TopMonth, TopYear, TopAll, MostComments
listingType         All (whole federated network) or Local (this instance only)
maxItems            Maximum number of records to return (1–50000)
proxyConfiguration  Optional Apify Proxy for IP rotation on large runs

Example input

{
  "mode": "community",
  "instance": "lemmy.world",
  "communities": ["technology", "asklemmy"],
  "sort": "TopWeek",
  "maxItems": 500
}
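If you generate inputs programmatically, a small builder can catch mistakes before a run starts. This is a sketch under assumptions: the field names, mode list and 1 to 50000 range come from the input table, but `build_input` is a hypothetical helper, the defaults shown are placeholders rather than the Actor's documented defaults, and the rule that community mode needs at least one community is inferred from the FAQ.

```python
import json

MODES = {"posts", "community", "search", "comments", "communities"}


def build_input(mode, instance, communities=None, query=None,
                sort="Hot", listing_type="All", max_items=100):
    """Validate arguments and return an input dict for the Actor."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 50000:
        raise ValueError("maxItems must be between 1 and 50000")
    if mode == "community" and not communities:
        # Assumption: community mode is only meaningful with communities set.
        raise ValueError("community mode needs at least one community")
    run_input = {
        "mode": mode,
        "instance": instance,
        "sort": sort,
        "listingType": listing_type,
        "maxItems": max_items,
    }
    if communities:
        run_input["communities"] = list(communities)
    if query:
        run_input["query"] = query
    return run_input


run_input = build_input("community", "lemmy.world",
                        communities=["technology", "asklemmy"],
                        sort="TopWeek", max_items=500)
print(json.dumps(run_input, indent=2))
```

The resulting dict can be pasted into the Apify console as JSON or passed as the run input when calling the Actor through the API.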

⬆️ Output

Every post (or comment/community) is one clean row — view it as a table, or export JSON / CSV / Excel:

{
  "type": "post",
  "id": 48685969,
  "title": "Self-hosting is easier than ever in 2026",
  "body": "Here's my setup...",
  "link_url": "https://example.com/article",
  "post_url": "https://lemmy.world/post/48685969",
  "published": "2026-06-26T09:00:00Z",
  "nsfw": false,
  "score": 842,
  "upvotes": 901,
  "downvotes": 59,
  "comments_count": 137,
  "community_name": "technology",
  "community_title": "Technology",
  "community_url": "https://lemmy.world/c/technology",
  "creator_name": "alice",
  "creator_url": "https://lemmy.world/u/alice",
  "thumbnail_url": "https://lemmy.world/pictrs/image/abc.jpg",
  "scraped_at": "2026-06-26T15:30:00.000Z"
}
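Once records are downloaded, they are ordinary dictionaries and easy to post-process. The sketch below uses only the standard library and a trimmed copy of the sample record above; `upvote_ratio` and `to_csv` are hypothetical helper names, not part of the Actor. In the sample, score equals upvotes minus downvotes, which the code relies on only as a sanity check.

```python
import csv
import io

# Trimmed copy of the sample record shown above.
records = [{
    "type": "post",
    "title": "Self-hosting is easier than ever in 2026",
    "score": 842,
    "upvotes": 901,
    "downvotes": 59,
    "comments_count": 137,
    "community_name": "technology",
}]


def upvote_ratio(record):
    """Share of votes that are upvotes; 0.0 when there are no votes."""
    total = record["upvotes"] + record["downvotes"]
    return record["upvotes"] / total if total else 0.0


def to_csv(rows, fields):
    """Serialize the chosen fields to CSV text, ignoring other keys."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


top = sorted(records, key=lambda r: r["score"], reverse=True)
csv_text = to_csv(top, ["title", "score", "comments_count"])
print(csv_text)
print(round(upvote_ratio(records[0]), 3))
```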

💡 Use cases

  • 👂 Community & topic monitoring: track discussions about a product, brand or topic across the fediverse.
  • 🔄 Reddit-migration research: follow the communities and audiences that moved off Reddit.
  • 📈 Trend & sentiment analysis: feed posts and comments straight into an LLM.
  • 🔥 Content discovery: surface the top posts by community and time window with one sort setting.

❓ FAQ

How do I scrape Lemmy posts? Set mode: posts for the front page, or mode: community with one or more communities, choose an instance and a sort, and run the Actor. Each post comes back with its title, body, link, scores and comment count.

Do I need an API key or login? No — public posts, comments and communities all work with no login, straight from Lemmy's public REST API.

Does it work on any instance, and is it federated? Yes. Point instance at any Lemmy server. With listingType: All it covers the federated network as seen from that server; with Local it stays on that one instance. lemmy.world is the largest starting point.

Can I scrape a community on another instance? Yes — use community@instance (e.g. technology@lemmy.world), since Lemmy is federated and resolves it for you.
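For pipelines that mix local and cross-instance names, it can help to normalize community references before building the input. This is a minimal sketch: `split_community` is a hypothetical helper, and stripping a leading "!" reflects the `!community@instance` shorthand commonly used in Lemmy posts, which the Actor itself may or may not accept.

```python
def split_community(ref, default_instance):
    """Split 'name@host' into (name, host); bare names use the default host."""
    name, sep, host = ref.strip().lstrip("!").partition("@")
    return name, (host if sep and host else default_instance)


for ref in ["technology", "technology@lemmy.world", "!asklemmy@lemmy.ml"]:
    print(ref, "->", split_community(ref, "beehaw.org"))
```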

Can I get comments, not just posts? Yes — mode: comments returns comments per community (via communities) or instance-wide, with content, score, the parent post title and the author.

How do I find communities to scrape? Use mode: communities with a query to search community names, or leave the query empty to list the instance's top communities with subscriber and activity counts.

How many records can it return? As many as your maxItems cap allows (maximum 50,000). It paginates automatically and, in community and comments modes, splits the cap across the communities you give it.
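The README does not say exactly how the cap is divided, so the following is only one plausible scheme: an even split with the remainder handed to the first communities, plus the number of pages each share would need. `split_cap`, `pages_needed` and the page size of 50 are assumptions for illustration.

```python
import math


def split_cap(max_items, communities):
    """Divide max_items evenly; earlier communities absorb the remainder."""
    base, extra = divmod(max_items, len(communities))
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(communities)}


def pages_needed(items, page_size=50):
    """Number of paginated requests needed to collect `items` records."""
    return math.ceil(items / page_size)


shares = split_cap(500, ["technology", "asklemmy"])
for name, share in shares.items():
    print(name, share, "items,", pages_needed(share), "pages")
```

With the example input (maxItems 500, two communities), this scheme would give each community 250 records, or five pages of 50.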

Can I run it on a schedule or via API? Yes — schedule recurring runs in Apify, call it via the API/SDK, or connect it to Make, Zapier or n8n.

Is scraping Lemmy legal? It reads publicly available data via Lemmy's own public API. Use it responsibly for research and monitoring, and follow applicable laws and each instance's terms.

Keywords: Lemmy scraper, Lemmy API, fediverse scraper, federated Reddit, Reddit alternative scraper, Lemmy posts, Lemmy comments, Lemmy communities, ActivityPub, lemmy.world scraper, social media scraper, social listening, sentiment analysis, open social data, Reddit migration.