Bluesky Scraper

Scrape Bluesky posts by keyword or handle. Get post text, URLs, likes, reposts, and full profiles as clean JSON. No login or API key needed.

How it works

  1. Open it on Apify

     Hit Run on Apify. It opens the tool in the cloud, with no install.

  2. Set the inputs

     Adjust searchQuery, authorHandles, and maxItems (sensible defaults are pre-filled).

  3. Click Run

     The tool runs on Apify’s cloud and collects the data for you.

  4. Export the results

     Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.

Inputs

| Field | What it does | Type |
| --- | --- | --- |
| searchQuery | Keyword(s) to search Bluesky posts for (e.g. "artificial intelligence"). Leave empty if you instead want to scrape specific authors via Author handles. | string |
| authorHandles | Bluesky handles to scrape (e.g. bsky.app, jay.bsky.team). For each handle the actor returns the author's profile plus their recent posts. The leading @ is optional. | array |
| maxItems | Maximum number of posts to return per search query or per author handle. Pagination follows the API cursor until this limit is reached. | integer |
| notionConnector | Optional. Write each post as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connectors, … | string |
| notionParentId | Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your wo… | string |

What you get

A structured dataset — each result includes fields like:

authorHandles, details, searchQuery

Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.

3 ready-to-run use cases

Bluesky Brand Monitoring: Track Mentions & Engagement

Social teams can see who's posting about a brand on Bluesky, with each post's author, like count, and repost count for real-time mention tracking.

Scrape Multiple Bluesky Accounts: Posts + Profiles

Feed a list of Bluesky handles and export every account's recent posts and profile into one dataset, ready for a competitor or creator roundup.

Bluesky Dataset by Keyword for Sentiment & NLP

Researchers collect thousands of keyword-matched Bluesky posts as a clean dataset for sentiment analysis, text labeling, and NLP model training.

Bluesky Scraper

Scrape Bluesky through its public AT Protocol XRPC API: no login, no API key, and no anti-bot protection to work around. Two modes:

  • Search — pass a searchQuery keyword and get matching public posts.
  • Authors — pass authorHandles (e.g. bsky.app) and get each author's profile (followers, bio, post count) plus their recent posts.

Input

| Field | Type | Description |
| --- | --- | --- |
| searchQuery | string | Keyword(s) to search posts for. Use this or authorHandles. |
| authorHandles | array of strings | Handles to scrape (the leading @ is optional). For each, returns the profile + recent posts. |
| maxItems | integer | Max posts per query / per author (default 100). Follows the API cursor until reached. |
| proxyConfiguration | object | Optional. The Bluesky public API has no anti-bot and needs no proxy, so this is off by default. Only enable it if you hit IP rate limits. |

Provide at least one of searchQuery or authorHandles.
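The input rules above can be checked before a run is started. The following is a minimal sketch, not the actor's own validation: the helper name build_input is hypothetical, and rejecting a maxItems below 1 is an assumption. The field names and the default of 100 come from the table.

```python
from typing import Any, Dict, List, Optional


def build_input(
    search_query: str = "",
    author_handles: Optional[List[str]] = None,
    max_items: int = 100,
) -> Dict[str, Any]:
    """Build a run input dict using the field names from the input table."""
    query = search_query.strip()
    # The leading @ is optional, so strip it and drop empty entries.
    handles = [h.strip().lstrip("@") for h in (author_handles or [])]
    handles = [h for h in handles if h]

    if not query and not handles:
        raise ValueError("Provide at least one of searchQuery or authorHandles")
    if max_items < 1:
        # Assumption: the README does not say how the actor treats values below 1.
        raise ValueError("maxItems must be a positive integer")

    run_input: Dict[str, Any] = {"maxItems": max_items}
    if query:
        run_input["searchQuery"] = query
    if handles:
        run_input["authorHandles"] = handles
    return run_input
```

Calling build_input(author_handles=["@bsky.app"]) yields an author-mode input, while an empty call raises before any run is started.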

Output

Each post row:

{
  "ok": true,
  "type": "post",
  "uri": "at://did:plc:.../app.bsky.feed.post/3kxyz...",
  "postUrl": "https://bsky.app/profile/bsky.app/post/3kxyz...",
  "authorHandle": "bsky.app",
  "authorName": "Bluesky",
  "authorDid": "did:plc:...",
  "text": "…",
  "createdAt": "2024-01-01T00:00:00.000Z",
  "likeCount": 0,
  "repostCount": 0,
  "replyCount": 0,
  "quoteCount": 0,
  "langs": ["en"]
}

In author mode a profile row (type: "profile") is also emitted per handle, with did, handle, displayName, description, followersCount, followsCount, postsCount, avatar, banner, createdAt, and profileUrl.

Posts are deduplicated by uri. The rkey used in postUrl is the last path segment of the post uri.
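As an illustration of those two rules, here is a small sketch of how a consumer could reproduce them on exported rows. The function names are hypothetical, and returning None when no handle is available is an assumption made to match the nullable postUrl field, not a confirmed detail of the actor.

```python
from typing import Any, Dict, Iterable, List, Optional


def rkey_from_uri(uri: str) -> str:
    """Return the record key, i.e. the last path segment of a post uri."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def build_post_url(author_handle: Optional[str], uri: str) -> Optional[str]:
    """Rebuild a bsky.app link; returns None when no handle is available."""
    if not author_handle:
        return None
    return f"https://bsky.app/profile/{author_handle}/post/{rkey_from_uri(uri)}"


def dedupe_by_uri(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first row seen for each uri, preserving input order."""
    seen = set()
    unique = []
    for row in rows:
        if row["uri"] in seen:
            continue
        seen.add(row["uri"])
        unique.append(row)
    return unique
```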

Nullable fields. Some fields can be null when the API omits them:

  • Posts: postUrl, authorHandle, authorName, authorDid, and createdAt. Counts default to 0, text to "", and langs to [].
  • Profiles: handle, displayName, description, avatar, banner, createdAt, and profileUrl. Counts default to 0.
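Downstream code should therefore treat those fields as optional. The sketch below shows one way to load a post row defensively. The Post dataclass and parse_post are hypothetical names, only a subset of the fields is modeled, and the extra "or" fallbacks are a belt-and-braces assumption in case a default is ever missing.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Post:
    uri: str
    post_url: Optional[str] = None       # nullable per the README
    author_handle: Optional[str] = None  # nullable per the README
    created_at: Optional[str] = None     # nullable; kept as the raw ISO string
    text: str = ""
    like_count: int = 0
    repost_count: int = 0
    langs: List[str] = field(default_factory=list)


def parse_post(row: Dict[str, Any]) -> Post:
    """Map a post row to a Post, tolerating null or missing optional fields."""
    return Post(
        uri=row["uri"],
        post_url=row.get("postUrl"),
        author_handle=row.get("authorHandle"),
        created_at=row.get("createdAt"),
        text=row.get("text") or "",
        like_count=int(row.get("likeCount") or 0),
        repost_count=int(row.get("repostCount") or 0),
        langs=list(row.get("langs") or []),
    )
```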

Diagnostics

If the run fails or returns nothing, a single ok:false row is pushed with an errorCode (BAD_INPUT, NO_RESULTS, RATE_LIMITED, SERVER_ERROR, NETWORK, …) and a human-readable error message. Diagnostic rows are never charged.

Troubleshooting.

  • BAD_INPUT: set searchQuery to a keyword or add at least one handle to authorHandles.
  • NO_RESULTS: the API answered but had nothing for that query or author. Bluesky's public index is smaller and sparser than Twitter/X, so broad keywords may return few posts.
  • RATE_LIMITED: if this appears when many runs execute in parallel, enable the optional proxy or lower the volume.
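A consumer can separate diagnostic rows from data rows before processing a dataset. This is a rough sketch under two assumptions: that only diagnostic rows carry an explicit ok of false, and that the human-readable message lives in a field named error (the README does not name that field). The function names and hint wording are illustrative.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Suggested actions, paraphrased from the troubleshooting notes.
HINTS = {
    "BAD_INPUT": "Set searchQuery or add at least one handle to authorHandles.",
    "NO_RESULTS": "The API had nothing for this query or author; try another keyword.",
    "RATE_LIMITED": "Enable the optional proxy or lower the volume.",
}


def split_rows(
    rows: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Return (data_rows, diagnostic_rows); a diagnostic row has ok set to False."""
    data, diagnostics = [], []
    for row in rows:
        if row.get("ok") is False:
            diagnostics.append(row)
        else:
            data.append(row)
    return data, diagnostics


def hint_for(diagnostic: Dict[str, Any]) -> str:
    """Pick a suggested action, falling back to the row's own message."""
    code = diagnostic.get("errorCode")
    # Assumption: "error" is a guessed field name for the readable message.
    return HINTS.get(code) or diagnostic.get("error") or f"Unhandled error: {code}"
```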

Billing

Charged per unique post returned (post event). Profile rows and diagnostic rows are not charged.
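To estimate how many post events a dataset corresponds to, one could count unique post rows and ignore everything else. This is only a sketch of the rule as stated here, with a hypothetical function name; the actual charge is determined by the platform, and no price is assumed.

```python
from typing import Any, Dict, Iterable


def count_charged_posts(rows: Iterable[Dict[str, Any]]) -> int:
    """Count unique post uris, skipping profile rows and ok:false diagnostics."""
    uris = {
        row["uri"]
        for row in rows
        if row.get("ok") is not False
        and row.get("type") == "post"
        and row.get("uri")
    }
    return len(uris)
```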

API

Built on the public host https://api.bsky.app:

  • app.bsky.feed.searchPosts
  • app.bsky.feed.getAuthorFeed
  • app.bsky.actor.getProfile

All are public, cursor-paginated GET/JSON endpoints. We hit api.bsky.app directly rather than the documented public.api.bsky.app alias: the alias is fronted by BunnyCDN, which intermittently returns 403 for searchPosts in some regions, whereas api.bsky.app is the same public AppView served directly and is more reliable.
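For readers who want to see what cursor pagination against this host looks like, here is a minimal standard-library sketch for searchPosts. It is not the actor's source code: the function names are hypothetical, the q, limit, and cursor parameters and the posts and cursor response keys follow the published lexicon for this endpoint as far as I know, and the page size of 100 is assumed to be the endpoint's maximum. The page fetcher is injectable so the loop can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, List, Optional

API_HOST = "https://api.bsky.app"

PageFetcher = Callable[[str, Optional[str], int], Dict[str, Any]]


def fetch_search_page(query: str, cursor: Optional[str], limit: int) -> Dict[str, Any]:
    """Fetch one page of app.bsky.feed.searchPosts as parsed JSON."""
    params = {"q": query, "limit": str(limit)}
    if cursor:
        params["cursor"] = cursor
    url = f"{API_HOST}/xrpc/app.bsky.feed.searchPosts?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def collect_posts(
    query: str,
    max_items: int,
    fetch_page: PageFetcher = fetch_search_page,
    page_size: int = 100,
) -> List[Dict[str, Any]]:
    """Follow the cursor until max_items posts are collected or pages run out."""
    posts: List[Dict[str, Any]] = []
    cursor: Optional[str] = None
    while len(posts) < max_items:
        limit = min(page_size, max_items - len(posts))
        page = fetch_page(query, cursor, limit)
        batch = page.get("posts", [])
        posts.extend(batch)
        cursor = page.get("cursor")
        # Stop when the API returns no cursor or an empty page.
        if not cursor or not batch:
            break
    return posts[:max_items]
```

The same loop shape would apply to getAuthorFeed, where the request takes an actor parameter and the items arrive under a feed key instead of posts.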