Package Registry Scraper (npm + PyPI)

Get npm and PyPI package metadata - version, license, repo, keywords, and npm download counts. Search by keyword or look up exact names. No API key needed.

Developer & Research Tools

How it works

  1. Open it on Apify

     Hit Run on Apify. It opens the tool in the cloud, no install.

  2. Set the inputs

     Adjust registry, searchQuery, packageNames (sensible defaults are pre-filled).

  3. Click Run

     The tool runs on Apify’s cloud and collects the data for you.

  4. Export the results

     Download as JSON, CSV or Excel, or pipe straight into your app, Google Sheets, or an AI agent.

Inputs

Field | What it does | Type
--- | --- | ---
registry | Which package registry to use. npm supports both keyword search and exact-name lookup; PyPI supports exact-name lookup only (it has no clean public search API). | string
searchQuery | Keywords to search the npm registry for (e.g. "react state management"). npm only; ignored for PyPI. Leave empty if you are looking up exact package names inst… | string
packageNames | Exact package names to look up directly. Works for BOTH registries. For PyPI this is the only supported mode (e.g. ["requests", "fastapi"]). For npm, scoped nam… | array
includeDownloads | Fetch last-month download counts for each npm package via the npm downloads API. npm only; PyPI does not expose a public download-count endpoint. Adds one requ… | boolean
maxItems | Maximum number of packages to return from an npm search query. Only applies to npm search; ignored for exact-name lookups. | integer
notionConnector | Optional. Write each package as a page into your Notion when the run finishes. Authorize a Notion connector once in Settings → API & Integrations → MCP connecto… | string
notionParentId | Optional. The Notion data source ID of the database to write into (only used if a Notion connector is set). Leave empty to create the pages privately in your wo… | string

What you get

A structured dataset — each result includes fields like:

author, description, homepage, keywords, license, monthlyDownloads, name, registry, repository, score, url, version

Export every run as JSON, CSV or Excel, or send it to your app, a database, Google Sheets, or an AI agent.

2 ready-to-run use cases

npm Search Scraper: Downloads, License & Repo

Search the npm registry by keyword and compare each package's version, license, source repo, and monthly downloads side by side. Great for JS library research.

npm License Audit: Map package.json Deps

Feed your package.json dependencies and return the license and source repository for each npm package - a fast OSS compliance check for engineering teams.

Package Registry Scraper (npm + PyPI)

Pull clean, structured package metadata from the npm and PyPI public registries. No API key, no login, and no anti-bot measures to work around. Search npm by keyword, or look up exact packages by name on either registry, and get a single normalized shape back for both.

What you get per package

registry, name, version, description, author, homepage, repository (decoded to a browseable https URL), license, keywords, monthlyDownloads (npm), and url (the human-facing registry page).
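The listing does not say how the repository field is decoded, so the following is only a minimal sketch of what turning a git-style repository string into a browseable https URL could look like. The function name to_browsable_url and the set of handled forms (git+ prefixes, git:// and ssh:// schemes, scp-style git@host:path) are assumptions for illustration, not the tool's actual logic.

```python
import re
from typing import Optional


def to_browsable_url(repository: Optional[str]) -> Optional[str]:
    """Hypothetical helper: map a git-style repository string to an https URL."""
    if not repository:
        return None
    url = repository.strip()
    # Drop the "git+" prefix that often precedes the real scheme.
    if url.startswith("git+"):
        url = url[len("git+"):]
    # scp-style SSH form, e.g. git@github.com:user/repo.git
    scp = re.match(r"^git@([^:/]+):(.+)$", url)
    if scp:
        url = f"https://{scp.group(1)}/{scp.group(2)}"
    # git://, ssh://git@ and http:// are all rewritten to https://
    url = re.sub(r"^(?:git|ssh|http)://(?:git@)?", "https://", url)
    # Remove the trailing ".git" so the link opens the project page.
    if url.endswith(".git"):
        url = url[: -len(".git")]
    # Anything that still is not an https URL is treated as unknown.
    return url if url.startswith("https://") else None
```

Shorthand forms such as github:user/repo are deliberately left out of this sketch and come back as None.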

Nullable fields. Registries don't always populate every field, so the following can be null: description, author, homepage, repository, license (and keywords may be an empty array). monthlyDownloads is null when includeDownloads is off, for PyPI packages (no public download endpoint), or if the npm downloads API call fails for a specific package — in which case a warning is logged so you know why.
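When consuming the dataset, it helps to make the nullable fields explicit in a type. The sketch below assumes each exported row is a dict keyed by the field names listed above; the PackageRow class, row_from_dict, and downloads_label are hypothetical names introduced here, not part of the tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PackageRow:
    # Fields the listing does not describe as nullable.
    registry: str
    name: str
    version: str
    url: str
    # Fields the listing says can be null.
    description: Optional[str] = None
    author: Optional[str] = None
    homepage: Optional[str] = None
    repository: Optional[str] = None
    license: Optional[str] = None
    monthly_downloads: Optional[int] = None
    # May be an empty list rather than null.
    keywords: List[str] = field(default_factory=list)


def row_from_dict(raw: dict) -> PackageRow:
    """Build a PackageRow from one exported dataset row (assumed to be a dict)."""
    return PackageRow(
        registry=raw["registry"],
        name=raw["name"],
        version=raw["version"],
        url=raw["url"],
        description=raw.get("description"),
        author=raw.get("author"),
        homepage=raw.get("homepage"),
        repository=raw.get("repository"),
        license=raw.get("license"),
        monthly_downloads=raw.get("monthlyDownloads"),
        keywords=list(raw.get("keywords") or []),
    )


def downloads_label(row: PackageRow) -> str:
    """Render monthly downloads, showing 'n/a' when the value is null."""
    if row.monthly_downloads is None:
        return "n/a"
    return f"{row.monthly_downloads:,}"
```

Checking for None explicitly, rather than relying on truthiness, keeps a genuine count of 0 distinct from a missing value.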

Input

Field | Notes
--- | ---
registry | npm or pypi. Default npm.
searchQuery | Keyword search, npm only (PyPI has no clean public search API).
packageNames | Array of exact names; works for both registries (e.g. ["requests","fastapi"], or npm scoped ["@types/node"]).
includeDownloads | Fetch last-month download counts for npm packages. On by default. npm only.
maxItems | Cap on npm search results. Default 50.

You must provide either a searchQuery (npm) or one or more packageNames.
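A pre-flight check on the client side can catch a bad input before a run is started. This is a minimal sketch of the rule stated above, assuming the input is a plain dict with the documented field names; validate_input and its messages are hypothetical and may not match the diagnostics the tool itself emits.

```python
from typing import Optional


def validate_input(actor_input: dict) -> Optional[str]:
    """Return None when the input looks usable, otherwise a reason string."""
    registry = actor_input.get("registry", "npm")  # documented default is npm
    if registry not in ("npm", "pypi"):
        return "registry must be 'npm' or 'pypi'"

    query = (actor_input.get("searchQuery") or "").strip()
    names = [
        n for n in (actor_input.get("packageNames") or [])
        if isinstance(n, str) and n.strip()
    ]

    # Exact names work for both registries.
    if names:
        return None
    # Keyword search is npm only.
    if query and registry == "npm":
        return None
    if query and registry == "pypi":
        return "searchQuery is npm only; use packageNames with registry=pypi"
    return "provide a searchQuery (npm) or one or more packageNames"
```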

Registries

  • npm — keyword search via the registry search API, package detail via registry.npmjs.org/{pkg}, and monthly downloads via api.npmjs.org.
  • PyPI — exact-name lookup via pypi.org/pypi/{pkg}/json. PyPI has no clean public search API, so a searchQuery with registry=pypi returns a single diagnostic row telling you to use packageNames instead.
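For orientation, the endpoints named above can be built as plain URLs. The hosts and the pypi.org/pypi/{pkg}/json path come from the text; the /-/v1/search and /downloads/point/last-month/ paths are the publicly documented npm endpoints, and it is an assumption that the tool calls exactly these. The function names are hypothetical, and the sketch builds URLs only, without making any request.

```python
from urllib.parse import quote, urlencode

NPM_REGISTRY = "https://registry.npmjs.org"
NPM_DOWNLOADS = "https://api.npmjs.org"
PYPI = "https://pypi.org"


def npm_search_url(query: str, size: int = 50) -> str:
    """Keyword search; 'size' plays the role of maxItems (assumed mapping)."""
    return f"{NPM_REGISTRY}/-/v1/search?{urlencode({'text': query, 'size': size})}"


def npm_package_url(name: str) -> str:
    """Package detail; scoped names keep '@' but encode '/' as %2F."""
    return f"{NPM_REGISTRY}/{quote(name, safe='@')}"


def npm_downloads_url(name: str) -> str:
    """Last-month download count; scoped names are passed with a literal '/'."""
    return f"{NPM_DOWNLOADS}/downloads/point/last-month/{quote(name, safe='@/')}"


def pypi_package_url(name: str) -> str:
    """Exact-name lookup on PyPI."""
    return f"{PYPI}/pypi/{quote(name, safe='')}/json"
```

Each of these returns JSON when fetched, which is why the tool can normalize both registries into one row shape.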

Output

One dataset row per package, deduped by registry + name. Packages that can't be found, and empty searches, return a single diagnostic row (ok:false) and are not charged.
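If you merge the results of several runs, the same separation and deduplication can be applied downstream. The sketch below assumes rows are dicts, that diagnostic rows carry ok set to false, and that any row without ok set to false is a package row; whether successful rows include an ok field at all is not stated in the listing. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def split_diagnostics(rows: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Separate package rows from diagnostic rows (ok is False)."""
    packages = [r for r in rows if r.get("ok") is not False]
    diagnostics = [r for r in rows if r.get("ok") is False]
    return packages, diagnostics


def dedupe_rows(rows: List[Dict]) -> List[Dict]:
    """Keep the first row seen for each (registry, name) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row.get("registry"), row.get("name"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique
```

A typical use is to call split_diagnostics on the combined rows first, then dedupe_rows on the package rows only, so diagnostic rows without a name do not collapse into one another.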

Pricing

Pay-per-result: you are charged once per successfully returned package row. Diagnostic rows (ok:false) — bad input, not-found packages, empty searches, network/registry errors — are never charged.

Proxy

These are public, no-auth registries with no anti-bot protection, so no proxy is needed. Leave proxy off (the default). Only enable a proxy if you hit IP-based rate limits on very large runs.

Troubleshooting

  • Empty/bad-input run returns a BAD_INPUT diagnostic row: provide a searchQuery (npm) or one or more packageNames.
  • monthlyDownloads is null for some npm packages: the downloads API occasionally rate-limits or has no data for a package; the run logs a warning and continues with null for that field.
  • PyPI search isn't supported (PyPI has no clean public search API); use exact packageNames with registry=pypi.

Examples

Search npm:

{ "registry": "npm", "searchQuery": "react state management", "maxItems": 25, "includeDownloads": true }

Look up Python packages:

{ "registry": "pypi", "packageNames": ["requests", "fastapi"] }
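For the license-audit use case, the packageNames array can be generated from a package.json file rather than typed by hand. This is a minimal sketch, assuming the standard dependencies and devDependencies keys; input_from_package_json is a hypothetical helper, and turning includeDownloads off is a choice made here because a license check does not need download counts.

```python
import json


def input_from_package_json(text: str, include_dev: bool = True) -> dict:
    """Build a tool input dict from the contents of a package.json file."""
    manifest = json.loads(text)
    names = set(manifest.get("dependencies") or {})
    if include_dev:
        names |= set(manifest.get("devDependencies") or {})
    return {
        "registry": "npm",
        "packageNames": sorted(names),  # sorted for a stable, diffable input
        "includeDownloads": False,      # not needed for a license check
    }


if __name__ == "__main__":
    sample = '{"dependencies": {"react": "^18.0.0"}, "devDependencies": {"@types/node": "^20.0.0"}}'
    print(json.dumps(input_from_package_json(sample), indent=2))
```

Version ranges from package.json are dropped because the documented input takes exact package names only.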