Integrations/Integration / Node.js

Node.js Web Scraping API — fastCRW [Firecrawl-Compatible]

Scrape, crawl, map, and search from Node.js with fastCRW — a Firecrawl-compatible REST API backed by a single Rust binary. Use the official JS SDK or plain fetch. Concurrency-bounded batches, typed responses, AGPL-3.0 self-host free.

Published

June 13, 2026

Updated

June 13, 2026

Verdict

fastCRW is Firecrawl-compatible — the official @mendable/firecrawl-js SDK works against fastCRW by setting apiUrl to your instance URL. Every scrapeUrl, crawlUrl, mapUrl, and search call runs unchanged. Under the hood you get a single ~8 MB Rust binary instead of a multi-hundred-MB scraper container, and the highest truth-recall of three tools tested on Firecrawl's own public dataset: 63.74% of 819 labeled URLs (diagnose_3way.py, 2026-05-08).

Who This Is For

Node.js developers who want clean Markdown from the web — without parsing HTML yourself.
Teams already on Firecrawl JS — migrate with one apiUrl change, nothing else.
Backend engineers building enrichment APIs — expose a scrape endpoint in Express or Fastify.
AI pipeline builders — feed LLM-ready Markdown from scrapeUrl into embeddings or chat context.

Setup

1. Install

npm install @mendable/firecrawl-js
# or with bun:
bun add @mendable/firecrawl-js

2. Get an API key

export FASTCRW_API_KEY="fcrw_..."

The free tier ships 500 one-time lifetime credits. Plain scrape is 1 credit; crawl is 1 credit per page; search is 1 credit per query.

3. Create a singleton client

// src/client.ts
import FirecrawlApp from "@mendable/firecrawl-js";

export const app = new FirecrawlApp({
  apiKey: process.env.FASTCRW_API_KEY ?? "",
  apiUrl: process.env.FASTCRW_API_URL ?? "https://api.fastcrw.com",
  // For a local self-hosted engine: apiUrl: "http://localhost:3000"
});

Read apiUrl from an environment variable so dev (local Docker), staging, and production (managed cloud) differ by config, not code.

Quickstart: Scrape a Page

import { app } from "./client.js";

const doc = await app.scrapeUrl("https://example.com", {
  formats: ["markdown"],
  onlyMainContent: true,
});

if (!doc.success) {
  throw new Error(`scrape failed: ${doc.error}`);
}

console.log(doc.markdown);
console.log("title:", doc.metadata?.title);

Crawl a Site

import { app } from "./client.js";

const job = await app.crawlUrl("https://example.com", {
  limit: 25, // cap: 1000
  maxDepth: 2, // cap: 10
  scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
});

if (!job.success) throw new Error(job.error);

for (const page of job.data) {
  const url = page.metadata?.sourceURL;
  const words = (page.markdown ?? "").split(/\s+/).length;
  console.log(`${words}\t${url}`);
}

Crawl is always async and bounded by limit / maxDepth — set both explicitly to keep credit spend predictable.

Map Site URLs

import { app } from "./client.js";

const res = await app.mapUrl("https://example.com");
if (!res.success) throw new Error(res.error);

console.log(`found ${res.links.length} urls`);
console.log(res.links.slice(0, 10).join("\n"));

Search the Web

import { app } from "./client.js";

const res = await app.search("fastCRW nodejs scraping api 2026", { limit: 5 });
if (!res.success) throw new Error(res.error);

for (const r of res.data) {
  console.log(r.title, "→", r.url);
}

Structured JSON Extraction

Pass formats: ["json"] with a JSON Schema to extract typed records instead of prose:

import { app } from "./client.js";

const schema = {
  type: "object",
  properties: {
    productName: { type: "string" },
    priceUsd: { type: "number", description: "Current price in USD" },
    inStock: { type: "boolean" },
  },
  required: ["productName", "priceUsd"],
} as const;

const res = await app.extract(["https://example.com/products/widget"], {
  prompt: "Extract the product name, current price, and stock status.",
  schema,
});

console.log(JSON.stringify(res.data, null, 2));

Cost: formats: ["json"] / extract is billed as 1 scrape credit plus the managed-LLM token cost for that request, settled from real usage, versus 1 credit for plain markdown. No selectors — describe the shape, and fastCRW's managed LLM reads the page semantically. LLM extraction is available on paid plans (the FREE plan has no LLM features).

Concurrency-Bounded Batch Scraping

Promise.all(urls.map(...)) fires all requests simultaneously — a thundering herd that floods fastCRW and trips site rate limits. Use a bounded pool instead:

// A dependency-free bounded concurrency pool
async function pool<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, run));
  return results;
}

// Usage
import { app } from "./client.js";

const urls = [
  "https://example.com",
  "https://example.org",
  "https://docs.fastcrw.com",
];

const out = await pool(urls, 4, async (url) => {
  try {
    const d = await app.scrapeUrl(url, { formats: ["markdown"] });
    return { url, ok: d.success, chars: d.markdown?.length ?? 0 };
  } catch (e) {
    return { url, ok: false, error: String(e) };
  }
});

console.table(out);

Pick limit based on your downstream tolerance — 4–16 is sane for most workloads. Size it with evidence, not a round number: watch in-flight count and p90 latency, and raise the ceiling until error rate climbs.

Latency note: In fast mode on the 2026-05-08 benchmark (Firecrawl's public 1,000-URL dataset), fastCRW leads every percentile — p50 1361 ms, p90 4062 ms (ahead of Crawl4AI's 4754 ms and Firecrawl's 6937 ms), p99 5588 ms — while also returning the highest truth-recall of the three tools tested (63.74% of 819 labeled URLs, switchable higher in recall mode). Speed and accuracy are a single config flip. Full breakdown at /benchmarks/firecrawl-dataset.

Retry with Exponential Backoff

async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let delay = 1000;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === attempts) throw e;
      await new Promise((r) => setTimeout(r, delay));
      delay *= 2;
    }
  }
  throw new Error("unreachable");
}

// Wrap any scrapeUrl call:
const doc = await withRetry(() =>
  app.scrapeUrl(url, { formats: ["markdown"] }),
);

Retry only transient failures (429, 5xx, network errors) — never retry a 400 or 422, which will fail identically every time.

Express Scraping Endpoint

import express from "express";
import { app as crw } from "./client.js";

const server = express();
server.use(express.json());

server.post("/scrape", async (req, res) => {
  const { url } = req.body ?? {};
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    return res.status(400).json({ error: "valid url required" });
  }
  try {
    const doc = await withRetry(() =>
      crw.scrapeUrl(url, { formats: ["markdown"], onlyMainContent: true }),
    );
    if (!doc.success) {
      return res.status(502).json({ error: doc.error });
    }
    res.json({
      url: doc.metadata?.sourceURL ?? url,
      title: doc.metadata?.title,
      markdown: doc.markdown,
    });
  } catch (e) {
    res.status(500).json({ error: String(e) });
  }
});

server.listen(8080, () => console.log("scrape API on :8080"));

Pattern: validate at the edge, reuse one client singleton, wrap in withRetry, translate failures to honest HTTP status codes. The same handler runs against local Docker in dev and managed cloud in prod — flip one environment variable.

MCP Setup

fastCRW ships an MCP server (crw-mcp on npm) so AI agents in Claude Code, Cursor, or Windsurf can call scrape, crawl, map, and search as MCP tools:

{
  "mcpServers": {
    "fastcrw": {
      "command": "npx",
      "args": ["-y", "crw-mcp@latest"],
      "env": {
        "FASTCRW_API_KEY": "fcrw_...",
        "FASTCRW_API_URL": "https://api.fastcrw.com"
      }
    }
  }
}

See /integrations/mcp for full configuration options.

Good to Know

Screenshots — formats: ["screenshot"] is served by /v2/scrape.
Batch scraping — iterate /v1/scrape concurrently (pool helper above), use /v1/crawl for site-wide jobs, or call /v2/batch/scrape directly for a single-request batch.
Stateless per request — no session is carried across calls; multi-step authenticated flows must be modeled in your own code.
LLM extraction — powered by the managed LLM, available on paid plans (the FREE plan has no LLM features).

fastCRWlive

Scrape any URL, live

Get 500 free credits →

Sources

fastCRW scrape docs

/docs/scrape

fastCRW 3-way scrape benchmark

/benchmarks/firecrawl-dataset

fastCRW search docs

/docs/search

@mendable/firecrawl-js on npm

https://www.npmjs.com/package/@mendable/firecrawl-js

FAQ

Which npm package do I install for fastCRW in Node.js?

The official @mendable/firecrawl-js SDK. Construct it with apiUrl pointing at your fastCRW instance (https://api.fastcrw.com for managed cloud, or http://localhost:3000 for a self-hosted Docker run). scrapeUrl, crawlUrl, mapUrl, search, and extract all work unchanged because fastCRW is Firecrawl-API-compatible on the scrape, crawl, map, and search surface.

How do I batch many URLs without overloading the target?

Use a bounded concurrency pool. The pool() helper keeps a fixed number of in-flight requests (4–16 is sane for most workloads), applies natural backpressure (a slow page delays only its worker, not the whole batch), and bounds memory because at most limit responses are materialized at once. Unbounded Promise.all opens a thundering herd that trips rate limits and balloons memory.

What is the p50 latency for fastCRW scrape requests from Node?

In fast mode on Firecrawl's public 1,000-URL scrape-content dataset (diagnose_3way.py, 2026-05-08), fastCRW leads every latency percentile: p50 1361 ms (41% faster than Firecrawl's 2305 ms), p90 4062 ms (ahead of Crawl4AI's 4754 ms and Firecrawl's 6937 ms), and p99 5588 ms. Speed and accuracy are a single config flip — switch to recall mode for maximum correct content. Set a per-request timeout and size your pool accordingly.

Can I self-host the fastCRW engine and use it from Node?

Yes. Run docker run -p 3000:3000 ghcr.io/us/crw:latest and set apiUrl to http://localhost:3000. Self-hosting under AGPL-3.0 is free — you pay only your server costs. The client code is identical; switch between local and cloud by changing one environment variable.

How do I expose fastCRW scraping as an Express endpoint?

Create a singleton FirecrawlApp client, add an Express route that validates the incoming URL, calls app.scrapeUrl wrapped in a retry helper, and maps fastCRW errors to HTTP status codes (400 for bad input, 502 when the scrape itself failed, 500 for unexpected errors). See the Express example below.

Recommended next step

Run a live scrape before you commit.

Use the hosted demo to test scrape, crawl, or map output with fastCRW semantics.

Try Playground

Continue exploring

More from Integrations

View all integrations

Previous in Integrations

Make Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Next in Integrations

TypeScript Web Scraping API — fastCRW [Firecrawl-Compatible]

Integrations

MCP Web Scraping Integration — fastCRW [Firecrawl-Compatible]

fastCRW ships an official MCP server (crw-mcp) exposing scrape, search, crawl, map, and extract to any MCP-compatible client. Small single static binary, local-first, self-host free under AGPL-3.0.

mcp web scrapingOfficial crw-mcp server, single npx command

Integrations

Google ADK Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Wire fastCRW into Google's Agent Development Kit as a FunctionTool. Firecrawl-compatible scrape and search, small single static binary, local-first, self-host free under AGPL-3.0.

google adk web scrapingFunctionTool wrappers for fastCRW scrape and search

Integrations

OpenAI Agents SDK Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Give OpenAI Agents SDK agents a fastCRW scrape and search tool with the @function_tool decorator. Small single static binary, local-first, Firecrawl-compatible API, self-host free under AGPL-3.0.

openai agents sdk web scrapingNative function_tool wrappers for scrape and search

Related hubs

Keep the crawl path moving

Docs

Drop into endpoint reference once your integration is wired up.

Use Cases

See where this integration shape fits common AI-agent workloads.

Alternatives

Compare fastCRW against other scraping APIs your stack might consider.