Skip to main content
Integrations/Integration / Node.js

Node.js Web Scraping API — fastCRW [Firecrawl-Compatible]

Scrape, crawl, map, and search from Node.js with fastCRW — a Firecrawl-compatible REST API backed by a single Rust binary. Use the official JS SDK or plain fetch. Concurrency-bounded batches, typed responses, AGPL-3.0 self-host free.

Published
June 13, 2026
Updated
June 13, 2026
Category
integrations
Verdict

Call fastCRW from Node.js with the official Firecrawl JS SDK — point apiUrl at your instance and every scrapeUrl, crawlUrl, mapUrl, and search call works unchanged. Add a bounded concurrency pool and exponential backoff for production batch jobs.

Official @mendable/firecrawl-js SDK works via apiUrl override — no new SDK to learnscrapeUrl / crawlUrl / mapUrl / search / extract all supportedConcurrency-bounded batch pool in ~15 lines — no external dependencies63.74% truth-recall on Firecrawl''s public 1,000-URL dataset (diagnose_3way.py, 2026-05-08) — highest of three tools testedSingle ~8 MB Rust binary — one container, no Redis, no Node.js in the engine

Verdict

fastCRW is Firecrawl-compatible — the official @mendable/firecrawl-js SDK works against fastCRW by setting apiUrl to your instance URL. Every scrapeUrl, crawlUrl, mapUrl, and search call runs unchanged. Under the hood you get a single ~8 MB Rust binary instead of a multi-hundred-MB scraper container, and the highest truth-recall of three tools tested on Firecrawl's own public dataset: 63.74% of 819 labeled URLs (diagnose_3way.py, 2026-05-08).

Who This Is For

  • Node.js developers who want clean Markdown from the web — without parsing HTML yourself.
  • Teams already on Firecrawl JS — migrate with one apiUrl change, nothing else.
  • Backend engineers building enrichment APIs — expose a scrape endpoint in Express or Fastify.
  • AI pipeline builders — feed LLM-ready Markdown from scrapeUrl into embeddings or chat context.

Setup

1. Install

npm install @mendable/firecrawl-js
# or with bun:
bun add @mendable/firecrawl-js

2. Get an API key

Sign up at fastcrw.com, copy the API key from the dashboard, and export it:

export FASTCRW_API_KEY="fcrw_..."

The fastCRW pricing ships 500 one-time lifetime credits. Plain scrape is 1 credit; crawl is 1 credit per page; search is 1 credit per query.

3. Create a singleton client

// src/client.ts
import FirecrawlApp from "@mendable/firecrawl-js";

export const app = new FirecrawlApp({
  apiKey: process.env.FASTCRW_API_KEY ?? "",
  apiUrl: process.env.FASTCRW_API_URL ?? "https://api.fastcrw.com",
  // For a local self-hosted engine: apiUrl: "http://localhost:3000"
});

Read apiUrl from an environment variable so dev (local Docker), staging, and production (managed cloud) differ by config, not code.

Quickstart: Scrape a Page

import { app } from "./client.js";

const doc = await app.scrapeUrl("https://example.com", {
  formats: ["markdown"],
  onlyMainContent: true,
});

if (!doc.success) {
  throw new Error(`scrape failed: ${doc.error}`);
}

console.log(doc.markdown);
console.log("title:", doc.metadata?.title);

Crawl a Site

import { app } from "./client.js";

const job = await app.crawlUrl("https://example.com", {
  limit: 25,       // cap: 1000
  maxDepth: 2,     // cap: 10
  scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
});

if (!job.success) throw new Error(job.error);

for (const page of job.data) {
  const url = page.metadata?.sourceURL;
  const words = (page.markdown ?? "").split(/\s+/).length;
  console.log(`${words}\t${url}`);
}

Crawl is always async and bounded by limit / maxDepth — set both explicitly to keep credit spend predictable.

Map Site URLs

import { app } from "./client.js";

const res = await app.mapUrl("https://example.com");
if (!res.success) throw new Error(res.error);

console.log(`found ${res.links.length} urls`);
console.log(res.links.slice(0, 10).join("\n"));

Search the Web

import { app } from "./client.js";

const res = await app.search("fastCRW nodejs scraping api 2026", { limit: 5 });
if (!res.success) throw new Error(res.error);

for (const r of res.data) {
  console.log(r.title, "→", r.url);
}

Structured JSON Extraction

Pass formats: ["json"] with a JSON Schema to extract typed records instead of prose:

import { app } from "./client.js";

const schema = {
  type: "object",
  properties: {
    productName: { type: "string" },
    priceUsd: { type: "number", description: "Current price in USD" },
    inStock: { type: "boolean" },
  },
  required: ["productName", "priceUsd"],
} as const;

const res = await app.extract(["https://example.com/products/widget"], {
  prompt: "Extract the product name, current price, and stock status.",
  schema,
});

console.log(JSON.stringify(res.data, null, 2));

Cost: formats: ["json"] / extract is a 5-credit operation vs 1 credit for markdown. No selectors — describe the shape, and fastCRW's LLM reads the page semantically. LLM extraction supports OpenAI and Anthropic providers only.

Concurrency-Bounded Batch Scraping

Promise.all(urls.map(...)) fires all requests simultaneously — a thundering herd that floods fastCRW and trips site rate limits. Use a bounded pool instead:

// A dependency-free bounded concurrency pool
async function pool<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, run),
  );
  return results;
}

// Usage
import { app } from "./client.js";

const urls = [
  "https://example.com",
  "https://example.org",
  "https://docs.fastcrw.com",
];

const out = await pool(urls, 4, async (url) => {
  try {
    const d = await app.scrapeUrl(url, { formats: ["markdown"] });
    return { url, ok: d.success, chars: d.markdown?.length ?? 0 };
  } catch (e) {
    return { url, ok: false, error: String(e) };
  }
});

console.table(out);

Pick limit based on your downstream tolerance — 4–16 is sane for most workloads. Size it with evidence, not a round number: watch in-flight count and p90 latency, and raise the ceiling until error rate climbs.

Latency note: fastCRW's p50 was 1914 ms and p90 14157 ms on the 2026-05-08 benchmark (819 labeled URLs). The p90 tail is the chrome-stealth fallback that recovers hard pages — the same mechanism that gives fastCRW the highest truth-recall of three tools tested. Size your pool and timeouts from the p90, not the median. Full breakdown at /benchmarks/firecrawl-dataset.

Retry with Exponential Backoff

async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let delay = 1000;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === attempts) throw e;
      await new Promise((r) => setTimeout(r, delay));
      delay *= 2;
    }
  }
  throw new Error("unreachable");
}

// Wrap any scrapeUrl call:
const doc = await withRetry(() =>
  app.scrapeUrl(url, { formats: ["markdown"] }),
);

Retry only transient failures (429, 5xx, network errors) — never retry a 400 or 422, which will fail identically every time.

Express Scraping Endpoint

import express from "express";
import { app as crw } from "./client.js";

const server = express();
server.use(express.json());

server.post("/scrape", async (req, res) => {
  const { url } = req.body ?? {};
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    return res.status(400).json({ error: "valid url required" });
  }
  try {
    const doc = await withRetry(() =>
      crw.scrapeUrl(url, { formats: ["markdown"], onlyMainContent: true }),
    );
    if (!doc.success) {
      return res.status(502).json({ error: doc.error });
    }
    res.json({
      url: doc.metadata?.sourceURL ?? url,
      title: doc.metadata?.title,
      markdown: doc.markdown,
    });
  } catch (e) {
    res.status(500).json({ error: String(e) });
  }
});

server.listen(8080, () => console.log("scrape API on :8080"));

Pattern: validate at the edge, reuse one client singleton, wrap in withRetry, translate failures to honest HTTP status codes. The same handler runs against local Docker in dev and managed cloud in prod — flip one environment variable.

MCP Setup

fastCRW ships an our MCP integration (crw-mcp on npm) so AI agents in Claude Code, Cursor, or Windsurf can call scrape, crawl, map, and search as MCP tools:

{
  "mcpServers": {
    "fastcrw": {
      "command": "npx",
      "args": ["-y", "crw-mcp@latest"],
      "env": {
        "FASTCRW_API_KEY": "fcrw_...",
        "FASTCRW_API_URL": "https://api.fastcrw.com"
      }
    }
  }
}

See /integrations/mcp for full configuration options.

Limits and Honest Gaps

  • No screenshot outputformats: ["screenshot"] returns HTTP 422.
  • Stateless per request — no session is carried across calls; multi-step authenticated flows must be modeled in your own code.
  • LLM extraction — supports OpenAI and Anthropic providers only.
  • No /v1/batch/scrape — iterate /v1/scrape concurrently (pool helper above) or use /v1/crawl.

Continue exploring

More from Integrations

View all integrations

Related hubs

Keep the crawl path moving