TypeScript Web Scraping API — fastCRW [Firecrawl-Compatible]

Type-safe web scraping with TypeScript and fastCRW, a Firecrawl-compatible REST API. Use one Zod schema to derive both the TypeScript type and the JSON Schema, validate extraction output at the boundary, and catch schema drift the hour it happens. AGPL-3.0, self-host free.

Published
June 13, 2026
Updated
June 13, 2026
Category
integrations
Verdict

Call fastCRW from TypeScript with the official Firecrawl JS SDK or typed fetch. Use one Zod schema as the single source of truth for the JSON extraction request, the TypeScript type, and the runtime validation gate — schema drift throws at the boundary instead of corrupting your database downstream.

  • Official @mendable/firecrawl-js SDK with full TypeScript types via apiUrl override
  • Zod schema as single source of truth: generates both the JSON extraction schema and the TypeScript type
  • Runtime validation at the boundary catches broken selectors and schema drift immediately
  • 63.74% truth-recall on Firecrawl's public 1,000-URL dataset (diagnose_3way.py, 2026-05-08), the highest of three tools tested
  • Single ~8 MB Rust binary, with no browser fleet to operate for JS-rendered pages

Verdict

fastCRW is Firecrawl-compatible REST — the official @mendable/firecrawl-js SDK works against fastCRW by setting apiUrl. For TypeScript, the integration goes deeper than a base-URL swap: define one Zod schema, convert it to JSON Schema for the /v1/scrape request, and parse the response back through the same schema. You get the TypeScript type, the runtime validation gate, and the API extraction spec all from one source — schema drift throws at the boundary with the exact failing field, not silently three steps downstream.

Who This Is For

  • TypeScript developers who want typed web data — extracted records that actually match the shape your code expects.
  • Teams migrating off Firecrawl TS — one apiUrl change, nothing else.
  • AI pipeline engineers — structured extraction that feeds cleanly into typed service layers.
  • Full-stack engineers adding web enrichment — a typed fetch wrapper that integrates into existing Express / Fastify / tRPC services.

Setup

1. Install

npm install @mendable/firecrawl-js zod zod-to-json-schema
# or with bun:
bun add @mendable/firecrawl-js zod zod-to-json-schema

TypeScript types are included in @mendable/firecrawl-js. zod and zod-to-json-schema also ship their own type declarations, so no separate @types packages are needed.

2. Get an API key

Sign up at fastcrw.com, copy the API key from the dashboard, and export it:

export FASTCRW_API_KEY="fcrw_..."

The fastCRW plan ships with 500 one-time lifetime credits. Plain scrape is 1 credit; crawl is 1 credit per page; search is 1 credit per query.

3. Create a singleton client

// src/client.ts
import FirecrawlApp from "@mendable/firecrawl-js";

export const app = new FirecrawlApp({
  apiKey: process.env.FASTCRW_API_KEY ?? "",
  apiUrl: process.env.FASTCRW_API_URL ?? "https://api.fastcrw.com",
});
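
Falling back to an empty string means a missing key only surfaces later, as an authentication error on the first request. A minimal sketch of a fail-fast alternative, assuming you would rather crash at startup. Both `requireEnv` and `envOr` are hypothetical helpers, not part of the SDK:

```typescript
// Hypothetical helpers (not part of @mendable/firecrawl-js).
type Env = Record<string, string | undefined>;

// Read a required variable and fail fast when it is missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`missing required environment variable: ${name}`);
  }
  return value;
}

// Optional variables keep an explicit default instead.
function envOr(env: Env, name: string, fallback: string): string {
  const value = env[name]?.trim();
  return value ? value : fallback;
}
```

In src/client.ts this would read `apiKey: requireEnv(process.env, "FASTCRW_API_KEY")` and `apiUrl: envOr(process.env, "FASTCRW_API_URL", "https://api.fastcrw.com")`.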

Quickstart: Scrape a Page to Markdown

import { app } from "./client.js";

const doc = await app.scrapeUrl("https://example.com", {
  formats: ["markdown"],
  onlyMainContent: true,
});

if (!doc.success) {
  throw new Error(`scrape failed: ${doc.error}`);
}

const markdown: string = doc.markdown ?? "";
console.log(`scraped ${markdown.length} chars`);
console.log("title:", doc.metadata?.title);

Type-Safe JSON Extraction with Zod

This is where TypeScript's reliability promise is actually delivered. The pattern: one Zod schema drives the JSON extraction spec, the TypeScript type, and the runtime validation gate.

Step 1: Define the schema once

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

// Single source of truth — never duplicated by hand
const ProductSchema = z.object({
  productName: z.string().min(1),
  priceUsd: z.number().positive(),
  inStock: z.boolean(),
  imageUrl: z.string().url().optional(), // optional field — explicit, not accidental
});

// TypeScript type derived from the schema — never drifts from it
type Product = z.infer<typeof ProductSchema>;

// JSON Schema for the extraction request — generated, not hand-written
const jsonSchema = zodToJsonSchema(ProductSchema, { name: "Product" });

Step 2: Send the extraction request

import { app } from "./client.js";

const res = await app.scrapeUrl("https://example.com/products/widget", {
  formats: ["json"],
  jsonSchema,
});

if (!res.success) {
  throw new Error(`extraction failed: ${res.error}`);
}

Step 3: Validate at the boundary

// .parse() throws a ZodError naming the exact failing field, not a silent wrong value.
// The SDK flattens the response, so the extracted object is read from res.json
// (the same shape as doc.markdown in the quickstart).
const product: Product = ProductSchema.parse(res.json);

// Use .safeParse() if you want to handle the error without throwing:
const result = ProductSchema.safeParse(res.json);
if (result.success) {
  const typed: Product = result.data; // fully typed, validated at runtime
  console.log(typed.productName, typed.priceUsd);
} else {
  console.error("extraction schema mismatch:", result.error.flatten());
}

When the target site renames a field or restructures the page, parse() rejects it immediately with the exact field that failed — you learn about the break the same hour it happens, not weeks later when downstream data is corrupted.

Cost: formats: ["json"] is a 5-credit operation vs 1 credit for markdown. LLM extraction supports OpenAI and Anthropic providers only. There is no batch /v1/extract endpoint — iterate /v1/scrape concurrently or use /v1/crawl.
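
Since JSON extraction costs five times a plain scrape, it can help to total a planned workload before running it. A rough sketch using only the per-operation costs quoted in this guide; `estimateCredits` is a hypothetical helper, and it assumes costs add linearly and that crawled pages are billed at the 1-credit rate:

```typescript
// Per-operation credit costs as stated in this guide.
const CREDIT_COST = {
  markdownScrape: 1, // plain scrape
  jsonScrape: 5,     // formats: ["json"]
  crawlPage: 1,      // per crawled page
  searchQuery: 1,    // per search query
} as const;

type Operation = keyof typeof CREDIT_COST;
type Workload = Partial<Record<Operation, number>>;

// Hypothetical helper: total credits a planned workload would consume.
function estimateCredits(workload: Workload): number {
  let total = 0;
  for (const op of Object.keys(CREDIT_COST) as Operation[]) {
    total += (workload[op] ?? 0) * CREDIT_COST[op];
  }
  return total;
}

const planned = estimateCredits({
  markdownScrape: 100,
  jsonScrape: 20,
  crawlPage: 50,
  searchQuery: 10,
});
console.log(planned); // 100*1 + 20*5 + 50*1 + 10*1 = 260
```

At these rates, 100 JSON extractions alone would consume all 500 lifetime credits.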

Typed Batch Scraping with a Concurrency Pool

import { app } from "./client.js";

interface ScrapeResult {
  url: string;
  ok: boolean;
  chars: number;
  error?: string;
}

// Dependency-free bounded concurrency pool
async function pool<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, run));
  return results;
}

const urls = [
  "https://docs.fastcrw.com",
  "https://fastcrw.com/pricing",
  "https://fastcrw.com/alternatives/firecrawl",
];

const out: ScrapeResult[] = await pool(urls, 4, async (url): Promise<ScrapeResult> => {
  try {
    const d = await app.scrapeUrl(url, { formats: ["markdown"] });
    return { url, ok: d.success, chars: d.markdown?.length ?? 0 };
  } catch (e) {
    return { url, ok: false, chars: 0, error: String(e) };
  }
});

console.table(out);

Latency note: fastCRW's p50 was 1914 ms and p90 14157 ms on the 2026-05-08 benchmark (819 labeled URLs, diagnose_3way.py). The wide tail is the chrome-stealth fallback that recovers hard pages — the same mechanism that gives fastCRW the highest truth-recall of three tools (63.74%). Size your pool limit and fetch timeouts from the p90, not the median. Full breakdown at /benchmarks/firecrawl-dataset.
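
To make "size from the p90" concrete, here is a back-of-the-envelope sketch. Both helpers are hypothetical, the 1.75x headroom is an illustrative assumption rather than a fastCRW recommendation, and the batch estimate is deliberately pessimistic: it treats every wave of concurrent requests as taking a full p90.

```typescript
// p90 from the 2026-05-08 benchmark quoted above, in milliseconds.
const P90_MS = 14157;

// Hypothetical helper: per-request timeout with headroom above the p90.
function timeoutFromP90(p90Ms: number, headroom = 1.75): number {
  return Math.ceil(p90Ms * headroom);
}

// Hypothetical helper: pessimistic wall-clock estimate for a batch, assuming
// each wave of `limit` concurrent requests takes as long as a p90 request.
function pessimisticBatchMs(urlCount: number, limit: number, p90Ms: number): number {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  return Math.ceil(urlCount / limit) * p90Ms;
}

console.log(timeoutFromP90(P90_MS));             // 24775
console.log(pessimisticBatchMs(100, 4, P90_MS)); // 25 waves * 14157 ms = 353925
```

Raising the pool limit shortens the batch estimate but not the per-request timeout, so the two knobs can be tuned independently.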

Typed Crawl

import { app } from "./client.js";

const job = await app.crawlUrl("https://docs.fastcrw.com", {
  limit: 25,     // cap: 1000
  maxDepth: 2,   // cap: 10
  scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
});

if (!job.success) throw new Error(job.error);

// job.data is typed as FirecrawlDocument[]
const pages = job.data.map((page) => ({
  url: page.metadata?.sourceURL ?? "",
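  words: (page.markdown ?? "").split(/\s+/).filter(Boolean).length, // filter(Boolean) drops the empty token an empty page would otherwise count as one word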
}));

console.table(pages);

Typed Search

import { app } from "./client.js";

const res = await app.search("typescript web scraping api 2026", { limit: 5 });
if (!res.success) throw new Error(res.error);

// res.data is typed as SearchResult[]
for (const r of res.data) {
  console.log(`${r.title} → ${r.url}`);
}

Using plain fetch with a typed wrapper

If you prefer to avoid the SDK:

const FASTCRW_BASE = process.env.FASTCRW_API_URL ?? "https://api.fastcrw.com";
const FASTCRW_KEY = process.env.FASTCRW_API_KEY ?? "";

interface ScrapeApiResponse<T = unknown> {
  success: boolean;
  data?: {
    markdown?: string;
    json?: T;
    metadata?: { title?: string; sourceURL?: string; statusCode?: number };
  };
  error?: string;
}

async function scrape<T = unknown>(
  url: string,
  options: { formats: ("markdown" | "json")[]; jsonSchema?: object } = { formats: ["markdown"] },
): Promise<ScrapeApiResponse<T>> {
  const res = await fetch(`${FASTCRW_BASE}/v1/scrape`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${FASTCRW_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url, onlyMainContent: true, ...options }),
    signal: AbortSignal.timeout(25_000), // p90 is 14157 ms; timeout above it
  });

  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json() as Promise<ScrapeApiResponse<T>>;
}
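
With every field of the envelope optional, callers end up repeating the same `success` and `data` checks. One option is to convert the raw response into a discriminated union once, so the compiler narrows the rest. A minimal sketch: `RawEnvelope`, `ScrapeOutcome`, and `toOutcome` are hypothetical names, and the envelope shape is assumed to match the `ScrapeApiResponse` interface above.

```typescript
// Hypothetical narrowing layer over the raw /v1/scrape envelope.
interface RawEnvelope<T> {
  success: boolean;
  data?: { markdown?: string; json?: T };
  error?: string;
}

type ScrapeOutcome<T> =
  | { ok: true; markdown: string; json: T | undefined }
  | { ok: false; error: string };

// Collapse the optional fields into one of two fully known shapes.
function toOutcome<T>(raw: RawEnvelope<T>): ScrapeOutcome<T> {
  if (!raw.success || !raw.data) {
    return { ok: false, error: raw.error ?? "unknown error" };
  }
  return { ok: true, markdown: raw.data.markdown ?? "", json: raw.data.json };
}

const outcome = toOutcome<unknown>({ success: true, data: { markdown: "# Hello" } });
if (outcome.ok) {
  console.log(outcome.markdown.length); // 7
} else {
  console.error(outcome.error);
}
```

The `json` field is still unvalidated at this point; it should go through the Zod schema before anything downstream trusts its shape.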

MCP Setup

fastCRW ships an MCP server (crw-mcp on npm) for AI agents in Claude Code, Cursor, or Windsurf. It adds scrape, crawl, map, and search as agent-callable tools:

{
  "mcpServers": {
    "fastcrw": {
      "command": "npx",
      "args": ["-y", "crw-mcp@latest"],
      "env": {
        "FASTCRW_API_KEY": "fcrw_...",
        "FASTCRW_API_URL": "https://api.fastcrw.com"
      }
    }
  }
}

See /integrations/mcp for full configuration options.

Limits and Honest Gaps

  • LLM extraction supports OpenAI and Anthropic providers only — if your stack standardizes on another model, extraction is the exception.
  • Stateless per request — no session is carried across calls; multi-step authenticated flows must be reconstructed per request.
  • No screenshot output; requesting formats: ["screenshot"] returns HTTP 422.
  • No batch /v1/extract — iterate /v1/scrape concurrently (pool helper above) or use /v1/crawl.
  • Extraction is single-URL; /v1/crawl handles multi-URL bulk ingestion.
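
Some of these limits show up as HTTP errors that are pointless to retry. A small sketch of a status classifier for the typed fetch wrapper: the 422 case comes from the list above, while treating 429 and 5xx as transient is a common convention assumed here, not documented fastCRW behavior. `classifyStatus` is a hypothetical helper.

```typescript
// Hypothetical policy for deciding whether a failed request is worth retrying.
type FailureKind = "unsupported-request" | "transient" | "fatal";

function classifyStatus(status: number): FailureKind {
  // 422: the request asked for something unsupported, e.g. formats: ["screenshot"].
  if (status === 422) return "unsupported-request";
  // Assumption: rate limits and server errors may succeed on a later attempt.
  if (status === 429 || status >= 500) return "transient";
  // Everything else (bad auth, bad URL, ...) will fail again unchanged.
  return "fatal";
}

function shouldRetry(status: number): boolean {
  return classifyStatus(status) === "transient";
}
```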
