TypeScript Web Scraping API — fastCRW [Firecrawl-Compatible]
Type-safe web scraping with TypeScript and fastCRW — a Firecrawl-compatible REST API. Use Zod to derive types from JSON schemas, validate extraction output at the boundary, and catch schema drift at the hour it breaks. AGPL-3.0, self-host free.
Call fastCRW from TypeScript with the official Firecrawl JS SDK or typed fetch. Use one Zod schema as the single source of truth for the JSON extraction request, the TypeScript type, and the runtime validation gate — schema drift throws at the boundary instead of corrupting your database downstream.
Verdict
fastCRW is Firecrawl-compatible REST — the official @mendable/firecrawl-js SDK works against fastCRW by setting apiUrl. For TypeScript, the integration goes deeper than a base-URL swap: define one Zod schema, convert it to JSON Schema for the /v1/scrape request, and parse the response back through the same schema. You get the TypeScript type, the runtime validation gate, and the API extraction spec all from one source — schema drift throws at the boundary with the exact failing field, not silently three steps downstream.
Who This Is For
- TypeScript developers who want typed web data — extracted records that actually match the shape your code expects.
- Teams migrating off Firecrawl TS — one
apiUrlchange, nothing else. - AI pipeline engineers — structured extraction that feeds cleanly into typed service layers.
- Full-stack engineers adding web enrichment — a typed
fetchwrapper that integrates into existing Express / Fastify / tRPC services.
Setup
1. Install
npm install @mendable/firecrawl-js zod zod-to-json-schema
# or with bun:
bun add @mendable/firecrawl-js zod zod-to-json-schema
TypeScript types are included in @mendable/firecrawl-js. For zod-to-json-schema:
npm install -D @types/zod-to-json-schema # if needed
2. Get an API key
Sign up at fastcrw.com, copy the API key from the dashboard, and export it:
export FASTCRW_API_KEY="fcrw_..."
The fastCRW plans ships 500 one-time lifetime credits. Plain scrape is 1 credit; crawl is 1 credit per page; search is 1 credit per query.
3. Create a singleton client
// src/client.ts
import FirecrawlApp from "@mendable/firecrawl-js";
export const app = new FirecrawlApp({
apiKey: process.env.FASTCRW_API_KEY ?? "",
apiUrl: process.env.FASTCRW_API_URL ?? "https://api.fastcrw.com",
});
Quickstart: Scrape a Page to Markdown
import { app } from "./client.js";
const doc = await app.scrapeUrl("https://example.com", {
formats: ["markdown"],
onlyMainContent: true,
});
if (!doc.success) {
throw new Error(`scrape failed: ${doc.error}`);
}
const markdown: string = doc.markdown ?? "";
console.log(`scraped ${markdown.length} chars`);
console.log("title:", doc.metadata?.title);
Type-Safe JSON Extraction with Zod
This is where TypeScript's reliability promise is actually delivered. The pattern: one Zod schema drives the JSON extraction spec, the TypeScript type, and the runtime validation gate.
Step 1: Define the schema once
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
// Single source of truth — never duplicated by hand
const ProductSchema = z.object({
productName: z.string().min(1),
priceUsd: z.number().positive(),
inStock: z.boolean(),
imageUrl: z.string().url().optional(), // optional field — explicit, not accidental
});
// TypeScript type derived from the schema — never drifts from it
type Product = z.infer<typeof ProductSchema>;
// JSON Schema for the extraction request — generated, not hand-written
const jsonSchema = zodToJsonSchema(ProductSchema, { name: "Product" });
Step 2: Send the extraction request
import { app } from "./client.js";
const res = await app.scrapeUrl("https://example.com/products/widget", {
formats: ["json"],
jsonSchema,
});
if (!res.success) {
throw new Error(`extraction failed: ${res.error}`);
}
Step 3: Validate at the boundary
// .parse() throws ZodError naming the exact failing field — not a silent wrong value
const product: Product = ProductSchema.parse(res.data?.json);
// Use .safeParse() if you want to handle the error without throwing:
const result = ProductSchema.safeParse(res.data?.json);
if (!result.success) {
console.error("extraction schema mismatch:", result.error.flatten());
return;
}
const typed: Product = result.data; // fully typed, validated at runtime
When the target site renames a field or restructures the page, parse() rejects it immediately with the exact field that failed — you learn about the break the same hour it happens, not weeks later when downstream data is corrupted.
Cost:
formats: ["json"]is a 5-credit operation vs 1 credit for markdown. LLM extraction supports OpenAI and Anthropic providers only. There is no batch/v1/extractendpoint — iterate/v1/scrapeconcurrently or use/v1/crawl.
Typed Batch Scraping with a Concurrency Pool
import { app } from "./client.js";
interface ScrapeResult {
url: string;
ok: boolean;
chars: number;
error?: string;
}
// Dependency-free bounded concurrency pool
async function pool<T, R>(
items: T[],
limit: number,
worker: (item: T) => Promise<R>,
): Promise<R[]> {
const results: R[] = new Array(items.length);
let next = 0;
async function run(): Promise<void> {
while (next < items.length) {
const i = next++;
results[i] = await worker(items[i]);
}
}
await Promise.all(Array.from({ length: Math.min(limit, items.length) }, run));
return results;
}
const urls = [
"https://docs.fastcrw.com",
"https://fastcrw.com/pricing",
"https://fastcrw.com/alternatives/firecrawl",
];
const out: ScrapeResult[] = await pool(urls, 4, async (url): Promise<ScrapeResult> => {
try {
const d = await app.scrapeUrl(url, { formats: ["markdown"] });
return { url, ok: d.success, chars: d.markdown?.length ?? 0 };
} catch (e) {
return { url, ok: false, chars: 0, error: String(e) };
}
});
console.table(out);
Latency note: fastCRW's p50 was 1914 ms and p90 14157 ms on the 2026-05-08 benchmark (819 labeled URLs,
diagnose_3way.py). The wide tail is the chrome-stealth fallback that recovers hard pages — the same mechanism that gives fastCRW the highest truth-recall of three tools (63.74%). Size your pool limit and fetch timeouts from the p90, not the median. Full breakdown at /benchmarks/firecrawl-dataset.
Typed Crawl
import { app } from "./client.js";
const job = await app.crawlUrl("https://docs.fastcrw.com", {
limit: 25, // cap: 1000
maxDepth: 2, // cap: 10
scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
});
if (!job.success) throw new Error(job.error);
// job.data is typed as FirecrawlDocument[]
const pages = job.data.map((page) => ({
url: page.metadata?.sourceURL ?? "",
words: (page.markdown ?? "").split(/\s+/).length,
}));
console.table(pages);
Typed Web Search
import { app } from "./client.js";
const res = await app.search("typescript web scraping api 2026", { limit: 5 });
if (!res.success) throw new Error(res.error);
// res.data is typed as SearchResult[]
for (const r of res.data) {
console.log(`${r.title} → ${r.url}`);
}
Using plain fetch with a typed wrapper
If you prefer to avoid the SDK:
const FASTCRW_BASE = process.env.FASTCRW_API_URL ?? "https://api.fastcrw.com";
const FASTCRW_KEY = process.env.FASTCRW_API_KEY ?? "";
interface ScrapeApiResponse<T = unknown> {
success: boolean;
data?: {
markdown?: string;
json?: T;
metadata?: { title?: string; sourceURL?: string; statusCode?: number };
};
error?: string;
}
async function scrape<T = string>(
url: string,
options: { formats: ("markdown" | "json")[]; jsonSchema?: object } = { formats: ["markdown"] },
): Promise<ScrapeApiResponse<T>> {
const res = await fetch(`${FASTCRW_BASE}/v1/scrape`, {
method: "POST",
headers: {
Authorization: `Bearer ${FASTCRW_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({ url, onlyMainContent: true, ...options }),
signal: AbortSignal.timeout(25_000), // p90 is 14157 ms; timeout above it
});
if (!res.ok) throw new Error(`HTTP ${res.status}`);
return res.json() as Promise<ScrapeApiResponse<T>>;
}
MCP Setup
fastCRW ships an MCP server (crw-mcp on npm) for AI agents in Claude Code, Cursor, or Windsurf. Adds scrape, crawl, map, and search as agent-callable tools:
{
"mcpServers": {
"fastcrw": {
"command": "npx",
"args": ["-y", "crw-mcp@latest"],
"env": {
"FASTCRW_API_KEY": "fcrw_...",
"FASTCRW_API_URL": "https://api.fastcrw.com"
}
}
}
}
See /integrations/mcp for full configuration options.
Limits and Honest Gaps
- LLM extraction supports OpenAI and Anthropic providers only — if your stack standardizes on another model, extraction is the exception.
- Stateless per request — no session is carried across calls; multi-step authenticated flows must be reconstructed per request.
- No screenshot output —
formats: ["screenshot"]returns HTTP 422. - No batch
/v1/extract— iterate/v1/scrapeconcurrently (pool helper above) or use/v1/crawl. - Extraction is single-URL —
/v1/crawlhandles multi-URL bulk ingestion.
Related
Continue exploring
More from Integrations
Flowise Web Scraping Integration — fastCRW [Firecrawl-Compatible]
Add fastCRW to Flowise workflows with an HTTP node or custom tool definition. No-code web scraping for LangChain flows, RAG pipelines, and AI agents. Small single static binary, local-first, self-host free under AGPL-3.0.
Langflow Web Scraping Integration — fastCRW [Firecrawl-Compatible]
Add fastCRW to Langflow as a custom component or HTTP node. Firecrawl-compatible scrape and search, small single static binary, local-first, self-host free under AGPL-3.0.
Python Web Scraping API — fastCRW [Firecrawl-Compatible]
Scrape, crawl, and search the web from Python with fastCRW — a Firecrawl-compatible REST API backed by a single Rust binary. Async httpx, asyncio.TaskGroup, and the crw Python SDK. AGPL-3.0, self-host free.
Related hubs
