Short Answer
If you need a rich, hosted product with screenshots, PDF extraction, and a polished UI — Firecrawl is the safer bet. If you want a lightweight, Firecrawl-compatible API that you can self-host on a $5 VPS with minimal overhead — CRW is the better fit.
- Better fit for lightweight self-hosting: CRW (6.6 MB idle RAM, single binary)
- Better fit for managed cloud with document support: Firecrawl
- Better fit for AI agents via MCP: CRW (built-in, zero config)
- Better fit for screenshots & PDFs: Firecrawl
| | CRW | Firecrawl |
|---|---|---|
| Average latency (500 URLs) | 833 ms | 4,600 ms |
| Crawl coverage | 92% | 77.2% |
| Idle RAM | 6.6 MB | 500 MB+ |
| Docker image size | ~8 MB | 500 MB+ |
| Self-host cost | $0 (AGPL-3.0) | Requires larger infra |
| Firecrawl API compatible | ✅ Yes | ✅ Native |
| MCP server built-in | ✅ Yes | Separate package |
| Screenshot support | ❌ Roadmap | ✅ Yes |
| PDF extraction | ❌ Roadmap | ✅ Yes |
| Anti-bot handling | Partial | More mature |
| Firecrawl JS/Python SDK | ✅ Works (change apiUrl) | ✅ Native |
| Structured extraction | ✅ JSON schema | ✅ JSON schema |
| License | AGPL-3.0 | AGPL-3.0 (self-host) |
What Is CRW?
CRW is an open-source web scraping and crawling API written in Rust. It exposes a Firecrawl-compatible REST API — same endpoints, same request/response format — but runs on a dramatically smaller footprint. The core binary weighs around 8 MB, idles at 6.6 MB of RAM, and deploys via a single docker run.
Key capabilities: scrape pages to clean markdown, crawl multi-page sites, map site structure, extract structured JSON via LLM schemas, and serve as an MCP tool for AI agents — all from one binary. Because it is a Firecrawl-compatible API, any tooling already written for Firecrawl works against CRW without code changes beyond a URL swap.
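Because the wire format is Firecrawl's, a scrape call is just an HTTP POST to `/v1/scrape`. As a rough sketch of what any client ends up sending (the base URL and key are placeholders, and `build_scrape_request` is a hypothetical helper, not part of either API):

```python
import json

def build_scrape_request(base_url: str, api_key: str, target_url: str) -> tuple[str, dict, str]:
    """Assemble the endpoint, headers, and JSON body for a Firecrawl-style /v1/scrape call."""
    endpoint = f"{base_url.rstrip('/')}/v1/scrape"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"url": target_url, "formats": ["markdown"]})
    return endpoint, headers, body

# The same payload works against Firecrawl or CRW; only base_url differs.
endpoint, headers, body = build_scrape_request(
    "http://localhost:3000", "my-secret", "https://example.com"
)
```

Any HTTP client can deliver this request; the "URL swap" in the migration sections below is literally just changing `base_url`.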
The project is AGPL-3.0 licensed, meaning the self-hosted version is free to use. fastCRW is the managed cloud version for teams that don't want to operate their own infrastructure.
What Is Firecrawl?
Firecrawl is a commercial web scraping API (with a self-hostable open-source version under AGPL-3.0) that supports scraping, crawling, structured extraction, screenshot capture, and PDF/DOCX parsing. It runs on a Node.js + Playwright stack, which gives it broad browser-automation capability at the cost of a heavier runtime.
Firecrawl has a polished hosted offering, a well-documented SDK in multiple languages, and a large community of integrations. For teams that need a complete, mature product and are happy to pay for cloud hosting, it's a solid choice. The self-hosted path is more involved — requiring Redis, Playwright, and Chromium — but it is possible if you need the full feature set without cloud billing.
Performance: Where CRW Has an Edge
In benchmarks against the same 500-URL corpus (via Scrapeway benchmark data), CRW returned results 5.5× faster on average than Firecrawl — 833 ms vs 4,600 ms. Crawl coverage was also higher in our tests: 92% vs 77.2%.
The performance gap comes from the underlying engine. CRW uses lol-html, a streaming HTML parser designed for speed, rather than a full browser. For the vast majority of content — articles, docs, product pages, news — lol-html parses faster and uses far less memory than a headless browser.
Under load, the difference compounds. Because CRW doesn't spawn browser processes, it can handle many more concurrent requests on the same hardware without memory pressure climbing to unsustainable levels.
The tradeoff: JavaScript-heavy SPAs that require full browser execution are handled via CRW's LightPanda integration, which is newer and less battle-tested than Playwright. For highly complex client-side routing or login flows, Firecrawl's Playwright foundation may be more reliable today. See our detailed benchmark post for full methodology and results.
Deployment: Side-by-Side Setup Commands
This is one of the most concrete differences between the two. Here's what it actually takes to get each one running.
Firecrawl Self-Host (Docker Compose)
Firecrawl's self-hosted setup requires cloning the repo, configuring environment variables, and standing up multiple services — Redis, the API server, and Playwright workers. A realistic minimum VM is 1–2 GB RAM; the Docker image alone is 500 MB+.
```bash
# Step 1: clone the repo
git clone https://github.com/mendableai/firecrawl.git
cd firecrawl

# Step 2: copy and edit the environment file
cp .env.example .env
# Edit .env: set OPENAI_API_KEY, PORT, etc.

# Step 3: bring up all services (Redis + API + workers)
docker compose up -d

# Requires: Redis, Playwright, Chromium — roughly 2 GB RAM at idle
```
CRW Self-Host (Single Docker Command)
CRW deploys with one command. The image is ~8 MB. Idle RAM on a fresh start is 6.6 MB. A $5 DigitalOcean droplet handles it without issues.
```bash
docker run -p 3000:3000 ghcr.io/us/crw:latest
```
CRW with Docker Compose (If You Prefer Compose)
If you want a compose file for CRW — for example, to add it alongside your existing stack — it's straightforward:
```yaml
version: "3.8"
services:
  crw:
    image: ghcr.io/us/crw:latest
    ports:
      - "3000:3000"
    environment:
      - CRW_API_KEY=your-secret-key
    restart: unless-stopped
```
No Redis. No Playwright. No Chromium. For teams running scraping as a sidecar to their main application, this difference is operationally significant.
Firecrawl SDK Compatibility
One of the most practical aspects of CRW being a Firecrawl-compatible API is that the official Firecrawl SDKs — the npm package @mendable/firecrawl-js and the Python package firecrawl-py — work directly with CRW by changing a single parameter.
TypeScript / Node.js SDK
```typescript
import FirecrawlApp from "@mendable/firecrawl-js";

// Point the SDK at your CRW instance instead of api.firecrawl.dev
const app = new FirecrawlApp({
  apiKey: "your-crw-api-key",
  apiUrl: "https://fastcrw.com/api", // or http://localhost:3000 for self-hosted
});

const result = await app.scrapeUrl("https://example.com", {
  formats: ["markdown"],
});

console.log(result.markdown);
```
That's it. The `apiUrl` parameter routes all SDK calls to your CRW instance. Every method — `scrapeUrl`, `crawlUrl`, `mapUrl` — works without further changes.
Python SDK
```python
from firecrawl import FirecrawlApp

# Point the SDK at your CRW instance
app = FirecrawlApp(
    api_key="your-crw-api-key",
    api_url="https://fastcrw.com/api",  # or http://localhost:3000 for self-hosted
)

result = app.scrape_url("https://example.com", params={"formats": ["markdown"]})
print(result["markdown"])
```
The Python SDK uses `api_url` (with an underscore). Same idea. Any existing Python scraping code that calls Firecrawl will work against CRW after this one-line change.
The SDK compatibility is possible because CRW deliberately implements the Firecrawl REST interface at the HTTP level. There's no magic — the same JSON in, same JSON out. This means SDK updates and new Firecrawl SDK versions continue to work as long as the endpoint contracts haven't changed.
API Compatibility: Drop-In for Firecrawl Users
CRW implements the same /v1/scrape, /v1/crawl, and /v1/map endpoints with identical request/response shapes. If you're calling Firecrawl directly via curl or fetch, migration is a one-line change:
```javascript
// Before
const BASE_URL = "https://api.firecrawl.dev";

// After
const BASE_URL = "https://fastcrw.com/api"; // or http://localhost:3000 for self-hosted
```
Your existing SDK calls, error handling, and response parsing all continue to work without modification. The JSON response schema is the same: success, data, markdown, metadata fields all appear in the same places.
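Because the response envelope is shared, downstream parsing code is also service-agnostic. A minimal sketch of a response handler, assuming the field layout described above (`get_markdown` is a hypothetical helper, not part of either API):

```python
def get_markdown(response: dict) -> str:
    """Pull markdown out of a Firecrawl-shaped scrape response, failing loudly on errors."""
    if not response.get("success"):
        raise RuntimeError(f"scrape failed: {response.get('error', 'unknown error')}")
    data = response.get("data", {})
    if "markdown" not in data:
        raise KeyError("response has no markdown field; was 'markdown' in formats?")
    return data["markdown"]

# Works on a response from either service, since the envelope is identical.
sample = {
    "success": True,
    "data": {"markdown": "# Example", "metadata": {"title": "Example"}},
}
print(get_markdown(sample))  # → # Example
```

The point is that this function never needs to know which backend produced the JSON.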
How to Migrate from Firecrawl to CRW
Migration is intentionally low-friction for most use cases. Here's a step-by-step breakdown covering every common integration pattern.
Step 1: Start CRW
```bash
docker run -p 3000:3000 -e CRW_API_KEY=my-secret ghcr.io/us/crw:latest
```
Step 2a: Migrate curl calls
```bash
# Before (Firecrawl cloud)
curl -X POST https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer fc-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'

# After (CRW cloud)
curl -X POST https://fastcrw.com/api/v1/scrape \
  -H "Authorization: Bearer YOUR_CRW_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'
```
Step 2b: Migrate TypeScript SDK
```typescript
// Before
const app = new FirecrawlApp({ apiKey: "fc-YOUR_KEY" });

// After
const app = new FirecrawlApp({
  apiKey: "my-secret",
  apiUrl: "https://fastcrw.com/api", // or http://localhost:3000 for self-hosted
});
```
Step 2c: Migrate Python SDK
```python
# Before
app = FirecrawlApp(api_key="fc-YOUR_KEY")

# After
app = FirecrawlApp(
    api_key="YOUR_CRW_KEY",
    api_url="https://fastcrw.com/api",  # or http://localhost:3000 for self-hosted
)
```
Step 2d: Environment variable approach
If your code already reads the base URL from an environment variable, migration is a config-only change — no code touches needed:
```bash
# .env (before)
FIRECRAWL_API_URL=https://api.firecrawl.dev
FIRECRAWL_API_KEY=fc-YOUR_KEY

# .env (after)
FIRECRAWL_API_URL=https://fastcrw.com/api
FIRECRAWL_API_KEY=YOUR_CRW_KEY
```
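Application code that reads those two variables might look like this (a sketch; the variable names follow the `.env` example above, and the localhost fallback is a placeholder for a self-hosted default):

```python
import os

# Fall back to a local CRW instance when the variables are unset.
BASE_URL = os.environ.get("FIRECRAWL_API_URL", "http://localhost:3000")
API_KEY = os.environ.get("FIRECRAWL_API_KEY", "")

SCRAPE_ENDPOINT = f"{BASE_URL.rstrip('/')}/v1/scrape"
print(SCRAPE_ENDPOINT)
```

Flip the `.env` values and this code targets a different backend on the next restart, with no code change.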
What works after migration
- ✅ `/v1/scrape` — full markdown, HTML, links, metadata
- ✅ `/v1/crawl` — multi-page crawling with depth and path filters
- ✅ `/v1/map` — site structure mapping
- ✅ `formats: ["markdown", "html", "links", "extract"]`
- ✅ Firecrawl JS and Python SDKs with `apiUrl` / `api_url`
What breaks (be aware before migrating)
- ❌ `formats: ["screenshot"]` — CRW does not yet support screenshot capture. If your workflow depends on this, keep routing those calls to Firecrawl or a separate service.
- ❌ PDF and DOCX parsing — CRW handles HTML only. Document URLs will need an alternative path.
- ⚠️ Complex SPAs with heavy client-side rendering — CRW's LightPanda integration is functional but less mature than Playwright for very dynamic sites.
For most scraping workflows — articles, documentation sites, product pages, news — migration is safe and the only change is the URL.
Structured Extraction: LLM Schema Comparison
Both CRW and Firecrawl support structured JSON extraction via an extract format, where you pass a JSON schema and the API returns typed, structured data. The request body format is identical between the two.
Here's an example that extracts product data from an e-commerce page. This exact request body works on both Firecrawl and CRW:
```json
{
  "url": "https://example.com/product",
  "formats": ["extract"],
  "extract": {
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" },
        "in_stock": { "type": "boolean" },
        "images": {
          "type": "array",
          "items": { "type": "string" }
        }
      },
      "required": ["name", "price"]
    }
  }
}
```
The response shape is also identical — structured data is returned under `data.extract` in the response JSON.
This is one of the areas where the compatibility is most complete. If you've already invested in writing extraction schemas for Firecrawl, they port directly to CRW with no changes. This matters for teams building RAG pipelines or AI agents that depend on structured outputs — the schema work doesn't need to be redone.
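One lightweight guard worth keeping in such a pipeline: check the extracted object against the schema's `required` list before passing it downstream, since LLM extraction can occasionally drop fields. A sketch (`validate_extract` is a hypothetical helper, not part of either API):

```python
def validate_extract(extracted: dict, schema: dict) -> list[str]:
    """Return the names of required schema fields missing from an extract result."""
    required = schema.get("required", [])
    return [field for field in required if field not in extracted]

# The same JSON schema sent in the request, reused for validation.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "price": {"type": "number"}},
    "required": ["name", "price"],
}

# Simulated `data.extract` payload from a scrape response.
extracted = {"name": "Widget", "price": 19.99}
missing = validate_extract(extracted, schema)
assert missing == []
```

Because the schema is identical on both services, the same validation works regardless of which backend produced the result.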
In TypeScript with the Firecrawl SDK pointed at CRW:
```typescript
const result = await app.scrapeUrl("https://example.com/product", {
  formats: ["extract"],
  extract: {
    schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        price: { type: "number" },
        description: { type: "string" },
      },
      required: ["name", "price"],
    },
  },
});

console.log(result.extract?.name);
console.log(result.extract?.price);
```
MCP Support: CRW Has the Edge for AI Workflows
CRW ships with a built-in MCP server. Configure it in Claude Desktop or Cursor with a single JSON stanza and your AI agent gains live web scraping capabilities — no separate package, no extra install step.
```json
{
  "mcpServers": {
    "crw": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "ghcr.io/us/crw:latest", "mcp"]
    }
  }
}
```
Firecrawl offers an MCP integration too, but it's a separate npm package (`@mendableai/firecrawl-mcp`) with its own setup process. For teams already running CRW, skipping that additional install is a small but real convenience.
When to Stay on Firecrawl
CRW isn't the right answer for every team. Here are the cases where sticking with Firecrawl is the more honest recommendation:
- You need screenshots: Firecrawl captures full-page screenshots via Playwright. CRW does not support this yet (it's on the roadmap, but not shipped). If screenshots are a core part of your workflow, Firecrawl is the only choice between these two.
- You need PDF or DOCX parsing: Firecrawl extracts content from document formats. CRW handles HTML only. If your data sources include PDFs — research papers, reports, whitepapers — Firecrawl handles this natively.
- You need more mature anti-bot handling: Firecrawl has more battle-tested proxy rotation and CAPTCHA handling. CRW's anti-bot layer is functional but not the most sophisticated option on the market. For heavily protected sites, Firecrawl or a dedicated proxy service is more reliable.
- You already have a working Firecrawl cloud account: If your existing Firecrawl setup is working well, the cost of migration — even if small — may not be worth it unless you have a specific reason to switch (cost, latency, self-hosting requirements).
- You want enterprise support: Firecrawl offers paid enterprise plans with SLAs and dedicated support. CRW is community-supported (with commercial support available via fastCRW), but the Firecrawl enterprise offering is more mature.
- You're deep in the Node.js ecosystem: Firecrawl's SDK and integrations are JavaScript-native and well-maintained. If your team is Node.js-first and values a large library of community examples and integrations, Firecrawl has more of that today.
Where Firecrawl Is Still Better — Summary
- Screenshots: Firecrawl can capture full-page screenshots. CRW does not support this yet.
- PDF/DOCX extraction: Firecrawl parses document formats. CRW currently handles HTML only.
- Anti-bot bypass: Firecrawl has more mature proxy rotation and CAPTCHA handling. CRW's anti-bot layer is functional but not the strongest on the market.
- Hosted product: If you want a fully managed cloud with proxy networks and auto-scaling, fastCRW is available, but Firecrawl's hosted product is more mature and has been around longer.
- Ecosystem maturity: Firecrawl has more SDKs, third-party integrations, and community examples.
Who Should Use Which
- Use Firecrawl if: you need screenshots, PDF/DOCX, strong anti-bot, or want the most polished hosted product. Also stay if your current setup is working and you don't have a pressing reason to switch.
- Use CRW (self-hosted) if: you want a lightweight, open-source Firecrawl-compatible API on your own infrastructure, with low memory overhead and cost control. Good for AI pipelines, RAG, and scraping-as-a-sidecar setups.
- Use fastCRW (cloud) if: you want the CRW engine hosted for you without managing servers — same Firecrawl-compatible API, with proxy networks and auto-scaling handled.
Bottom Line
CRW is not trying to replace Firecrawl across every dimension. It's a better fit for a specific set of use cases: lightweight self-hosting on constrained hardware, AI agent pipelines via MCP, RAG workflows, and situations where low overhead and operational simplicity matter more than document parsing or screenshots.
The fact that it's a drop-in Firecrawl-compatible API means you don't have to choose upfront or commit fully — you can run both in parallel, route different request types to different services, and migrate incrementally as your confidence grows.
For those use cases, in our benchmarks, CRW does the job faster and cheaper. For everything else, Firecrawl remains a solid option.
Try CRW
Open-Source Path — Self-Host for Free
CRW is AGPL-3.0 licensed. Run it on your own infrastructure at zero cost:
```bash
docker run -p 3000:3000 ghcr.io/us/crw:latest
```
View the source on GitHub · Read the docs
Hosted Path — Use fastCRW
Don't want to manage servers? fastCRW is the managed cloud version — same Firecrawl-compatible API, same low-latency engine, with proxy networks and auto-scaling handled for you. Start with 50 free credits, no credit card required.
Frequently Asked Questions
Is CRW fully Firecrawl API compatible?
Yes — CRW implements the same /v1/scrape, /v1/crawl, and /v1/map endpoints with identical request and response shapes. Any code using Firecrawl's REST API works with CRW after a one-line URL change. The official Firecrawl JS and Python SDKs also work without modification — just pass apiUrl (JS) or api_url (Python) pointing at your CRW host.
Can I migrate from Firecrawl to CRW without rewriting code?
In most cases, yes. Change the base URL and you're done. There are two caveats: CRW does not yet support screenshot capture or PDF/DOCX parsing, so if your workflow uses those features, you'll need to keep Firecrawl for those specific calls or route them through a separate service. For pure HTML scraping, crawling, and structured extraction, the migration is seamless.
How long does migration from Firecrawl to CRW take?
For a typical integration that uses scraping and crawling only, plan on 15 minutes — it's mostly changing a URL and API key in your config. If your workflow uses screenshots or PDFs that need alternative handling, budget more time to either find replacement tools or set up routing logic. The Firecrawl SDK changes are a single parameter addition in TypeScript or Python. The bigger time investment is validating that your existing tests still pass against CRW's responses, which are structurally identical but may differ in edge cases for very dynamic pages.
Can I run CRW and Firecrawl in parallel during migration?
Yes, and this is often the safest migration path. Because both expose the same REST interface, you can introduce a simple routing layer in your application that sends scraping requests to CRW and screenshot/PDF requests to Firecrawl. Once you've validated CRW's output quality for your specific URLs, you can gradually shift more traffic over. Running both simultaneously also gives you a concrete way to compare outputs side-by-side before fully cutting over.
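The routing layer described above can be very small. A sketch under the assumption that screenshot requests and document URLs go to Firecrawl and everything else goes to CRW (the base URLs are placeholders, and `pick_backend` is a hypothetical helper):

```python
CRW_URL = "http://localhost:3000"            # self-hosted CRW instance
FIRECRAWL_URL = "https://api.firecrawl.dev"  # Firecrawl cloud

DOC_EXTENSIONS = (".pdf", ".docx")

def pick_backend(url: str, formats: list[str]) -> str:
    """Route screenshot and document requests to Firecrawl, everything else to CRW."""
    if "screenshot" in formats:
        return FIRECRAWL_URL
    if url.lower().split("?")[0].endswith(DOC_EXTENSIONS):
        return FIRECRAWL_URL
    return CRW_URL

# An HTML page goes to CRW; a PDF goes to Firecrawl.
assert pick_backend("https://example.com/post", ["markdown"]) == CRW_URL
assert pick_backend("https://example.com/paper.pdf", ["markdown"]) == FIRECRAWL_URL
```

Because both backends accept the same request body, the only thing that changes per request is which base URL the router returns.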
Does CRW work for JavaScript-heavy sites?
CRW handles many JavaScript-heavy pages through its LightPanda integration, which provides headless browser rendering without Chromium's overhead. It's functional for most SPAs, but for highly complex client-side routing or login flows, Firecrawl's Playwright integration may be more reliable today. See our limitations post for the full picture.
Is CRW slower for JavaScript-heavy sites?
Honestly, yes — for complex SPAs that require full JavaScript execution, CRW's LightPanda path is generally slower than Firecrawl's Playwright path, and LightPanda is less mature. For the majority of pages — news articles, documentation, product listings, blog posts — CRW is significantly faster because it uses a streaming HTML parser rather than a browser. The 5.5× average speed advantage in our benchmarks is real, but it comes from that majority case. For sites that are genuinely JavaScript-dependent and have complex rendering, expect CRW to be comparable at best and potentially less reliable.
Is CRW free?
The self-hosted version of CRW is free under the AGPL-3.0 license — you can run unlimited requests at no cost on your own infrastructure. fastCRW, the hosted cloud version, has a free tier with 50 credits and paid plans for higher volumes.
What's the difference between CRW and fastCRW?
CRW is the open-source engine you run yourself. fastCRW is the managed cloud service built on top of CRW — it adds proxy networks, auto-scaling, and removes infrastructure management. The REST API is identical between both, which means code written for one works against the other without changes.
How does CRW's memory compare to Firecrawl at scale?
CRW idles at 6.6 MB of RAM vs Firecrawl's 500 MB+. Under 50 concurrent requests, CRW uses roughly 120 MB vs Firecrawl's 2 GB+. For teams running many parallel scraping workers, this difference directly translates to infrastructure cost. See our post on memory economics for a detailed breakdown.