## Why Self-Host a Scraping API?
Three reasons developers self-host their scraping infrastructure:
- Cost control: At scale, managed scraping APIs charge per request. Self-hosting with CRW has a fixed server cost and zero per-request fees.
- Data privacy: URLs you scrape stay on your infrastructure. No third-party sees your data access patterns.
- Customization: You control rate limits, auth, routing, and can integrate with your existing services.
The traditional barrier to self-hosting was complexity. Firecrawl's self-hosted version requires Node.js, Redis, Playwright, Chromium, and a compose file with multiple services. CRW removes that barrier entirely.
## Prerequisites
- A Linux server with Docker installed (any cloud provider, $5/month DigitalOcean works)
- Port 3000 open in your firewall (or any port you prefer)
## Deploy in 60 Seconds

```bash
docker run -d \
  --name crw \
  --restart unless-stopped \
  -p 3000:3000 \
  ghcr.io/us/crw:latest
```
That's it. CRW is now running. Test it:
```bash
curl -X POST http://your-server:3000/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'
```
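The same request can also be issued from Node (version 18+ ships a global `fetch`). The sketch below only assembles the request options so they can be reused against any base URL; `buildScrapeRequest` is an illustrative name, not part of any SDK:

```javascript
// Build the fetch options for a /v1/scrape call. The payload shape
// ({ url, formats }) matches the curl example above.
function buildScrapeRequest(url, formats = ["markdown"]) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url, formats }),
  };
}

// Usage:
// const res = await fetch("http://your-server:3000/v1/scrape",
//   buildScrapeRequest("https://example.com"));
```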
## Adding API Key Authentication

For production, add an API key to prevent unauthorized access:

```bash
docker run -d \
  --name crw \
  --restart unless-stopped \
  -p 3000:3000 \
  -e CRW_API_KEY=your-secret-key \
  ghcr.io/us/crw:latest
```
Now all requests require the `Authorization: Bearer your-secret-key` header. Requests without it return 401.
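In client code, the key rides along as a standard bearer token. A minimal sketch (the `withAuth` helper is illustrative, not part of CRW or any SDK):

```javascript
// Merge the Authorization header into a request's headers.
// Only the "Bearer <key>" format comes from the text above.
function withAuth(apiKey, extraHeaders = {}) {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    ...extraHeaders,
  };
}

// Usage:
// fetch("http://your-server:3000/v1/scrape", {
//   method: "POST",
//   headers: withAuth("your-secret-key"),
//   body: JSON.stringify({ url: "https://example.com" }),
// });
```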
## Using with an Nginx Reverse Proxy

For HTTPS and a cleaner URL, put Nginx in front:

```nginx
server {
    listen 443 ssl;
    server_name scraper.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/scraper.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/scraper.yourdomain.com/privkey.pem;

    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 120s;
    }
}
```
## Using docker-compose

For a more maintainable setup:

```yaml
version: "3.8"
services:
  crw:
    image: ghcr.io/us/crw:latest
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - CRW_API_KEY=${CRW_API_KEY}
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
```
## Migrate from Firecrawl in One Line

If you're already using Firecrawl, changing your base URL is the entire migration:

```javascript
// Before: Firecrawl cloud
const client = new FirecrawlApp({
  apiKey: "fc-...",
  apiUrl: "https://api.firecrawl.dev",
});

// After: your self-hosted CRW
const client = new FirecrawlApp({
  apiKey: "your-crw-api-key",
  apiUrl: "https://scraper.yourdomain.com",
});
```
Same SDK, same method calls, same response format. Zero code changes beyond the URL.
## What CRW Handles Out of the Box

- `POST /v1/scrape` — Single page to markdown, HTML, or JSON
- `POST /v1/crawl` — Multi-page site crawl with depth/limit controls
- `GET /v1/crawl/:id` — Crawl status polling
- `POST /v1/map` — Site URL discovery
- `GET /health` — Health check endpoint
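The crawl endpoints combine into a start-then-poll workflow. Here is a sketch with `fetchFn` injected so the loop can be exercised without a live server; the `id` and `status` response fields mirror Firecrawl's crawl API and are an assumption about CRW's exact payload shape:

```javascript
// Start a crawl via POST /v1/crawl, then poll GET /v1/crawl/:id
// until the job reports "completed" or maxPolls is exhausted.
async function runCrawl(fetchFn, baseUrl, url, { intervalMs = 2000, maxPolls = 60 } = {}) {
  const start = await fetchFn(`${baseUrl}/v1/crawl`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url }),
  });
  const { id } = await start.json();

  for (let i = 0; i < maxPolls; i++) {
    const res = await fetchFn(`${baseUrl}/v1/crawl/${id}`);
    const job = await res.json();
    if (job.status === "completed") return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`crawl ${id} did not finish after ${maxPolls} polls`);
}

// Usage (Node 18+):
// const job = await runCrawl(fetch, "http://your-server:3000", "https://example.com");
```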
JavaScript rendering for SPAs is available via CRW's LightPanda integration — no separate browser configuration required.
## Resource Requirements
| Scenario | RAM | CPU | Recommended VPS |
|---|---|---|---|
| Light usage (<10 req/min) | ~50 MB | Low | $5/mo (1 vCPU, 1 GB) |
| Medium (<100 req/min) | ~200 MB | Medium | $12/mo (2 vCPU, 2 GB) |
| High (>100 req/min) | ~500 MB | High | $24/mo+ (4 vCPU, 4 GB) |
These are substantially smaller requirements than Firecrawl's self-hosted stack, which recommends 4 GB minimum for stable operation.
## Monitoring and Logs

```bash
# View live logs
docker logs -f crw

# Check resource usage
docker stats crw
```
CRW logs each request with method, URL, status code, and latency. Structured JSON logging is available with the `--log-format json` flag.
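Structured logs are easy to aggregate. The sketch below tallies a dump of newline-delimited JSON log entries into a small report; the field names (`status`, `latency_ms`) are assumptions about the JSON schema, not documented keys:

```javascript
// Summarize newline-delimited JSON logs into request count,
// 5xx error count, and average latency.
function summarizeLogs(text) {
  const entries = text
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line));
  const errors = entries.filter((e) => e.status >= 500).length;
  const avgLatency =
    entries.reduce((sum, e) => sum + e.latency_ms, 0) / (entries.length || 1);
  return { requests: entries.length, errors, avgLatency };
}
```

You could feed it the output of `docker logs crw` from a script, assuming the JSON log format is enabled.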
## When to Use fastCRW Instead
Self-hosting is ideal when you want full control and predictable costs. But if you need:
- Managed proxy rotation for blocked sites
- Auto-scaling for spiky traffic
- Zero maintenance overhead
…then fastCRW — the hosted version — is the better choice. You get the same API, same response format, and 50 free credits to start without a credit card.