
Self-Hosting Guide

Deploy fastCRW with a single binary or container when you want infrastructure control and a straightforward self-hosted setup.

Published: March 11, 2026
Updated: March 11, 2026
Category: docs
  • Single-binary or container deployment
  • Optional browser sidecar
  • Works well on small servers and private networks

Quick Start

docker run -p 3000:3000 ghcr.io/us/crw:latest

This starts a local endpoint on port 3000, so you can validate real targets before designing a larger deployment.
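
As a first check against the container started above, a request can be assembled and sent from a short script. This is a minimal Python sketch, not the documented client. The `/v1/scrape` path and the `url` body field are assumptions about the route layout, since this guide does not spell them out, so confirm both against the API reference for your version.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # port published by the docker run command
SCRAPE_PATH = "/v1/scrape"          # ASSUMPTION: route path is not given in this guide


def build_scrape_request(target_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST asking the service to scrape target_url."""
    # ASSUMPTION: the request body names the target with a "url" field.
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + SCRAPE_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> tuple[int, str]:
    """Send the request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

With the container running, `send(build_scrape_request("https://example.com"))` returns the status code and body. Building the request separately from sending it makes it easy to inspect the URL and payload before any traffic leaves the machine.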

What You Get

The self-hosted path is useful when you want to:

  • keep target traffic inside your own infrastructure,
  • control runtime cost directly,
  • and expose scrape, crawl, and map behind your own auth, network, and observability stack.

The API shape stays the same whether you use the managed cloud or your own deployment.

Recommended Workflow

  1. Boot the service locally or on a small VPS.
  2. Validate target URLs with the scrape, map, and crawl routes.
  3. Add LightPanda only when your workload requires browser-backed rendering.
  4. Put a reverse proxy, auth, and rate limits in front of it before exposing it beyond a trusted environment.
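
Step 4 is usually handled by an existing reverse proxy or gateway, but the two checks it performs are simple to reason about. The sketch below illustrates them in Python under stated assumptions: a static bearer key and a token-bucket limiter. Neither is a fastCRW feature, and the names `authorized` and `TokenBucket` are hypothetical.

```python
import hmac
import time
from typing import Callable, Optional


def authorized(header_value: Optional[str], expected_key: str) -> bool:
    """Check an Authorization header against a static bearer key (hypothetical scheme)."""
    if header_value is None:
        return False
    expected = f"Bearer {expected_key}".encode("utf-8")
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(header_value.encode("utf-8"), expected)


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.now = now              # injectable clock, useful for tests
        self.tokens = capacity      # start with a full bucket
        self.updated = now()

    def allow(self) -> bool:
        """Consume one token if available and report whether the request may pass."""
        current = self.now()
        elapsed = current - self.updated
        self.updated = current
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A gateway would reject a request when `authorized(...)` is false and answer with a rate-limit response when `allow()` is false. In practice, prefer the equivalent built-in features of your proxy over custom code.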

Early Validation Checklist

Before calling the deployment production-ready, test:

  • a simple static page through scrape,
  • a JS-heavy page with renderJs: true,
  • a small map request,
  • a bounded crawl request,
  • and failure cases such as invalid selectors or target-side 403 responses.

Together, these cases give a much clearer operational picture than a single happy-path URL.
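
The checklist above can be written down as data, so the same cases are replayed after every upgrade or configuration change. This is a sketch in Python under stated assumptions: only `renderJs` comes from this guide, while the `url`, `limit`, and `selector` field names and the helper `validation_checks` are hypothetical and should be replaced with the fields your version of the API accepts.

```python
from typing import NamedTuple


class Check(NamedTuple):
    name: str
    route: str      # ASSUMPTION: route names mirror "scrape", "map", and "crawl"
    payload: dict   # JSON body to send to that route


def validation_checks(static_url: str, js_url: str,
                      site_url: str, blocked_url: str) -> list[Check]:
    """Return one request description per item in the validation checklist."""
    return [
        Check("static page", "scrape", {"url": static_url}),
        Check("JS-heavy page", "scrape", {"url": js_url, "renderJs": True}),
        Check("small map", "map", {"url": site_url}),
        # ASSUMPTION: "limit" is a hypothetical name for the crawl bound.
        Check("bounded crawl", "crawl", {"url": site_url, "limit": 10}),
        # ASSUMPTION: "selector" is a hypothetical field; the value is deliberately malformed.
        Check("invalid selector", "scrape", {"url": static_url, "selector": "div[["}),
        # Point this at a target you know answers with 403.
        Check("target-side 403", "scrape", {"url": blocked_url}),
    ]
```

Each `Check` can be sent with whatever HTTP client you already use. For the two failure cases, the interesting result is how the error is reported rather than whether the request succeeds.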

What This Setup Is Good At

This setup is a good fit when you want to keep traffic inside your own infrastructure, control costs closely, or ship a private scraping service without managing a large crawler platform. If you need public, managed capacity instead, use the hosted product and keep the same API shape.

When Not To Self-Host

Choose the managed product instead if:

  • you want immediate capacity without operating the service,
  • your team does not want to manage browser dependencies,
  • or the main goal is product velocity rather than infrastructure control.

Self-hosting is valuable when ownership matters. It is not automatically the best default for every team.