Integrations/Integration / Go

Go Web Scraping API — fastCRW [Firecrawl-Compatible]

Scrape, crawl, and search from Go with fastCRW — a Firecrawl-compatible REST API backed by a single Rust binary. Worker pools, errgroup, rate limiting, per-request context timeouts. AGPL-3.0, self-host free.

Published

June 13, 2026

Updated

June 13, 2026

Verdict

fastCRW is Firecrawl-compatible REST — POST /v1/scrape, get back clean Markdown. Go has no official SDK, and it does not need one: net/http plus encoding/json is the complete client. What makes Go a strong fit is the concurrency story — compose errgroup.SetLimit(n), a per-host rate.Limiter, and per-request context.WithTimeout around /v1/scrape calls, and the engine handles rendering and extraction while your Go service handles concurrency. Under the hood: a single ~8 MB Rust binary that deploys alongside your Go service as just another binary.

Who This Is For

Go developers who want clean Markdown from the web — without parsing raw HTML or running a headless browser.
Teams building high-throughput scrapers — you want bounded concurrency with proper error propagation and rate limiting.
Backend services that need web enrichment — a Go API that calls fastCRW to enrich records on demand.
Self-hosting shops — a static Rust binary is a natural fit alongside Go services; AGPL-3.0, $0 per 1,000 scrapes.

Setup

1. Start the engine

Local (Docker):

docker run -p 3000:3000 ghcr.io/us/crw:latest
curl -s http://localhost:3000/health

Or use the managed cloud:

export FASTCRW_BASE_URL="https://api.fastcrw.com"
export FASTCRW_API_KEY="fcrw_..."

2. Build the client

Create a scraper package with a shared *http.Client. A default-transport client re-opens connections per request and quietly bottlenecks a high-concurrency scraper:

// scraper/client.go
package scraper

import (
    "net/http"
    "time"
)

// NewHTTPClient returns an *http.Client tuned for concurrent scraping.
// A single instance should be shared across all workers.
func NewHTTPClient() *http.Client {
    return &http.Client{
        Timeout: 0, // we use per-request context timeouts instead
        Transport: &http.Transport{
            MaxIdleConnsPerHost: 32,
            IdleConnTimeout:     90 * time.Second,
        },
    }
}

Quickstart: Scrape a Page

package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "time"
)

const (
    baseURL    = "https://api.fastcrw.com"
    timeoutSec = 25 // set above recall-mode p90; in fast mode p90 is 4348 ms (2026-05-08 benchmark)
)

type scrapeRequest struct {
    URL             string   `json:"url"`
    Formats         []string `json:"formats"`
    OnlyMainContent bool     `json:"onlyMainContent"`
}

type scrapeResponse struct {
    Success bool `json:"success"`
    Data    struct {
        Markdown string `json:"markdown"`
        Metadata struct {
            StatusCode int    `json:"statusCode"`
            Title      string `json:"title"`
        } `json:"metadata"`
    } `json:"data"`
    Error string `json:"error,omitempty"`
}

func scrapeURL(ctx context.Context, client *http.Client, apiKey, url string) (string, error) {
    ctx, cancel := context.WithTimeout(ctx, timeoutSec*time.Second)
    defer cancel()

    body, _ := json.Marshal(scrapeRequest{
        URL:             url,
        Formats:         []string{"markdown"},
        OnlyMainContent: true,
    })

    req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/v1/scrape", bytes.NewReader(body))
    if err != nil {
        return "", err
    }
    req.Header.Set("Authorization", "Bearer "+apiKey)
    req.Header.Set("Content-Type", "application/json")

    resp, err := client.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()

    var result scrapeResponse
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "", err
    }
    if !result.Success {
        return "", fmt.Errorf("scrape failed: %s", result.Error)
    }
    return result.Data.Markdown, nil
}

func main() {
    client := &http.Client{
        Transport: &http.Transport{MaxIdleConnsPerHost: 32},
    }
    apiKey := os.Getenv("FASTCRW_API_KEY")

    md, err := scrapeURL(context.Background(), client, apiKey, "https://docs.fastcrw.com")
    if err != nil {
        panic(err)
    }
    fmt.Printf("scraped %d chars\n", len(md))
}

Concurrent Batch Scraping with errgroup

errgroup.WithContext + SetLimit(n) is the idiomatic Go bounded fan-out: g.Go() blocks when all slots are occupied, so the limit IS your concurrency cap. The first non-nil error cancels the shared context so all siblings stop wasting work:

package main

import (
    "context"
    "fmt"
    "net/http"
    "os"
    "sync"

    "golang.org/x/sync/errgroup"
)

func batchScrape(ctx context.Context, client *http.Client, apiKey string, urls []string, concurrency int) ([]string, error) {
    results := make([]string, len(urls))
    var mu sync.Mutex

    g, gctx := errgroup.WithContext(ctx)
    g.SetLimit(concurrency) // blocks g.Go() until a slot is free

    for i, url := range urls {
        i, url := i, url // capture loop variables
        g.Go(func() error {
            md, err := scrapeURL(gctx, client, apiKey, url)
            if err != nil {
                return fmt.Errorf("url %s: %w", url, err)
            }
            mu.Lock()
            results[i] = md
            mu.Unlock()
            return nil
        })
    }

    if err := g.Wait(); err != nil {
        return nil, err
    }
    return results, nil
}

func main() {
    client := &http.Client{
        Transport: &http.Transport{MaxIdleConnsPerHost: 32},
    }
    apiKey := os.Getenv("FASTCRW_API_KEY")

    urls := []string{
        "https://docs.fastcrw.com",
        "https://fastcrw.com/pricing",
        "https://fastcrw.com/alternatives/firecrawl",
    }

    markdowns, err := batchScrape(context.Background(), client, apiKey, urls, 4)
    if err != nil {
        panic(err)
    }
    for i, md := range markdowns {
        fmt.Printf("%s: %d chars\n", urls[i], len(md))
    }
}

Install errgroup: go get golang.org/x/sync/errgroup

Latency note: fastCRW's p50 was 1914 ms — the fastest of three tools — on the 2026-05-08 benchmark (819 labeled URLs, diagnose_3way.py). In fast mode, fastCRW p90 is 4348ms — the lowest of the three (below Crawl4AI 4754ms and Firecrawl 6937ms). The chrome-stealth fallback in recall mode recovers hard URLs others drop — the same mechanism behind fastCRW's highest truth-recall of three tools (63.74%). Size your timeoutSec for the mode you use. Full breakdown at /benchmarks/firecrawl-dataset.

Per-Host Rate Limiting

Bounded concurrency limits how many requests overlap; rate limiting limits how many start per second. You need both — a pool of N workers can still hammer a single host at N requests the instant they all finish:

package main

import (
    "context"
    "net/url"
    "sync"

    "golang.org/x/time/rate"
)

// HostLimiter is a thread-safe per-host rate limiter.
type HostLimiter struct {
    mu       sync.RWMutex
    limiters map[string]*rate.Limiter
    r        rate.Limit // requests per second per host
    b        int        // burst size
}

func NewHostLimiter(rps float64, burst int) *HostLimiter {
    return &HostLimiter{
        limiters: make(map[string]*rate.Limiter),
        r:        rate.Limit(rps),
        b:        burst,
    }
}

func (h *HostLimiter) Wait(ctx context.Context, rawURL string) error {
    u, err := url.Parse(rawURL)
    if err != nil {
        return err
    }
    host := u.Hostname()

    h.mu.RLock()
    limiter, ok := h.limiters[host]
    h.mu.RUnlock()

    if !ok {
        h.mu.Lock()
        if limiter, ok = h.limiters[host]; !ok {
            limiter = rate.NewLimiter(h.r, h.b)
            h.limiters[host] = limiter
        }
        h.mu.Unlock()
    }

    return limiter.Wait(ctx) // respects context cancellation
}

Call hostLimiter.Wait(ctx, url) inside each worker before the scrape request. The Wait method respects context cancellation so a shutdown drains gracefully rather than hanging.

Install rate: go get golang.org/x/time/rate

Crawl a Whole Site

For BFS traversal of an entire domain, use /v1/crawl — it returns a job ID to poll, handles deduplication and politeness, and is bounded by maxDepth (cap 10) and maxPages (cap 1000). Do not reimplement BFS with a goroutine pool and a visited-set when the engine already provides it:

package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "time"
)

type crawlStartResponse struct {
    ID string `json:"id"`
}

type crawlStatusResponse struct {
    Status string `json:"status"`
    Data   []struct {
        Markdown string `json:"markdown"`
        Metadata struct {
            SourceURL string `json:"sourceURL"`
        } `json:"metadata"`
    } `json:"data"`
}

func crawlSite(ctx context.Context, client *http.Client, apiKey, seedURL string, maxPages, maxDepth int) (*crawlStatusResponse, error) {
    apiKey = os.Getenv("FASTCRW_API_KEY")
    base := "https://api.fastcrw.com"
    headers := map[string]string{
        "Authorization": "Bearer " + apiKey,
        "Content-Type":  "application/json",
    }

    startBody, _ := json.Marshal(map[string]interface{}{
        "url":      seedURL,
        "limit":    maxPages,
        "maxDepth": maxDepth,
        "scrapeOptions": map[string]interface{}{
            "formats":         []string{"markdown"},
            "onlyMainContent": true,
        },
    })

    startReq, _ := http.NewRequestWithContext(ctx, http.MethodPost, base+"/v1/crawl", bytes.NewReader(startBody))
    for k, v := range headers {
        startReq.Header.Set(k, v)
    }

    startResp, err := client.Do(startReq)
    if err != nil {
        return nil, err
    }
    defer startResp.Body.Close()

    var start crawlStartResponse
    json.NewDecoder(startResp.Body).Decode(&start)

    // Poll until complete
    for {
        pollReq, _ := http.NewRequestWithContext(ctx, http.MethodGet, base+"/v1/crawl/"+start.ID, nil)
        for k, v := range headers {
            pollReq.Header.Set(k, v)
        }

        pollResp, err := client.Do(pollReq)
        if err != nil {
            return nil, err
        }

        var status crawlStatusResponse
        json.NewDecoder(pollResp.Body).Decode(&status)
        pollResp.Body.Close()

        if status.Status == "completed" {
            return &status, nil
        }
        time.Sleep(2 * time.Second)
    }
}

func main() {
    client := &http.Client{Transport: &http.Transport{MaxIdleConnsPerHost: 16}}
    result, err := crawlSite(context.Background(), client, os.Getenv("FASTCRW_API_KEY"), "https://docs.fastcrw.com", 25, 3)
    if err != nil {
        panic(err)
    }
    for _, page := range result.Data {
        fmt.Printf("%d chars  %s\n", len(page.Markdown), page.Metadata.SourceURL)
    }
}

Web Search

package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
)

func search(ctx context.Context, client *http.Client, query string, limit int) ([]map[string]interface{}, error) {
    body, _ := json.Marshal(map[string]interface{}{
        "query": query,
        "limit": limit,
    })

    req, _ := http.NewRequestWithContext(ctx, http.MethodPost, "https://api.fastcrw.com/v1/search", bytes.NewReader(body))
    req.Header.Set("Authorization", "Bearer "+os.Getenv("FASTCRW_API_KEY"))
    req.Header.Set("Content-Type", "application/json")

    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var result struct {
        Data []map[string]interface{} `json:"data"`
    }
    json.NewDecoder(resp.Body).Decode(&result)
    return result.Data, nil
}

func main() {
    client := &http.Client{}
    results, err := search(context.Background(), client, "go web scraping api 2026", 5)
    if err != nil {
        panic(err)
    }
    for _, r := range results {
        fmt.Println(r["title"], "→", r["url"])
    }
}

MCP Setup

fastCRW ships an MCP server (crw-mcp on npm) for AI agents that need live web data from Go-based tools:

{
  "mcpServers": {
    "fastcrw": {
      "command": "npx",
      "args": ["-y", "crw-mcp@latest"],
      "env": {
        "FASTCRW_API_KEY": "fcrw_...",
        "FASTCRW_API_URL": "https://api.fastcrw.com"
      }
    }
  }
}

See /integrations/mcp for full configuration options.

Self-Hosting Next to Your Go Service

The engine is a single ~8 MB static Rust binary — no Redis, no Node.js, no multi-container stack. For Go teams that already deploy static binaries, "the scrape backend is just another binary next to ours" is an easy operational fit:

# docker-compose.yml (excerpt)
services:
  crw:
    image: ghcr.io/us/crw:latest
    ports: ["3000:3000"]

  go-scraper:
    build: .
    environment:
      FASTCRW_BASE_URL: "http://crw:3000"
      FASTCRW_API_KEY: ""  # not required for self-hosted
    depends_on: [crw]

Self-hosting the AGPL-3.0 engine is $0 per 1,000 scrapes — you pay only your server. See /pricing for managed cloud tiers.

Good to Know

No official Go SDK, and you don't need one — net/http + encoding/json is the whole client (shown above): no external dependency, full control over the transport, and it slots directly into the concurrency patterns below.
Screenshots — formats: ["screenshot"] is served by /v2/scrape.
Batch scraping — fan out g.Go() calls over /v1/scrape, use /v1/crawl for site-wide jobs, or call /v2/batch/scrape directly for a single-request batch.
Stateless per request — no persistent session or cookie jar across calls; pass any auth/session state your target needs on each request.
LLM extraction — powered by the managed LLM, available on paid plans (the FREE plan has no LLM features).

fastCRWlive

Scrape any URL, live

Get 500 free credits →

Sources

fastCRW scrape docs

/docs/scrape

fastCRW 3-way scrape benchmark

/benchmarks/firecrawl-dataset

fastCRW search docs

/docs/search

golang.org/x/sync/errgroup

https://pkg.go.dev/golang.org/x/sync/errgroup

FAQ

Does fastCRW have an official Go SDK?

No, and you do not need one. The API is Firecrawl-compatible REST — POST /v1/scrape with a JSON body and a Bearer token, unmarshal the response. net/http and encoding/json are the whole client. This is arguably the better fit for Go: no external dependency, full control over the transport, and it slots directly into the concurrency patterns below.

How do I limit goroutine concurrency for web scraping in Go?

Use golang.org/x/sync/errgroup with g.SetLimit(n) — g.Go() blocks until a slot frees, so the limit IS the concurrency cap. Alternatively use a fixed worker pool reading from a jobs channel, or a weighted semaphore (golang.org/x/sync/semaphore) acquired before each request. Never spawn an unbounded goroutine per URL: sockets and the remote site's patience are the constrained resources, not goroutines.

Why does p90 latency matter more than p50 for Go scrapers?

When you fan out many requests, wall-clock time is gated by the slowest requests in each batch. On the 2026-05-08 3-way benchmark (diagnose_3way.py), fastCRW's p50 was 1914 ms — the fastest of three tools. In fast mode, fastCRW p90 is 4348ms — the lowest of three. Manage it with per-request context timeouts (20–25 s is sane for recall mode) and modest pool oversubscription so chrome-stealth recoveries are not killed prematurely.

How do I rate-limit a Go scraper per host?

Keep a map[string]*rate.Limiter (golang.org/x/time/rate) guarded by a sync.RWMutex and call limiter.Wait(ctx) before each request. Per-host is the correct granularity — a global limiter throttles you against fast hosts to protect slow ones. Honor robots.txt Crawl-delay by feeding it into the limiter, and use exponential backoff with jitter on 429/503 responses.

When should I use /v1/crawl instead of fanning out /v1/scrape goroutines?

Fan out /v1/scrape goroutines when you have a known list of URLs. Use POST /v1/crawl when you have a seed URL and a boundary and want breadth-first traversal, deduplication, and politeness handled for you. /v1/crawl returns a job ID you poll via GET /v1/crawl/{id} and cancel via DELETE /v1/crawl/{id}, honoring maxDepth (cap 10) and maxPages (cap 1000). Reimplementing BFS with a goroutine pool and a visited-set just rebuilds what /v1/crawl already provides.

Recommended next step

Run a live scrape before you commit.

Use the hosted demo to test scrape, crawl, or map output with fastCRW semantics.

Try Playground

Continue exploring

More from Integrations

View all integrations

Previous in Integrations

Python Web Scraping API — fastCRW [Firecrawl-Compatible]

Next in Integrations

Make Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Integrations

MCP Web Scraping Integration — fastCRW [Firecrawl-Compatible]

fastCRW ships an official MCP server (crw-mcp) exposing scrape, search, crawl, map, and extract to any MCP-compatible client. Small single static binary, local-first, self-host free under AGPL-3.0.

mcp web scrapingOfficial crw-mcp server, single npx command

Integrations

Google ADK Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Wire fastCRW into Google's Agent Development Kit as a FunctionTool. Firecrawl-compatible scrape and search, small single static binary, local-first, self-host free under AGPL-3.0.

google adk web scrapingFunctionTool wrappers for fastCRW scrape and search

Integrations

OpenAI Agents SDK Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Give OpenAI Agents SDK agents a fastCRW scrape and search tool with the @function_tool decorator. Small single static binary, local-first, Firecrawl-compatible API, self-host free under AGPL-3.0.

openai agents sdk web scrapingNative function_tool wrappers for scrape and search

Related hubs

Keep the crawl path moving

Docs

Drop into endpoint reference once your integration is wired up.

Use Cases

See where this integration shape fits common AI-agent workloads.

Alternatives

Compare fastCRW against other scraping APIs your stack might consider.