Skip to main content
Use Cases/Use Case / AI Agents

Web Scraping API for AI Agents

Give AI agents live web context via fastCRW — a Firecrawl-compatible scrape, search, crawl, and map API with an official MCP server, clean markdown output, and a single static Rust binary you can self-host free.

Published
March 11, 2026
Updated
June 13, 2026
Category
use cases
Verdict

fastCRW gives AI agent frameworks a single endpoint surface for web retrieval — scrape, search, map, and crawl via REST or MCP — with the highest truth-recall of the three tools benchmarked on Firecrawl's public 1,000-URL dataset (63.74%, `diagnose_3way.py`, 2026-05-08) and a single ~8 MB binary that runs without Redis, Node.js, or container orchestration.

MCP-native: official crw-mcp@0.6.0 server, stdio + HTTP transports63.74% truth-recall on Firecrawl's 1,000-URL dataset — highest of three tools benchmarkedREST + MCP in one binary — LangChain, CrewAI, LangGraph, Pydantic AI all supportedFirecrawl-compatible drop-in — swap the base URL, keep your agent tool definitionsp50 latency 1,914 ms on the 3-way benchmark (`diagnose_3way.py`, 2026-05-08)

Why AI Agents Need a Purpose-Built Web Retrieval Layer

AI agents are fundamentally different from batch scrapers. A nightly ETL job tolerates a 30-second scrape because it runs unattended. An agent calling a tool mid-reasoning loop cannot — every extra second of tool latency is a second the user waits, and it compounds across every step in the plan.

The consequences of a slow or inaccurate scraping layer are worse for agents than for pipelines:

  • Latency multiplies. A 4-second scrape inside a 10-step agent plan adds 40 seconds of wall-clock time. At p90 latency differences, the gap between tools can determine whether a product is usable or abandoned.
  • Hallucination fills accuracy gaps. When a scraper returns garbled HTML, a navigational page, or an empty result, the agent has two options: surface the failure, or fill the gap with a hallucination. Most models choose the latter. Higher truth-recall at the scraping layer means fewer gaps to fill.
  • Output format shapes token cost. Raw HTML returned to an agent wastes context tokens on nav bars, cookie banners, and script tags. Clean markdown means the agent reasons over content, not formatting noise.
  • Operational overhead hurts debugging. Multi-container scraping stacks add failure modes (Redis queue saturation, browser pool exhaustion, sidecar crashes) that are hard to diagnose inside an agent loop. A single binary with no dependencies is much easier to reason about.

fastCRW was designed with these constraints in mind: a single static Rust binary, a small predictable endpoint surface, clean markdown output, and the highest truth-recall of the three tools tested on Firecrawl's public 1,000-URL dataset (63.74%, diagnose_3way.py, 2026-05-08).

What fastCRW Gives an Agent

Agent needfastCRW endpointOutput
Discover relevant URLs by topicPOST /v1/searchRanked URL list with titles and snippets
Extract full page content from a URLPOST /v1/scrapeClean markdown (or structured JSON via formats: ["json"])
Discover all reachable pages on a domainPOST /v1/mapFlat URL list
Collect an entire site section recursivelyPOST /v1/crawlJob ID → async page collection
Expose all tools to an MCP-compatible hostPOST /mcp (or stdio)Native MCP tool surface

For most agent workloads, search and scrape are the only two tools needed. map is useful for agents that need to reason about a site's structure before choosing which pages to read. crawl is best run outside the agent loop (it's async), not as a blocking tool call.

Integration Paths

Path 1 — MCP (no REST wiring needed)

If your agent host supports MCP — Claude Code, Cursor, Windsurf, Continue, Zed, or a custom client built on the MCP SDKs — the fastest integration is the official crw-mcp server:

# Install and run (requires Node.js >= 18)
npx crw-mcp@0.6.0

# Or add to ~/.claude/config.json (Claude Code example)
{
  "mcpServers": {
    "fastcrw": {
      "command": "npx",
      "args": ["crw-mcp@0.6.0"],
      "env": { "FASTCRW_API_KEY": "YOUR_API_KEY" }
    }
  }
}

The MCP server exposes five tools: scrape, search, crawl, map, and extract. The agent host presents these as native tools the model can call — no function definitions, no schema writing.

For self-hosted deployments, the engine also accepts MCP over HTTP (POST /mcp with streamable HTTP transport) — useful when the fastCRW binary runs on a remote server and the agent connects over the network.

Path 2 — REST with curl

Direct HTTP is the baseline for custom agent systems:

# Search — discover relevant URLs for a topic
curl -X POST https://api.fastcrw.com/v1/search \
  -H "Authorization: Bearer $FASTCRW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "LangGraph multi-agent architecture patterns 2025",
    "limit": 5
  }'

# Scrape — extract full content from a known URL
curl -X POST https://api.fastcrw.com/v1/scrape \
  -H "Authorization: Bearer $FASTCRW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://langchain-ai.github.io/langgraph/concepts/multi_agent/",
    "formats": ["markdown"]
  }'

Path 3 — Python with a thin agent loop

import os
import requests
from typing import Any

FASTCRW_API_KEY = os.environ["FASTCRW_API_KEY"]
BASE_URL = "https://api.fastcrw.com/v1"

def search_web(query: str, limit: int = 5) -> list[dict]:
    """Tool: search the web for a topic and return ranked URLs + snippets."""
    response = requests.post(
        f"{BASE_URL}/search",
        headers={"Authorization": f"Bearer {FASTCRW_API_KEY}"},
        json={"query": query, "limit": limit},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("data", [])


def scrape_url(url: str) -> str:
    """Tool: fetch full markdown content from a URL."""
    response = requests.post(
        f"{BASE_URL}/scrape",
        headers={"Authorization": f"Bearer {FASTCRW_API_KEY}"},
        json={"url": url, "formats": ["markdown"]},
        timeout=30,
    )
    response.raise_for_status()
    data = response.json()
    # Surface warnings so the agent knows when content is partial
    if warning := data.get("warning"):
        print(f"[fastCRW warning] {url}: {warning}")
    return data.get("data", {}).get("markdown", "")


def scrape_structured(url: str, schema: dict) -> dict:
    """Tool: extract structured JSON from a URL using LLM extraction (5 credits)."""
    response = requests.post(
        f"{BASE_URL}/scrape",
        headers={"Authorization": f"Bearer {FASTCRW_API_KEY}"},
        json={"url": url, "formats": ["json"], "jsonSchema": schema},
        timeout=60,
    )
    response.raise_for_status()
    return response.json().get("data", {}).get("json", {})


# --- Minimal research agent loop ---

def research_agent(question: str, max_steps: int = 4) -> str:
    """
    A simple search-scrape-synthesize agent loop.
    In production, replace the synthesis step with an LLM call.
    """
    findings: list[str] = []

    for step in range(max_steps):
        print(f"\n[Step {step + 1}] Searching: {question}")
        results = search_web(question, limit=3)

        if not results:
            print("  No results found. Stopping.")
            break

        for result in results[:2]:  # Scrape top 2 results per step
            url = result.get("url", "")
            title = result.get("title", url)
            print(f"  Scraping: {title[:60]} ({url[:60]})")

            content = scrape_url(url)
            if content:
                findings.append(f"## Source: {url}\n\n{content[:2000]}")

    return "\n\n---\n\n".join(findings)


if __name__ == "__main__":
    report = research_agent("what are the best practices for building multi-agent LangGraph workflows")
    print("\n=== RESEARCH REPORT ===")
    print(report[:3000])

Path 4 — TypeScript / Node.js

const FASTCRW_API_KEY = process.env.FASTCRW_API_KEY!;
const BASE_URL = "https://api.fastcrw.com/v1";

interface SearchResult {
  url: string;
  title: string;
  description: string;
}

interface ScrapeResult {
  markdown: string;
  warning?: string;
}

async function searchWeb(query: string, limit = 5): Promise<SearchResult[]> {
  const res = await fetch(`${BASE_URL}/search`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${FASTCRW_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query, limit }),
  });
  if (!res.ok) throw new Error(`Search failed: ${res.status}`);
  const data = await res.json();
  return data.data ?? [];
}

async function scrapeUrl(url: string): Promise<ScrapeResult> {
  const res = await fetch(`${BASE_URL}/scrape`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${FASTCRW_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url, formats: ["markdown"] }),
  });
  if (!res.ok) throw new Error(`Scrape failed: ${res.status} for ${url}`);
  const data = await res.json();
  return {
    markdown: data.data?.markdown ?? "",
    warning: data.warning,
  };
}

// OpenAI-compatible tool definitions for function calling
export const fastcrwTools = [
  {
    type: "function" as const,
    function: {
      name: "search_web",
      description:
        "Search the live web for a topic and return a ranked list of URLs with titles and snippets. Use this when you need to discover sources — not when you already have a URL.",
      parameters: {
        type: "object",
        properties: {
          query: {
            type: "string",
            description: "The search query",
          },
          limit: {
            type: "number",
            description: "Max results to return (default 5, max 10)",
          },
        },
        required: ["query"],
      },
    },
  },
  {
    type: "function" as const,
    function: {
      name: "scrape_url",
      description:
        "Fetch the full content of a URL and return clean markdown. Use this when you already have a specific URL to read.",
      parameters: {
        type: "object",
        properties: {
          url: {
            type: "string",
            description: "The URL to scrape",
          },
        },
        required: ["url"],
      },
    },
  },
];

// Dispatch tool calls from an OpenAI-compatible response
export async function dispatchToolCall(
  name: string,
  args: Record<string, unknown>
): Promise<string> {
  if (name === "search_web") {
    const results = await searchWeb(args.query as string, (args.limit as number) ?? 5);
    return JSON.stringify(results);
  }
  if (name === "scrape_url") {
    const result = await scrapeUrl(args.url as string);
    if (result.warning) {
      console.warn(`[fastCRW warning] ${args.url}: ${result.warning}`);
    }
    // Return only the markdown — not the full envelope
    return result.markdown;
  }
  throw new Error(`Unknown tool: ${name}`);
}

Framework Integration Matrix

FrameworkIntegration methodPackage / docs
LangChain (Python)langchain-crw package or FirecrawlLoader with api_url overrideLangChain integration
CrewAIcrewai-crw package or BaseTool subclassCrewAI integration
LangGraphLangChain tools wired into graph nodesLangGraph integration
Pydantic AIDirect httpx calls inside @tool functionsPydantic AI integration
OpenAI Agents SDKFunction tool definitions (see TypeScript snippet above)OpenAI Agents SDK integration
Claude Code / Cursor / WindsurfMCP via crw-mcp@0.6.0MCP integration
Custom agent (any language)REST — POST /v1/search + POST /v1/scrapePricing

All Firecrawl-based agent tools (LangChain's FirecrawlLoader, CrewAI's built-in Firecrawl tool, etc.) work with fastCRW after a single base-URL change:

# Before (Firecrawl)
loader = FirecrawlLoader(url=url, api_key=api_key)

# After (fastCRW — one line change)
loader = FirecrawlLoader(url=url, api_key=api_key, api_url="https://api.fastcrw.com")

LangChain Agent Example (Python)

from langchain_community.document_loaders import FirecrawlLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.tools import Tool
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI
import requests, os

FASTCRW_API_KEY = os.environ["FASTCRW_API_KEY"]
FASTCRW_BASE = "https://api.fastcrw.com"

def fastcrw_search(query: str) -> str:
    """Search the web and return URL list with titles."""
    resp = requests.post(
        f"{FASTCRW_BASE}/v1/search",
        headers={"Authorization": f"Bearer {FASTCRW_API_KEY}"},
        json={"query": query, "limit": 5},
    )
    results = resp.json().get("data", [])
    return "\n".join(f"- {r['title']}: {r['url']}" for r in results)


def fastcrw_scrape(url: str) -> str:
    """Scrape a URL and return clean markdown content."""
    # Use the Firecrawl-compatible loader with the fastCRW base URL
    loader = FirecrawlLoader(
        url=url,
        api_key=FASTCRW_API_KEY,
        api_url=FASTCRW_BASE,
        mode="scrape",
    )
    docs = loader.load()
    return docs[0].page_content if docs else "No content retrieved."


tools = [
    Tool(
        name="search_web",
        func=fastcrw_search,
        description=(
            "Search the live web for a topic. Returns a ranked list of relevant URLs. "
            "Use this to discover sources, not to read a specific page."
        ),
    ),
    Tool(
        name="scrape_url",
        func=fastcrw_scrape,
        description=(
            "Fetch and extract the full text of a web page. Input must be a complete URL. "
            "Use this after search_web to read the content behind a URL."
        ),
    ),
]

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
agent = create_openai_tools_agent(llm, tools, prompt=None)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=6)

result = executor.invoke({
    "input": "What are the most popular open-source multi-agent frameworks in 2025?"
})
print(result["output"])

Structured Extraction for Agent Memory

Agents often need structured facts, not raw prose. Pass a jsonSchema to /v1/scrape to extract typed fields:

curl -X POST https://api.fastcrw.com/v1/scrape \
  -H "Authorization: Bearer $FASTCRW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://github.com/langchain-ai/langgraph",
    "formats": ["json"],
    "jsonSchema": {
      "type": "object",
      "properties": {
        "repo_name": { "type": "string" },
        "description": { "type": "string" },
        "stars": { "type": "number" },
        "main_use_cases": {
          "type": "array",
          "items": { "type": "string" }
        },
        "latest_version": { "type": "string" }
      },
      "required": ["repo_name", "description"]
    }
  }'

LLM extraction (formats: ["json"]) costs 5 credits per request (CANONICAL-FACTS.md §3, 2026-05-22). Plain markdown extraction costs 1 credit (http/lightpanda renderer) or 2 credits (Chrome renderer).

Site Mapping for Agent Planning

When an agent needs to reason about what's available on a domain before choosing what to scrape:

def map_domain_for_agent(domain: str, include_pattern: str | None = None) -> list[str]:
    """
    Discover all reachable URLs on a domain.
    Returns a flat list the agent can filter and prioritize.
    """
    payload: dict = {"url": domain}
    if include_pattern:
        payload["includePatterns"] = [include_pattern]

    resp = requests.post(
        f"{BASE_URL}/map",
        headers={"Authorization": f"Bearer {FASTCRW_API_KEY}"},
        json=payload,
    )
    resp.raise_for_status()
    return resp.json().get("urls", [])


# Example: agent discovers all documentation pages before deciding what to read
docs_urls = map_domain_for_agent(
    "https://docs.anthropic.com",
    include_pattern="*/tool-use/*"
)
print(f"Found {len(docs_urls)} tool-use documentation pages")
# Agent now selects the most relevant URLs to scrape rather than crawling blindly

Map costs 1 credit and returns the full URL inventory in seconds (CANONICAL-FACTS.md §3, 2026-05-22). It is far cheaper than crawling blindly when the agent only needs the URL list.

Benchmark: Why Accuracy Matters for Agent Quality

Agent systems amplify the accuracy of their tools. If a scraper returns garbled or truncated content, the agent's downstream reasoning is grounded in bad data — and the agent either propagates the error or hallucinates to cover the gap.

On Firecrawl's own public scrape-content-dataset-v1 (1,000 URLs, 819 carry labeled ground truth), the three-way benchmark produced (diagnose_3way.py, 2026-05-08):

ToolTruth-recall (of 819 labeled)Scrape-success (of 1,000)p50 latency
fastCRW63.74% (522 URLs)87.7% (877)1,914 ms
Crawl4AI59.95% (491 URLs)83.5% (835)1,916 ms
Firecrawl56.04% (459 URLs)89.7% (897)2,305 ms

fastCRW's +7.70 pp truth-recall advantage over Firecrawl means that for every 100 pages an agent scrapes, it gets accurate content on roughly 8 more pages. Over a typical agent workflow that touches dozens of sources, this compounds into meaningfully better grounded answers.

One honest disclosure: fastCRW's p90 latency (14,157 ms) is the highest of the three, because the Chrome-stealth fallback that recovers URLs the others miss is the same mechanism that creates the slow tail. For most agent tasks, the p50 win (1,914 ms vs 2,305 ms for Firecrawl) is the relevant figure. See the full benchmark for the complete p50/p90/p99 breakdown.

Production Considerations for Agent Deployments

Concurrency and rate limiting:

  • fastCRW handles concurrent requests — run parallel scrapes for multi-source agent tasks
  • Stay within your plan's concurrent-request limit; implement a semaphore if your agent spawns many parallel tool calls
  • For burst workloads, the Scale plan (1,000,000 credits/mo) provides headroom for agent workflows that make thousands of calls per day

Error handling in agent loops:

  • Treat HTTP 4xx/5xx as tool errors, not agent failures — surface them as tool results so the planner can retry or pivot
  • The warning field in scrape responses signals partial content; log it, don't suppress it
  • Implement exponential backoff for 429 (rate limit) responses; agents that retry immediately on rate limit will compound the problem

Cost estimation:

  • Plain markdown scrape: 1 credit (http/lightpanda) or 2 credits (Chrome) per URL
  • Structured extraction with formats: ["json"]: 5 credits per URL
  • Search query: 1 credit per query
  • For an agent that calls search once and scrapes 5 pages per task: ~6–11 credits per run
  • At 1,000 tasks/month: 6,000–11,000 credits, covered comfortably by the Hobby plan (3,000 credits/mo) to Standard plan (100,000 credits/mo) depending on volume

See pricing for current plan rates.

Self-hosting for cost control:

  • The AGPL-3.0 binary is free to run — you pay only your server
  • A single VPS (2 vCPU, 4 GB RAM) handles hundreds of concurrent agent scrapes
  • The Docker image is ~8 MB (CANONICAL-FACTS.md §7, 2026-05-22); total footprint with LightPanda is well under 100 MB
  • The API shape is identical to the managed cloud — switching between self-hosted and managed is a base-URL change

Good Fits vs Poor Fits

Good fits for fastCRW in agent pipelines:

  • Research agents that read public documentation, news, and open web sources
  • Competitive-intelligence agents monitoring competitor sites, pricing pages, and announcements
  • Copilots that answer questions by grounding in current web content
  • Agents that need MCP-native tool access inside Claude Code, Cursor, or Windsurf
  • Multi-agent frameworks (CrewAI, LangGraph) where a shared scraping tool reduces per-agent integration overhead

Poor fits (use a different tool):

  • Agents that need to log into authenticated web applications and maintain session state
  • Workflows where the primary task is complex multi-step browser automation (form filling, UI testing)
  • Agents that need screenshots — formats: ["screenshot"] is not supported; requests return HTTP 422

FAQ

Q: Why do AI agents need a different scraping stack than a batch ETL pipeline?

A: Agents make tool calls in tight reasoning loops — latency multiplies across every plan step, retry, and reflection pass. A scraping stack that adds 2–4 seconds per call inside a 10-step agent loop adds 20–40 seconds of wall-clock latency before the agent produces a result. fastCRW's p50 latency of 1,914 ms on the 3-way benchmark (diagnose_3way.py, 2026-05-08) is optimized for this use case. Beyond latency, agent systems also need predictable output shape (clean markdown, not raw HTML) so the model can reason over retrieved content without a parsing step.

Q: What is MCP and why does it matter for AI agents?

A: MCP (Model Context Protocol) is the standard for exposing tools and resources to LLM hosts. fastCRW ships an official MCP server (crw-mcp@0.6.0 on npm) that exposes scrape, search, crawl, map, and extract as MCP tools. Any MCP-compatible host — Claude Code, Cursor, Windsurf, Continue, or a custom agent built on the MCP SDK — can use fastCRW as a native tool without any REST wiring. Run npx crw-mcp@0.6.0 to start the server; it handles the stdio transport by default.

Q: How does fastCRW compare to Firecrawl for agent use cases?

A: fastCRW is Firecrawl-compatible at the API level — the same /v1/scrape, /v1/crawl, /v1/map, and /v1/search shape. The practical differences for agents: fastCRW reached 63.74% truth-recall on Firecrawl's own public 1,000-URL dataset versus Firecrawl's 56.04% (diagnose_3way.py, 2026-05-08), which means agents ground their answers in more accurate content. fastCRW also ships as a single ~8 MB binary you can self-host for free under AGPL-3.0, versus Firecrawl's multi-container stack. Existing agent tools written for Firecrawl need only a base-URL change to point at fastCRW.

Q: Which agent frameworks work with fastCRW out of the box?

A: LangChain (via the langchain-crw package or FirecrawlLoader with api_url override), CrewAI (via the crewai-crw package or a BaseTool subclass), LangGraph (direct HTTP calls or via LangChain tools), Pydantic AI (direct httpx calls), OpenAI Agents SDK (function tool wrapping /v1/scrape and /v1/search), and any framework that speaks MCP (via crw-mcp). For a custom agent, the minimal integration is two function tools wrapping POST /v1/search and POST /v1/scrape.

Q: Should I use scrape or search as the primary agent tool?

A: Both. Give the agent both tools and let the planner choose. Search is for topic discovery when the agent does not already have a target URL — it returns a ranked list of relevant URLs with snippets. Scrape is for content extraction when the agent has a URL and needs the full page. A common pattern: the agent calls search, picks the top 2–3 URLs, scrapes each one, and synthesizes the content. Do not force the agent to choose in advance — expose both and let it reason about which is appropriate.

Q: Can I self-host fastCRW to avoid cloud API costs for high-volume agent workloads?

A: Yes. fastCRW is AGPL-3.0 open source. The engine is a single static Rust binary (~8 MB Docker image) with no Redis, no Node.js, and no container orchestration required. You pay only your server costs. Self-hosting is the right choice when your agent makes thousands of tool calls per day and the per-credit cloud cost exceeds a small VPS. The API shape is identical to the managed cloud, so migrating between self-hosted and managed is a base-URL change.

Continue exploring

More from Use Cases

View all use cases

Related hubs

Keep the crawl path moving