How do I add web access to a smolagents code agent?

Write a Python function decorated with smolagents' @tool that POSTs to fastCRW's Firecrawl-compatible REST API — /v1/scrape for a known URL, /v1/search for discovery — and returns the cleaned markdown. Pass the tool(s) to a CodeAgent and the agent will call them as ordinary Python during its loop. The whole integration is a few lines; no SDK or extra runtime is required.

How big is the fastCRW footprint?

fastCRW's engine is a single statically-linked Rust binary — roughly an 8 MB Docker image running in one container, with no Redis, Node.js, or browser farm required for the common case. The repo README frames this as a structural fact versus a heavier scraper stack of about 2–3 GB across five containers. For a minimalist framework like smolagents, that keeps the stack lean.

Can smolagents run fastCRW locally with no cloud?

Yes. The crw Python SDK exposes CrwClient(), which runs a self-contained local engine with no API key and no network egress. Call it from inside your @tool function and scraped content never leaves your machine. The engine is AGPL-3.0, so self-hosting is free — you pay only for the server it runs on.

Does fastCRW give clean markdown or raw HTML?

Request formats: ['markdown'] and fastCRW returns the main content as clean, LLM-ready markdown with nav chrome, scripts, and boilerplate stripped, which saves tokens and gives the code agent better context than raw HTML. JSON-schema structured extraction is also available via formats: ['json']. It works on a single URL, is billed as the 1-credit scrape plus the LLM token cost (charged as usage-metered LLM credits), and runs on the fastCRW managed LLM, which is available on paid plans.

Is fastCRW more accurate than Firecrawl at extraction?

On the one public benchmark we ran, yes: against Firecrawl's own scrape-content-dataset-v1, fastCRW posted the highest truth-recall of the three tools tested — 63.74% of 819 labeled URLs versus Crawl4AI 59.95% and Firecrawl 56.04% (diagnose_3way.py, 2026-05-08). That is a single run on one dataset, not a universal guarantee. fastCRW's scrape success was 91.8% of reachable URLs with 0 errors. In fast mode, fastCRW's p90 is 4348 ms — the lowest of the three.

Smolagents + fastCRW: Web Grounding, Zero Bloat

By the fastCRW team · Benchmark figures verified 2026-05-18 against the 2026-05-08 run · Verify independently before quoting internally.

If you reached for Hugging Face smolagents, you did it on purpose: it is a deliberately tiny agent framework — a few thousand lines, code agents that write and run Python rather than emit JSON blobs, minimal dependencies. The fastest way to ruin that is to bolt a multi-gigabyte web-data service onto the side of it. This guide wires smolagents to fastCRW for web search and scraping while keeping the whole stack lean: fastCRW is a single ~8 MB AGPL-3.0 Rust binary running in one container, it exposes a Firecrawl-compatible REST API you can reach with a base-URL swap, and it posts the highest truth-recall of the three scrapers we benchmarked (63.74% of 819 labeled URLs, diagnose_3way.py, 2026-05-08).

Disclosure: we build fastCRW. This is a vendor-authored tutorial, so weight it accordingly.

Smolagents' minimalist philosophy and the web

Code agents that call tools as Python

A smolagents CodeAgent does not pick tools from a menu of JSON schemas; it writes Python that calls your tools as ordinary functions, runs that code, observes the result, and iterates. A tool is just a Python callable decorated with @tool and a docstring. That means your web layer should look like a normal function that returns clean text — not a heavyweight SDK with its own runtime, queue, and browser pool. fastCRW fits that shape: one HTTP call, markdown back.

Why a lean web backend fits the smolagents ethos

The smolagents pitch is that you can read the whole framework in an afternoon and run it anywhere. A web-data dependency that needs five containers and a couple of gigabytes of RAM breaks that promise — your "tiny agent" now drags a platform-team-sized stack behind it. fastCRW is the opposite: a single statically-linked binary, no Redis, no Node.js, no browser farm required for the common case. The README labels the footprint as a structural fact (one ~8 MB binary / 1 container vs Firecrawl's ~2–3 GB across 5 containers), not a benchmark, so it holds regardless of load.

Write a fastCRW tool for smolagents

A @tool function calling the REST API

fastCRW speaks a Firecrawl-compatible REST surface, so the call is a plain POST /v1/scrape. Point it at your managed endpoint (https://fastcrw.com) or a locally self-hosted engine — the only difference is the base URL. Here is the whole tool:

Import requests and the smolagents @tool decorator.
Read the base URL and key from environment so the same tool works against cloud or local.
Return data.markdown — clean, LLM-ready text — and nothing else.

In code:

from smolagents import tool, CodeAgent, InferenceClientModel
import os, requests

BASE = os.environ.get("CRW_BASE_URL", "https://fastcrw.com")
KEY = os.environ["CRW_API_KEY"]

@tool
def scrape_page(url: str) -> str:
  """Fetch a web page and return its main content as clean markdown.
  Args:
    url: The absolute URL to scrape."""
  r = requests.post(f"{BASE}/v1/scrape",
    headers={"Authorization": f"Bearer {KEY}"},
    json={"url": url, "formats": ["markdown"]}, timeout=30)
  r.raise_for_status()
  return r.json()["data"]["markdown"]

That is the entire integration. Because fastCRW mirrors Firecrawl's request shape, anyone already calling Firecrawl from a smolagents tool can switch by changing BASE — no rewrite. If you prefer the Python SDK over raw requests, the crw package (PyPI) exposes CrwClient() and can run a self-contained local engine, which we use below for the zero-cloud variant.
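The tool above assumes CRW_API_KEY is always set and that every response carries data.markdown. As a hedged sketch, the two helpers below relax both assumptions. The names auth_headers and markdown_from are illustrative and not part of fastCRW or smolagents, and the idea that a self-hosted engine accepts requests without a key is an assumption the source does not confirm for the REST path.

```python
from typing import Any, Optional


def auth_headers(key: Optional[str]) -> dict:
    """Build request headers; skip Authorization when no key is configured."""
    # Assumption: a self-hosted engine may accept requests without a key.
    return {"Authorization": f"Bearer {key}"} if key else {}


def markdown_from(payload: Any) -> str:
    """Pull data.markdown out of a scrape response, or describe what is missing."""
    data = payload.get("data") if isinstance(payload, dict) else None
    markdown = data.get("markdown") if isinstance(data, dict) else None
    if not markdown:
        # Returning text instead of raising lets the code agent read the failure.
        return "ERROR: scrape response contained no markdown"
    return markdown
```

Inside scrape_page, these would replace the hard-coded headers dict and the final r.json()["data"]["markdown"] lookup. Returning an error string rather than raising is a design choice: the agent sees the failure as an observation and can try another URL.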

Returning clean markdown to the code agent

The reason to return markdown rather than raw HTML is that the code agent will pass this string straight into the model's context. HTML burns tokens on tags, scripts, and nav chrome the model has to ignore; fastCRW's extraction strips the page down to the article body. The accuracy of that strip is exactly what truth-recall measures (see below) — and it directly decides how much of the real content your agent gets to reason over.
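To make the token argument concrete, here is a rough illustration using only the standard library. It is a crude stand-in that drops tags plus script, style, and nav blocks; it is not fastCRW's extraction, and the sample page is invented.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping script, style, and nav blocks."""

    SKIP = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many skipped elements we are currently inside
        self.parts = []  # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a reader would see, one fragment per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


page = (
    "<html><head><script>var a = 1;</script></head><body>"
    "<nav><a href='/'>Home</a><a href='/docs'>Docs</a></nav>"
    "<article><h1>Title</h1><p>The body text.</p></article>"
    "</body></html>"
)
text = visible_text(page)
print(len(page), len(text))  # raw HTML length vs stripped text length
```

Even on this toy page, most of the characters are markup and navigation rather than content, which is the overhead the model would otherwise have to read past.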

Adding /v1/search for discovery

A research agent usually does not start with a URL — it starts with a question. Add a second tool over /v1/search so the agent can discover URLs before scraping:

@tool
def web_search(query: str) -> str:
  """Search the web and return the top result URLs and snippets.
  Args:
    query: A natural-language search query."""
  r = requests.post(f"{BASE}/v1/search",
    headers={"Authorization": f"Bearer {KEY}"},
    json={"query": query, "limit": 5}, timeout=30)
  r.raise_for_status()
  return "\n".join(f"{x['url']} — {x.get('description','')}" for x in r.json()["data"])

Search costs 1 credit per query; the agent can then feed any returned URL to scrape_page. For a Python-side getting-started walkthrough of these endpoints, see the Python scraping quickstart.
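If the agent only needs URLs to hand to scrape_page, the search response can be reduced to a short, de-duplicated list. This is a minimal sketch that assumes the same response shape web_search relies on (a data list of items with a url key); top_urls is a hypothetical helper name.

```python
def top_urls(response: dict, limit: int = 3) -> list:
    """Return up to `limit` unique result URLs, preserving ranking order."""
    seen = set()
    urls = []
    for item in response.get("data", []):
        url = item.get("url")
        if url and url not in seen:  # skip missing and repeated URLs
            seen.add(url)
            urls.append(url)
    return urls[:limit]


sample = {"data": [
    {"url": "https://a.example/x", "description": "first"},
    {"url": "https://a.example/x", "description": "duplicate"},
    {"description": "no url on this one"},
    {"url": "https://b.example/y"},
]}
print(top_urls(sample))
```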

Zero-bloat infrastructure

Single ~8 MB AGPL-3.0 binary, 1 container

fastCRW's engine is one statically-linked Rust binary — no Redis, no Node.js, no separate worker tier. The Docker image is roughly 8 MB and runs as a single container (the default Compose ships the lightweight lightpanda renderer; chrome is opt-in). Compare that to a scraper stack that wants an API service, a worker pool, a queue, a datastore, and a browser runtime — five containers and a couple of gigabytes. For a framework whose whole identity is "small," that footprint difference is the point. We unpack it further in single-binary infra and low-memory scraping.

Self-host locally with the Python SDK crw

If you want the web layer to cost $0 and never leave your machine, skip the cloud entirely. The crw Python SDK runs a self-contained local engine, so your smolagents tool can call it without any external service:

from crw import CrwClient
from smolagents import tool

client = CrwClient()  # runs a local engine, no API key, no egress

@tool
def scrape_local(url: str) -> str:
  """Scrape a URL locally and return its main content as markdown.
  Args:
    url: The absolute URL to scrape."""
  return client.scrape(url, formats=["markdown"]).markdown

The engine is AGPL-3.0, so self-hosting is free — you pay only for the box it runs on, and a $5 VPS is plenty for a single-agent workload.

Footprint vs a heavy multi-container stack

Dimension	fastCRW	Typical heavy scraper
Docker image	single ~8 MB binary	~2–3 GB total
Containers	1 (+ optional sidecar)	5
Runtime deps	none (static Rust)	Node.js, queue, datastore, browser
Local mode	yes — `CrwClient()`	cloud-only or heavy compose

These are structural facts from the repo README, not load-test numbers — they describe what each system is, not how it performed on a given day.

A worked example: a research code agent

Search, scrape, summarize loop

Wire both tools into a CodeAgent and the agent will compose them itself — the framework's whole appeal is that you do not script the loop, the model writes Python that does:

agent = CodeAgent(tools=[web_search, scrape_page],
  model=InferenceClientModel())
answer = agent.run("Summarize the latest changes in the Rust 2024 edition.")

Internally the agent will typically call web_search, pick a couple of promising URLs, call scrape_page on each, and synthesize an answer from the markdown — all as generated Python, which is exactly what smolagents is built to run.
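For intuition, the loop described above might look roughly like the sketch below. This is not what smolagents emits verbatim (the model writes its own code on each run); research, fake_search, and fake_scrape are hypothetical stand-ins so the example runs offline.

```python
from typing import Callable


def research(question: str,
             search: Callable[[str], str],
             scrape: Callable[[str], str],
             max_pages: int = 2) -> str:
    """Search, scrape the first few hits, and join the markdown for the model."""
    lines = [ln for ln in search(question).splitlines() if ln.strip()]
    urls = [ln.split()[0] for ln in lines][:max_pages]  # URL is the first token
    pages = [f"## Source: {u}\n{scrape(u)}" for u in urls]
    return "\n\n".join(pages)


# Stand-ins for the real tools; swap in web_search and scrape_page.
def fake_search(query: str) -> str:
    return "\n".join([
        "https://a.example/1 first hit",
        "https://b.example/2 second hit",
        "https://c.example/3 third hit",
    ])


def fake_scrape(url: str) -> str:
    return f"content of {url}"


context = research("rust 2024 edition", fake_search, fake_scrape)
print(context)
```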

Iterating URLs since requests are stateless

fastCRW is stateless per request: there is no session that remembers the last page or carries cookies between calls. For a research agent that is usually fine — each scrape is independent — but it means you own the loop. If the agent needs five pages, it makes five scrape_page calls, or one /v1/extract call with all five URLs for batched structured extraction (more on that below). For crawling a whole site rather than hand-picked pages, use /v1/crawl, which walks the site and bills 1 credit per page.
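Because each scrape is independent, several pages can be fetched in parallel. The sketch below uses the standard library's thread pool and takes the scrape function as a parameter, so it works with scrape_page or any other callable; scrape_many is a hypothetical helper, not a fastCRW or smolagents API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def scrape_many(urls: List[str], scrape: Callable[[str], str],
                max_workers: int = 5) -> Dict[str, str]:
    """Scrape each URL independently and map URL -> markdown or error text."""
    def safe(url: str) -> str:
        try:
            return scrape(url)
        except Exception as exc:  # one failed page should not sink the batch
            return f"ERROR: {exc}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe, urls))  # map preserves input order
    return dict(zip(urls, results))


# Offline stand-in that fails on one URL to show the error path.
def flaky(url: str) -> str:
    if "bad" in url:
        raise RuntimeError("boom")
    return f"markdown for {url}"


pages = scrape_many(["https://ok.example", "https://bad.example"], flaky)
```

Each call still costs whatever a single scrape costs; the thread pool changes wall-clock time, not the number of requests.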

Accuracy and latency, disclosed

Highest truth-recall of the three tools tested

Against Firecrawl's own public scrape-content-dataset-v1 (1,000 URLs, 819 of them carrying labeled ground truth), fastCRW recovered the most labeled content of the three scrapers measured: 63.74% truth-recall (522 of 819 labeled URLs), versus Crawl4AI's 59.95% and Firecrawl's 56.04% (diagnose_3way.py, single run of 3,000 requests, 2026-05-08). Two companion figures from the same run put that in context: 91.8% scrape success on reachable URLs and 0 thrown errors across all 3,000 requests. The 34 URLs only fastCRW recovers represent 70% more unique coverage than the other two combined. For a code agent, recall is the number that matters: content the scraper missed is content the model never sees, and the answer degrades silently.
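The headline percentage can be reproduced from the counts quoted above. This is plain arithmetic on the published figures, not a rerun of the benchmark; the variable names are ours.

```python
labeled = 819    # URLs with labeled ground truth
recovered = 522  # fastCRW, from the run cited above

recall = round(100 * recovered / labeled, 2)
print(recall)  # 63.74

# Gap to the other two tools, in percentage points.
others = {"Crawl4AI": 59.95, "Firecrawl": 56.04}
gaps = {name: round(recall - pct, 2) for name, pct in others.items()}
print(gaps)
```

The gaps come out to 3.79 points over Crawl4AI and 7.70 points over Firecrawl on this single run.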

p50 win, p90 tail honesty

fastCRW's median scrape latency was 1914 ms, ahead of Firecrawl's 2305 ms and effectively tied with Crawl4AI's 1916 ms. In fast mode, fastCRW's p90 was 4348 ms, the lowest of the three (Crawl4AI 4754 ms, Firecrawl 6937 ms). Search is faster still: fastCRW search averaged 880 ms over a 100-query benchmark, with 73 of 100 latency wins against Firecrawl and Tavily (triple-bench.ts). The full p50/p90/p99 split is published on /benchmarks.
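For readers checking their own numbers, this is one common way to read p50 and p90 off a list of latency samples. It uses the nearest-rank definition, which is an assumption: the benchmark scripts may compute percentiles differently. The sample values are made up.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Invented latencies in milliseconds, with one slow outlier.
latencies_ms = [1200, 1500, 1700, 1900, 2000, 2300, 2600, 3100, 4300, 9800]
p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
print(p50, p90)
```

The median ignores the 9800 ms outlier while p90 sits just below it, which is why a tail figure is reported alongside the median.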

What to know before you wire this up

No /v1/agent harness — by design

fastCRW gives you scrape, crawl, map, search, and a research endpoint (/v2/search/research/papers for paper-level research primitives) — not an autonomous agent loop. There is no /v1/agent. That is by design here: smolagents is your agent harness, so fastCRW only needs to be the web layer, and you compose any multi-step research loop in smolagents itself.

Extraction, screenshots, and multi-URL batching

/v1/extract accepts up to 50 URLs in a single request for batched structured extraction — billed as the scrape credit per URL plus the LLM token cost as usage-metered LLM credits, not a flat rate. Self-hosters can also call /v1/scrape + jsonSchema directly, or iterate scrape_page concurrently across a URL list. LLM-based JSON extraction runs on fastCRW's managed LLM, the same managed model behind search answer mode — there is no key or provider to configure, and it is available on paid plans only. Screenshots are supported too: a formats: ["screenshot"] request returns data.screenshot as a base64 PNG data URL.
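Two small client-side helpers follow from the limits described above: splitting a long URL list into batches of at most 50, and turning a base64 PNG data URL into raw bytes. Both are sketches with hypothetical names (chunked, png_bytes_from_data_url); neither calls fastCRW, and the exact request body for /v1/extract is left out because the source does not spell it out.

```python
import base64


def chunked(urls, size=50):
    """Split a URL list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be positive")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def png_bytes_from_data_url(data_url: str) -> bytes:
    """Decode a 'data:image/png;base64,...' string into raw PNG bytes."""
    header, _, encoded = data_url.partition(",")
    if not header.startswith("data:image/png") or "base64" not in header:
        raise ValueError("expected a base64 PNG data URL")
    return base64.b64decode(encoded)


batches = chunked([f"https://example.com/{i}" for i in range(120)])
print([len(b) for b in batches])

# Round-trip check using the 8-byte PNG signature rather than a real screenshot.
signature = b"\x89PNG\r\n\x1a\n"
sample = "data:image/png;base64," + base64.b64encode(signature).decode("ascii")
decoded = png_bytes_from_data_url(sample)
```

In a tool, the decoded bytes would typically be written to a file and the path returned, since a long base64 string in the agent's context costs tokens without helping the model.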

Sources

fastCRW canonical fact sheet — internal benchmark of record (bench/server-runs/RESULT_3WAY_1000_FULL.md, diagnose_3way.py, 2026-05-08; benchmarks/triple-bench.ts, 100 queries).
fastCRW open-source engine and README: github.com/us/crw (AGPL-3.0).
Hugging Face smolagents documentation: github.com/huggingface/smolagents.
Live pricing and credit costs: /pricing.