Should I pre-crawl the docs or scrape live as a tool?

Pre-crawl plus RAG when the docs are large and stable and per-question latency must be low — you pay the cost once at crawl time. Use live tool-calling when the docs change constantly or the surface is small. The strongest agents combine both: a pre-indexed crawl for coverage and a live scrape tool to patch the one page that changed or that the index missed.

Can I use a Firecrawl-style scrape call as an agent tool?

Yes. fastCRW's REST surface is Firecrawl-compatible, so a scrape tool that wraps the Firecrawl SDK works against fastCRW by changing only the base URL — the request shape is unchanged. Each scrape is single-URL and costs 1 credit (any renderer). There is no multi-URL batch endpoint, so the tool fetches one page per call.

How do I keep the docs agent up to date?

fastCRW is stateless per request — it does not store your previous crawl, so you own the index and snapshot history. Re-crawl on a schedule (cron), diff the new markdown against stored pages, and re-embed only what changed. Between full re-crawls, use the live scrape tool to patch a single page that just changed.

Can I self-host the crawler for large docs sites?

Yes. The fastCRW engine is a single ~8 MB Rust binary in one container under AGPL-3.0, so you can run unlimited docs crawls with no per-scrape fee: self-hosted, 1,000 scrapes cost $0 in credits and you pay only for your server. Many teams pre-index on a self-hosted binary and use managed cloud only for occasional live scrapes; compare projected credit spend against a small VPS at /pricing.

Build a Documentation Agent (fastCRW + OpenAI SDK)

By the fastCRW team · Benchmarks and credit costs verified 2026-05-18 · fastCRW launch pricing expires 2026-06-01 · Verify independently before relying on numbers.

Disclosure: we build fastCRW. This is a vendor-authored tutorial, so weigh the brand claims accordingly — every number below traces to a public benchmark, and we flag what fastCRW does not do as plainly as what it does.

Build a documentation agent with the OpenAI Agents SDK

A documentation agent answers questions over a docs site — "how do I rotate an API key?", "what changed in v3?" — by retrieving the right pages and grounding its answer in them. The hard part is never the model. It is ingestion: if the docs arrive as garbled HTML, the agent hallucinates. This tutorial builds a documentation agent with the OpenAI Agents SDK for orchestration and fastCRW for web retrieval, using clean LLM-ready markdown as the contract between them.

Why fastCRW for the retrieval leg? Accuracy. On Firecrawl's own public labeled dataset, fastCRW had the highest truth-recall of the three tools tested — 63.74% of 819 labeled URLs (diagnose_3way.py, 2026-05-08), ahead of Crawl4AI (59.95%) and Firecrawl (56.04%). A docs agent answers from whatever it ingested, so the page content arriving complete and clean is the difference between a citation and a guess. And because the engine is a single ~8 MB Rust binary under AGPL-3.0, you can self-host the crawler and ingest large docs sites with no per-page cloud bill.

Two ways to build a docs agent

There are two designs, and they win in different situations. Pick one before you write code.

Pre-index (crawl + RAG)

You crawl the whole docs site once, chunk and embed the markdown into a vector store, and the agent retrieves from that index at question time. Best when the docs are large and stable, latency per question must be low, and you want deterministic, repeatable retrieval. The cost is paid up front at crawl time, not per query.

Live tool-calling

You give the agent a scrape tool and let it fetch pages on demand during the conversation — classic ReAct-style tool loops. Best when the docs change constantly, the surface is small, or you want the agent to follow links it discovers mid-answer. The trade-off is latency and per-call cost on every turn.

In practice the strongest docs agents combine both: a pre-indexed crawl for breadth, plus a live scrape tool for the one page the index missed or that changed an hour ago. We build both legs below.
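As a rough illustration of how the two legs can be combined, the sketch below routes a question to the index or to a live scrape. The score threshold, the maximum age, and the function name are hypothetical values chosen for the example, not fastCRW or OpenAI Agents SDK settings. In the agent built below the model makes this choice itself, so treat this as a picture of the decision rather than required code.

```python
from datetime import datetime, timedelta, timezone


def choose_source(best_score, indexed_at, now,
                  min_score=0.75, max_age=timedelta(days=7)):
    """Return "index" or "live" for one question.

    best_score: similarity of the best index hit, or None if there was no hit.
    indexed_at: when that page was last crawled, or None if unknown.
    min_score and max_age are illustrative defaults, not recommendations.
    """
    if best_score is None or best_score < min_score:
        return "live"   # the index missed the page
    if indexed_at is None or now - indexed_at > max_age:
        return "live"   # the indexed copy may be stale
    return "index"      # fast path: answer from the pre-built index


now = datetime(2026, 5, 18, tzinfo=timezone.utc)
print(choose_source(0.91, now - timedelta(days=1), now))
```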

Step 1: Crawl the docs to clean markdown

Start by ingesting the docs site once with /v1/crawl, which runs an async BFS crawl and returns a job ID. Bound the crawl with maxDepth (maximum 10) and maxPages (maximum 1000) so a sprawling docs tree does not run away from you.

Why markdown beats raw HTML. Clean markdown strips nav chrome, scripts, and styling, so you embed only the content. That cuts token cost on every retrieval and produces stable chunks that do not flap when the site ships a cosmetic CSS change.
Renderer selection is automatic. fastCRW's auto renderer tries chrome → lightpanda → http and falls back, so JS-rendered docs frameworks (Docusaurus, Nextra, Mintlify) still produce text without you hand-tuning per site.
Cost. Crawl is 1 credit per page (any renderer). A 400-page docs site is roughly 400 credits — or $0 if you self-host the binary.

Kick off the crawl, poll GET /v1/crawl/:id until it finishes, then collect the markdown field from every result. That array of clean pages is the input to your index.
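The kickoff-and-poll loop above can be sketched as follows, using only the Python standard library. Several details are assumptions rather than confirmed fastCRW behavior: the base URL, the Bearer auth header, the "id" field on the kickoff response, the "completed" and "failed" status values, and the Firecrawl-style result shape with "data", "markdown" and "metadata.sourceURL". Check them against the API you are actually calling.

```python
import json
import time
import urllib.request

# Assumption: point this at your fastCRW instance (managed or self-hosted).
BASE_URL = "http://localhost:3000"


def http_json(method, url, payload=None, api_key=None, timeout=30):
    """Send one JSON request and return the decoded JSON response body."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode())


def crawl_docs(root_url, max_depth=3, max_pages=400, poll_seconds=2.0,
               max_polls=300, request=http_json, sleep=time.sleep):
    """Start a crawl, poll until it finishes, return [(url, markdown), ...]."""
    job = request("POST", f"{BASE_URL}/v1/crawl", {
        "url": root_url,
        "maxDepth": min(max_depth, 10),    # stay within the documented cap
        "maxPages": min(max_pages, 1000),  # stay within the documented cap
    })
    job_id = job["id"]  # assumed name of the job ID field

    for _ in range(max_polls):
        status = request("GET", f"{BASE_URL}/v1/crawl/{job_id}")
        state = status.get("status")
        if state == "completed":
            pages = []
            for item in status.get("data", []):
                source = item.get("metadata", {}).get("sourceURL", "")
                if item.get("markdown"):  # skip pages with no markdown
                    pages.append((source, item["markdown"]))
            return pages
        if state == "failed":
            raise RuntimeError(f"crawl {job_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"crawl {job_id} did not finish after {max_polls} polls")
```

The `request` and `sleep` parameters exist so the loop can be exercised with fakes before you spend credits on a real crawl.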

Step 2: Give the OpenAI Agent a scrape tool

Now wire the live leg. With the OpenAI Agents SDK you define a function tool and the model decides when to call it. The tool is a thin wrapper over /v1/scrape that returns markdown for one URL.

Tool contract: input is a single URL string; output is the cleaned markdown for that page. Keep the tool description tight ("Fetch the current content of a documentation page as markdown") so the agent calls it for freshness, not for everything.
Firecrawl-compatible calls. fastCRW's REST surface is Firecrawl-compatible, so if you already use the Firecrawl SDK inside the tool you only change the base URL — the request shape is the same. See the OpenAI Agents SDK + fastCRW guide for the full tool wiring.
Cost per call: a scrape is 1 credit (any renderer). Single-URL — there is no multi-URL batch endpoint, so the agent fetches one page per tool call.

Register both the retrieval-over-index function and the live scrape tool on the same agent. The model now has a fast path (the index) and an escape hatch (live scrape) and can choose between them.
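A minimal sketch of that registration, assuming the `openai-agents` package is installed and that /v1/scrape returns a Firecrawl-style body with the markdown under `data.markdown`. The base URL, the `search_index` placeholder, and the tool names are hypothetical; substitute your own index lookup. The SDK import sits inside `build_agent` so the scrape helper can be tested without the SDK present.

```python
import json
import urllib.request

# Assumption: point this at your fastCRW instance (managed or self-hosted).
BASE_URL = "http://localhost:3000"


def scrape_markdown(url, timeout=10.0, opener=urllib.request.urlopen):
    """POST one URL to /v1/scrape and return its markdown ('' if none)."""
    body = json.dumps({"url": url, "formats": ["markdown"]}).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/v1/scrape", data=body,
        headers={"Content-Type": "application/json"}, method="POST")
    with opener(req, timeout=timeout) as resp:
        payload = json.loads(resp.read().decode())
    # Assumed response shape: {"data": {"markdown": "..."}}
    return (payload.get("data") or {}).get("markdown") or ""


def search_index(query):
    """Hypothetical placeholder: replace with a lookup in your vector store."""
    return "NO_RESULTS"


def build_agent():
    """Create an agent that can use both the index and the live scrape tool."""
    from agents import Agent, function_tool  # pip install openai-agents

    @function_tool
    def search_docs(query: str) -> str:
        """Search the pre-built documentation index for relevant passages."""
        return search_index(query)

    @function_tool
    def fetch_doc_page(url: str) -> str:
        """Fetch the current content of a documentation page as markdown."""
        return scrape_markdown(url)

    return Agent(
        name="Docs agent",
        instructions=(
            "Answer only from retrieved documentation. Use search_docs first; "
            "use fetch_doc_page when a page may have changed or is missing."
        ),
        tools=[search_docs, fetch_doc_page],
    )
```

To run it, something like `Runner.run_sync(build_agent(), "How do I rotate an API key?").final_output` (with `from agents import Runner`) should return the agent's answer, given a configured OpenAI API key.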

Step 3: Agentic retrieval and answering

The OpenAI Agents SDK runs the ReAct loop for you: the agent reasons about the question, decides whether to query the index or scrape live, reads the returned markdown, and either answers or takes another step. A few things to get right:

Ground every answer. Instruct the agent to answer only from retrieved content and to say "not in the docs" when retrieval comes up empty. This is the single biggest lever on hallucination — see building a RAG pipeline with fastCRW for the retrieval patterns.
Citations. Carry the source URL through chunking and retrieval so the agent can cite the exact page. Clean markdown from Step 1 keeps the URL-to-content mapping intact.
Provider choice is yours. fastCRW supplies retrieval; the reasoning model is whatever you point the SDK at. Note that fastCRW's own server-side LLM extraction (the formats: ["json"] path) supports OpenAI and Anthropic only — but that constraint does not touch your agent's model choice, which lives entirely in the OpenAI Agents SDK.
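The grounding and citation points above can be sketched as a chunker that keeps each page's URL attached to every chunk, plus a formatter that hands the agent numbered, citable context. Splitting on markdown headings is one simple strategy chosen for illustration, and the `Chunk` type, the `NO_RESULTS` sentinel, and the instruction wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source_url: str  # carried through so the agent can cite the exact page
    heading: str
    text: str


def chunk_by_heading(source_url, markdown):
    """Split a page on markdown headings, ignoring '#' lines inside code fences."""
    chunks, heading, buf, in_fence = [], "", [], False

    def flush():
        text = "\n".join(buf).strip()
        if text:
            chunks.append(Chunk(source_url, heading, text))

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if line.startswith("#") and not in_fence:
            flush()  # close the previous section before starting a new one
            heading, buf = line.lstrip("#").strip(), []
        else:
            buf.append(line)
    flush()
    return chunks


def format_context(chunks):
    """Render retrieved chunks as numbered context, or a sentinel if empty."""
    if not chunks:
        return "NO_RESULTS"
    return "\n\n".join(
        f"[{i}] {c.source_url} ({c.heading})\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )


GROUNDING_INSTRUCTIONS = (
    "Answer only from the numbered context. Cite sources as [n] with their URL. "
    "If the context is NO_RESULTS or does not cover the question, "
    'reply "not in the docs".'
)
```

Returning an explicit sentinel for empty retrieval gives the instructions something concrete to key on, which tends to be easier for a model to follow than an empty string.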

Step 4: Keep docs current

Docs drift. A pre-indexed agent that crawled last month answers last month's docs. Two honest facts shape the refresh design:

fastCRW is stateless per request. It does not remember your previous crawl, so you own the index and the snapshot history. That is a feature for reproducibility — nothing changes under you — but it means re-crawling is your job.
Schedule re-crawls. Run the Step 1 crawl on a cron, diff the new markdown against your stored pages, and re-embed only what changed. See crawling an entire website from its sitemap for full-site crawl mechanics, and use the live scrape tool from Step 2 to patch a single just-changed page between full re-crawls.
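Because the snapshot history lives on your side, the diff step can be as simple as comparing content hashes between the stored crawl and the fresh one. The sketch below is one way to do that; the function names and the in-memory dictionaries are illustrative, and a real pipeline would persist the hashes next to the index.

```python
import hashlib


def fingerprint(markdown):
    """Hash page content, ignoring trailing whitespace so trivial edits do not count."""
    normalized = "\n".join(line.rstrip() for line in markdown.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def diff_crawl(stored_hashes, fresh_pages):
    """Compare a fresh crawl against stored hashes.

    stored_hashes: {url: hash} from the previous crawl.
    fresh_pages:   {url: markdown} from the crawl that just finished.
    Returns (added, changed, removed, new_hashes); re-embed added + changed,
    delete removed, then store new_hashes for the next run.
    """
    new_hashes = {url: fingerprint(md) for url, md in fresh_pages.items()}
    added = sorted(u for u in new_hashes if u not in stored_hashes)
    changed = sorted(
        u for u in new_hashes
        if u in stored_hashes and stored_hashes[u] != new_hashes[u]
    )
    removed = sorted(u for u in stored_hashes if u not in new_hashes)
    return added, changed, removed, new_hashes
```

The same `fingerprint` check works for the live leg: after a single-page scrape, compare its hash with the stored one and re-embed only if it differs.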

For chunk sizing across a refreshed index, the chunking-strategies guide covers how to keep chunk boundaries stable so re-embedding only touches genuinely changed content.

Cost and where to self-host

Run the math before you commit to a design. For the pre-index leg, one full crawl of an N-page docs site costs N credits (any renderer). For the live leg, each tool call is 1 credit. A docs agent serving frequent questions over a large, frequently re-crawled site is exactly the workload where per-page metering adds up.
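The arithmetic above fits in a few lines. The credit formula follows directly from the 1-credit-per-page and 1-credit-per-call figures; the price per credit and the VPS cost are hypothetical inputs you must fill in from /pricing and your own hosting bill, and the numbers in the example call are made up.

```python
def monthly_credits(pages, recrawls_per_month, live_calls_per_month):
    """Credits used per month: one per crawled page plus one per live scrape."""
    return pages * recrawls_per_month + live_calls_per_month


def cheaper_option(credits, price_per_credit, vps_monthly_cost):
    """Compare managed credit spend with a flat self-hosting cost.

    price_per_credit and vps_monthly_cost are hypothetical inputs; this
    ignores the engineering time needed to operate a self-hosted instance.
    """
    managed = credits * price_per_credit
    return "self-host" if vps_monthly_cost < managed else "managed"


# Example with made-up numbers: a 400-page site re-crawled daily,
# plus 2,000 live tool calls per month.
credits = monthly_credits(pages=400, recrawls_per_month=30, live_calls_per_month=2000)
print(credits)
```

For that example the total is 400 × 30 + 2,000 = 14,000 credits per month, which shows how quickly a daily re-crawl dominates the live leg.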

That is the case for self-hosting. The AGPL-3.0 engine is a single ~8 MB binary in one container (versus a multi-service stack), so you can run unlimited docs crawls with no per-scrape fee: self-hosted, 1,000 scrapes cost $0 in credits and you pay only for your server. Compare your projected managed credit spend against a small VPS at /pricing and decide. Many teams pre-index on a self-hosted binary and use managed cloud only for the occasional live scrape.

Honest limits

Know these limits so you can scope the agent correctly:

No screenshot output. A request for formats: ["screenshot"] returns HTTP 422. If your docs answer depends on a rendered diagram image, fastCRW will not capture it — extract the surrounding text instead.
No batch extract. There is no multi-URL /v1/extract; the live tool fetches one URL per call. For breadth, crawl; for a single fresh page, scrape.
No managed agent harness. fastCRW has no /v1/agent and no /v1/deep-research. It is the retrieval primitive — the OpenAI Agents SDK is the harness. That separation is the design, not a gap to apologize for.
Plan for the latency tail. fastCRW's scrape p50 is 1914 ms, the fastest of the three tools tested (Firecrawl: 2305 ms). In fast mode, the p90 is 4348 ms, also the lowest of the three (Crawl4AI 4754 ms, Firecrawl 6937 ms). The chrome-stealth fallback is also what makes fastCRW's recall highest. For a live tool call, base your per-call timeout on the p90 rather than the median.
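One way to act on the timeout advice is to derive the per-call timeout from the quoted p90 and degrade gracefully when a call fails. In this sketch the 1.5x headroom factor and the helper names are assumptions, and `scrape` and `fallback` are whatever callables you supply, for example a /v1/scrape wrapper and an index lookup.

```python
P90_MS = 4348  # fast-mode p90 quoted above


def timeout_seconds(p90_ms=P90_MS, headroom=1.5):
    """Per-call timeout in seconds: the p90 plus some headroom (assumed factor)."""
    return round(p90_ms * headroom / 1000, 2)


def scrape_or_fallback(url, scrape, fallback, timeout=None):
    """Try a live scrape; on a timeout or HTTP/network error, use the fallback.

    scrape(url, timeout=...) returns markdown.
    fallback(url, exc) returns a substitute, such as the indexed copy.
    """
    timeout = timeout_seconds() if timeout is None else timeout
    try:
        return scrape(url, timeout=timeout)
    except OSError as exc:
        # TimeoutError and urllib.error.URLError (including HTTPError, e.g. a
        # 422 for an unsupported format) are all OSError subclasses.
        return fallback(url, exc)
```

Falling back to the indexed copy keeps the agent answering, at the cost of possibly stale content, so the fallback may want to flag that in what it returns.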

Where Firecrawl genuinely wins

If your docs agent needs cloud-only capabilities fastCRW does not offer, Firecrawl is the right call: its managed /v1/agent and deep-research endpoints, heavier fire-engine anti-bot for hardened sources, and a larger ecosystem of agent tutorials are real strengths. fastCRW's edge is accuracy on the retrieval leg, a single-binary self-host floor, and stateless reproducibility — not a broader managed surface.

Sources

fastCRW canonical facts (truth-recall, credit costs, footprint, honest gaps): github.com/us/crw · scrape benchmark diagnose_3way.py, 819 labeled URLs, 2026-05-08
OpenAI Agents SDK documentation: openai.github.io/openai-agents-python
fastCRW plan pricing and credit model: fastcrw.com/pricing