By the fastCRW team · Search latency measured with triple-bench.ts over 100 queries, single point-in-time run · Numbers are one run, not a guarantee · Verify independently before quoting internally.
Disclosure: We build fastCRW. This is a vendor-authored benchmark, so weight it accordingly — we ran fastCRW, Firecrawl, and Tavily through the same harness and we report the percentile where Firecrawl beats us, not just the ones we win.
The web search API latency benchmark in one line
On a web search API latency benchmark of 100 queries run concurrently against all three providers (triple-bench.ts, single point-in-time run), fastCRW search averaged 880 ms versus Firecrawl's 954 ms and Tavily's 2,000 ms, and took 73 of 100 per-query latency wins. This is the search path only — /v1/search, not /v1/scrape — and we deliberately keep the two benchmarks separate because they measure different work. If you came here for scrape latency, the section below points you to the right number.
The 100-query search latency benchmark
100 queries, 10 categories, run concurrently
The harness issues 100 distinct queries spread across 10 categories (so no single topic skews the result), fired concurrently rather than one-at-a-time, against each provider's live search endpoint. Concurrency matters: a serial loop hides the queueing behavior that real agent workloads actually hit. Every provider saw the identical query set in the same run, which is what makes the per-query win count meaningful instead of a comparison of two different test days.
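The difference between a concurrent and a serial run can be sketched as follows. This is a minimal illustration, not the contents of triple-bench.ts: `SearchFn`, `timeQuery`, `runConcurrent`, `runSerial`, and `fakeSearch` are hypothetical names, and the stand-in provider only sleeps, so it shows the shape of the harness rather than real queueing behavior.

```typescript
// Hypothetical provider signature: resolves when the search response arrives.
type SearchFn = (query: string) => Promise<void>;

interface Timing {
  query: string;
  ms: number;
  ok: boolean;
}

// Time a single query. Failures are recorded rather than thrown,
// so one bad request does not abort the whole batch.
async function timeQuery(search: SearchFn, query: string): Promise<Timing> {
  const start = Date.now();
  try {
    await search(query);
    return { query, ms: Date.now() - start, ok: true };
  } catch {
    return { query, ms: Date.now() - start, ok: false };
  }
}

// Concurrent: every request is in flight at the same time.
function runConcurrent(search: SearchFn, queries: string[]): Promise<Timing[]> {
  return Promise.all(queries.map((q) => timeQuery(search, q)));
}

// Serial: each request waits for the previous one to finish.
async function runSerial(search: SearchFn, queries: string[]): Promise<Timing[]> {
  const out: Timing[] = [];
  for (const q of queries) {
    out.push(await timeQuery(search, q));
  }
  return out;
}

// Stand-in provider (assumption): sleeps 20 ms. Replace with a real HTTP call.
const fakeSearch: SearchFn = () =>
  new Promise<void>((resolve) => setTimeout(resolve, 20));
```

Against a real endpoint, per-request times from `runConcurrent` may come out higher than from `runSerial` for the same queries, because the provider has to handle the requests together. That added delay is the queueing effect a serial loop would not show.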
triple-bench.ts, all three providers, same queries
One script, three providers, one run. We did not average across multiple runs or cherry-pick a fast window — this is a single point-in-time measurement, and we label it as such because search-endpoint latency moves with upstream index load and time of day. Treat it as a snapshot of one run, not a permanent SLA. The honest way to read any vendor's search benchmark, including ours, is to ask for the harness, the query count, and the date — all three are here.
Average and median latency results
| Metric | fastCRW | Firecrawl | Tavily |
|---|---|---|---|
| Average latency | 880 ms | 954 ms | 2,000 ms |
| Median latency | 785 ms | 932 ms | 1,724 ms |
| P95 latency | 1,433 ms | 1,343 ms | 3,534 ms |
| Latency wins (of 100) | 73 | 25 | 2 |
| Success rate | 100% | 100% | 100% |
fastCRW 880 ms avg vs Firecrawl 954 ms vs Tavily 2,000 ms
The average tells the headline story: fastCRW returned results in 880 ms on average, Firecrawl in 954 ms, and Tavily in 2,000 ms. We give raw milliseconds on purpose, rather than a speed multiple or a "lowest-latency search API" claim. A ratio compresses a real distribution into one marketing number, and the whole point of this page is that the distribution is what you should plan against.
Median: fastCRW 785 ms, Firecrawl 932 ms, Tavily 1,724 ms
The median (p50) is the more honest "typical request" number because a few slow outliers barely move it, whereas they drag an average up. fastCRW's median of 785 ms means half the queries came back in less than that, comfortably sub-second. Firecrawl's median was 932 ms and Tavily's was 1,724 ms. The gap between fastCRW's median (785 ms) and average (880 ms) is 95 ms, which suggests the distribution is reasonably tight rather than hiding a long tail.
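To make the mean-versus-median point concrete, here is a small sketch that summarizes a list of latencies. The sample values are invented for illustration and are not data from this run, and the nearest-rank percentile method is an assumption: the source does not state which method triple-bench.ts uses.

```typescript
interface LatencySummary {
  mean: number;
  median: number;
  p95: number;
}

// Nearest-rank percentile on an ascending-sorted array, p in (0, 100].
// Assumption: other percentile definitions interpolate and give slightly different values.
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(latenciesMs: number[]): LatencySummary {
  if (latenciesMs.length === 0) throw new Error("no samples");
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { mean, median, p95: percentile(sorted, 95) };
}

// Invented sample: four typical requests and one slow outlier.
const sample = [700, 750, 800, 850, 2400];
const summary = summarize(sample);
// summary.mean is 1100, summary.median is 800, summary.p95 is 2400
```

In this invented sample the single 2,400 ms request lifts the mean 300 ms above the median. A small gap between the two, as in the fastCRW figures above, is what you would expect when slow outliers are few or mild.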
Latency wins and the P95 tail
73 of 100 latency wins for fastCRW
Per query, fastCRW was the lowest-latency provider on 73 of the 100 queries; Firecrawl won 25 and Tavily won 2. The win count is a different lens from the average: it tells you how often, query by query, you would have gotten the lowest-latency answer. Winning most individual queries while also having the best average and median indicates a consistent lead rather than a lucky one, because a single very fast query cannot manufacture a 73/100 win count.
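The win count described above can be computed with a few lines. This is a sketch under assumptions: the row shape and `countWins` are hypothetical, the sample latencies are invented, and the tie rule (first provider in the list wins) is arbitrary because the source does not say how triple-bench.ts handles ties.

```typescript
const PROVIDERS = ["fastCRW", "Firecrawl", "Tavily"] as const;
type Provider = (typeof PROVIDERS)[number];

// One row per query: the latency each provider took for that same query.
type QueryLatencies = Record<Provider, number>;

function countWins(rows: QueryLatencies[]): Record<Provider, number> {
  const wins: Record<Provider, number> = { fastCRW: 0, Firecrawl: 0, Tavily: 0 };
  for (const row of rows) {
    let best: Provider = PROVIDERS[0];
    for (const p of PROVIDERS) {
      // Strict comparison: on an exact tie the earlier provider keeps the win (assumption).
      if (row[p] < row[best]) best = p;
    }
    wins[best] += 1;
  }
  return wins;
}

// Invented sample rows, not data from this run.
const rows: QueryLatencies[] = [
  { fastCRW: 800, Firecrawl: 900, Tavily: 1700 },
  { fastCRW: 1500, Firecrawl: 1300, Tavily: 2000 },
  { fastCRW: 700, Firecrawl: 950, Tavily: 650 },
  { fastCRW: 760, Firecrawl: 910, Tavily: 1800 },
];
const wins = countWins(rows);
// wins is { fastCRW: 2, Firecrawl: 1, Tavily: 1 }
```

Because each row compares the providers on the same query, the count only makes sense when all providers saw the identical query set.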
P95: fastCRW 1,433 ms vs Firecrawl 1,343 ms (we concede this)
Here is where Firecrawl beats us: at the 95th percentile, Firecrawl was 1,343 ms versus fastCRW's 1,433 ms — a 90 ms edge in Firecrawl's favor. We report it because a benchmark that only shows the percentiles you win is advertising, not measurement. If your workload is dominated by tail latency on a small fraction of slowest queries, that 90 ms is a real point for Firecrawl. Tavily's P95 was 3,534 ms, well behind both. Note the relationship: fastCRW leads on average, median, and win count, while Firecrawl edges the far tail — those are not contradictory, they describe different parts of the same curve.
100% success across all three
All three providers returned successful results on 100% of the 100 queries in this run. Latency is the differentiator here, not reliability — none of the three dropped a query. That is worth stating plainly so the latency comparison is not mistaken for a reliability comparison; on this run, every provider was reliable.
Why search latency is a separate benchmark from scrape
This measures /v1/search, not /v1/scrape
Search and scrape are different endpoints doing different work, so we never blend their numbers. /v1/search queries an index and returns results (optionally with content); /v1/scrape fetches and renders a single page, which can involve a headless browser. Mixing them would produce a meaningless "scraping speed" figure. The 880 ms average on this page applies to search only.
Where to find the scrape numbers instead
For scrape latency, the canonical number is the 3-way benchmark on Firecrawl's public 819-labeled-URL dataset (diagnose_3way.py, 2026-05-08): fastCRW's scrape p50 is 1,914 ms (fastest), beating Firecrawl's 2,305 ms, and in fast mode its p90 is 4,348 ms, the lowest of the three (Crawl4AI 4,754 ms, Firecrawl 6,937 ms). The chrome-stealth fallback that recovers content others miss (driving the highest truth-recall, 63.74% of 819 labeled URLs) is the same mechanism behind the fast mode p90 win. That p90 belongs to scrape, not search; the search path measured here has no such fallback step. See scraping latency explained and the full benchmarks page for the scrape distribution.
What this means for agent loops
Sub-second median search inside reasoning
A web search step inside an agent's reasoning loop is usually synchronous — the model waits for results before continuing. A 785 ms median means most search calls add under a second to each loop iteration, which is the difference between an agent that feels responsive and one that visibly stalls. Because the distribution is tight (median 785 ms, average 880 ms), you can budget around roughly one second per search and only occasionally pay the P95 (~1.4 s). If you need a deeper primer on search inside agents, see search API for AI agents and the web context layer for AI agents.
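A rough way to turn these figures into a loop budget is sketched below. The function names are hypothetical, the median default is taken from this run, and the tail estimate assumes search calls are independent of each other, which is a simplification. Multiplying the median by the number of calls is also only an approximation of the typical total.

```typescript
// fastCRW /v1/search median from this single run.
const MEDIAN_MS = 785;

// Rough typical latency added by `searches` sequential search calls.
function typicalBudgetMs(searches: number, medianMs: number = MEDIAN_MS): number {
  return searches * medianMs;
}

// Chance that at least one of `searches` calls lands in the slowest 5%
// (at or beyond P95), assuming calls are independent.
function chanceOfTailHit(searches: number, tailFraction: number = 0.05): number {
  return 1 - Math.pow(1 - tailFraction, searches);
}

const budget = typicalBudgetMs(5); // 3925 ms for five searches
const tailChance = chanceOfTailHit(5); // about 0.226
```

Under these assumptions, an agent that searches five times per task would typically spend about 3.9 s waiting on search, and in roughly one task in four or five it would hit at least one call at or beyond the P95.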
Answer mode adds an LLM call on top of search
One caveat that changes the latency math: if you enable answer synthesis (answer: true) or per-result summarization on /v1/search, you add an LLM call on top of the search latency above. On fastCRW's managed cloud (paid plans), that call runs a managed LLM, metered in credits based on usage; bring-your-own-key is available on every plan including Free if you prefer OpenAI or Anthropic. The 880 ms average is the raw search latency — synthesis is additional, and its cost depends on the model you route to. Plan your latency budget for "search + optional LLM," not search alone, when answer mode is on.
How to read this benchmark honestly
Three guardrails we hold ourselves to, and that you should demand of any search-latency claim:
- It is one run. 100 queries, single point-in-time. Search latency moves with upstream load; re-run it on your own query mix before quoting it as an SLA.
- Raw numbers, not multiples. We say 880 ms vs 954 ms vs 2,000 ms, never a speed ratio or a "lowest-latency search API" slogan, because a multiple hides the distribution that actually determines your loop latency.
- We concede the percentile we lose. Firecrawl's P95 (1,343 ms) beats ours (1,433 ms). If your workload lives in the tail, that is a genuine point for Firecrawl.
Where Firecrawl and Tavily genuinely win
An honest comparison has to name where the other tools are the better pick:
- Firecrawl — tail latency and ecosystem. Firecrawl edged fastCRW at P95 (1,343 ms vs 1,433 ms) in this run, and it remains the category reference with far more tutorials and community examples. If tail-bounded latency or ecosystem gravity is your binding constraint, Firecrawl is a defensible choice.
- Tavily — answer-first design and maturity. Tavily was the slowest on raw latency here (2,000 ms average), but it was built answer-first for RAG and has a longer track record as a dedicated search API. If you want an opinionated answer-synthesis product rather than a fast search primitive you compose yourself, Tavily's shape may fit better than its latency suggests. See our CRW vs Tavily search benchmark and the three-way Exa vs Tavily vs Firecrawl search comparison for the fuller picture.
Sources
- Search benchmark harness and results: benchmarks/triple-bench.ts (100 queries, 10 categories, single point-in-time run). fastCRW 880 ms avg / 785 ms median / 1,433 ms P95 / 73 wins; Firecrawl 954 / 932 / 1,343 / 25; Tavily 2,000 / 1,724 / 3,534 / 2; 100% success all three.
- Scrape benchmark referenced for contrast: diagnose_3way.py, Firecrawl public scrape-content-dataset-v1 (819 labeled URLs), 2026-05-08.
- fastCRW repo and pricing: github.com/us/crw · fastcrw.com · live pricing.
Related: CRW vs Tavily search benchmark · Exa vs Tavily vs Firecrawl · Search API for AI agents
