CrewAI Web Scraping Integration — fastCRW [Firecrawl-Compatible]
Give every CrewAI agent a fastCRW scrape and search tool — via the official crewai-crw package or a hand-rolled BaseTool subclass. A single static Rust binary, local-first, Firecrawl-compatible for low-friction migration.
Install the official crewai-crw package for ready-made fastCRW tools, or define one BaseTool subclass and share it across every CrewAI agent and crew that needs live web context.
Verdict
CrewAI organizes multi-agent workflows around explicit roles, tasks, and tools — and almost every crew needs at least one agent that pulls live web context. fastCRW is the scraping tool for that agent: install the official crewai-crw package for ready-made tools, or define one BaseTool subclass and share it across the whole crew. fastCRW returns clean Markdown so the agent's context window stays focused on reasoning rather than HTML cleanup, and because it is Firecrawl-compatible on the scrape, crawl, map, and search surface, a CrewAI tool written for Firecrawl ports over with a single base-URL change.
Who This Is For
- Teams building research crews — one agent searches, another scrapes the top results, a third synthesizes a report.
- Sales and growth automations — a CrewAI agent enriches leads by scraping company sites before drafting outreach.
- Monitoring crews — a scheduled crew scrapes competitor pages, pricing, or changelogs on a cron.
- Firecrawl users on CrewAI — you want lower runtime cost and a smaller image without redesigning the crew.
Setup
1. Install CrewAI and crewai-crw
pip install -U crewai crewai-crw
The recommended path is the official crewai-crw package, which provides ready-made fastCRW tools. If you would rather see exactly what the tool does — or you need custom request parameters — hand-rolling a BaseTool subclass is straightforward and shown below.
2. Provision a fastCRW API key
Sign up at fastcrw.com, copy the API key, and export it where the CrewAI process can read it:
export FASTCRW_API_KEY="fcrw_..."
The free tier ships 500 one-time lifetime credits. A scrape is 1 credit, search is 1 credit per query, and crawl is 1 credit per page — predictable enough to budget a recurring crew.
Defining fastCRW Tools
A BaseTool subclass wraps an HTTP call to fastCRW and exposes it to any agent. Define a scrape tool and a search tool once, then reuse them across the crew:
import os
import requests
from typing import Type
from pydantic import BaseModel, Field
from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
FASTCRW_BASE = "https://api.fastcrw.com"
HEADERS = {"Authorization": f"Bearer {os.environ['FASTCRW_API_KEY']}"}
class FastCRWScrapeInput(BaseModel):
url: str = Field(..., description="The URL to scrape.")
class FastCRWScrapeTool(BaseTool):
name: str = "fastcrw_scrape"
description: str = "Scrape a single URL via fastCRW and return clean Markdown. Use when you have a known page to read."
args_schema: Type[BaseModel] = FastCRWScrapeInput
def _run(self, url: str) -> str:
try:
r = requests.post(
f"{FASTCRW_BASE}/v1/scrape",
headers=HEADERS,
json={"url": url, "formats": ["markdown"], "onlyMainContent": True},
timeout=60,
)
r.raise_for_status()
return r.json()["data"]["markdown"]
except requests.HTTPError as e:
# Return a string so the agent can recover instead of crashing the crew.
return f"fastCRW scrape failed for {url}: {e}"
class FastCRWSearchInput(BaseModel):
query: str = Field(..., description="Web search query.")
class FastCRWSearchTool(BaseTool):
name: str = "fastcrw_search"
description: str = "Search the live web via fastCRW. Returns ranked results with URLs. Use when you need to find sources."
args_schema: Type[BaseModel] = FastCRWSearchInput
def _run(self, query: str) -> list[dict]:
r = requests.post(
f"{FASTCRW_BASE}/v1/search",
headers=HEADERS,
json={"query": query, "limit": 5},
timeout=60,
)
r.raise_for_status()
return r.json()["data"]
A Complete Research Crew
With the tools defined, compose a two-agent crew: a researcher that searches and scrapes, and a writer that turns findings into a brief.
scrape = FastCRWScrapeTool()
search = FastCRWSearchTool()
researcher = Agent(
role="Web research analyst",
goal="Find and read primary sources via fastCRW",
backstory="Skilled at distilling raw web pages into structured findings.",
tools=[search, scrape],
)
writer = Agent(
role="Technical writer",
goal="Turn research findings into a clear, cited brief",
backstory="Writes concise summaries that always attribute their sources.",
)
research_task = Task(
description="Research the current open-source web scraping APIs for AI agents.",
expected_output="A bulleted list of tools with one-line descriptions and source URLs.",
agent=researcher,
)
write_task = Task(
description="Write a 200-word brief from the research findings.",
expected_output="A 200-word brief with inline source links.",
agent=writer,
)
crew = Crew(agents=[researcher, writer], tasks=[research_task, write_task])
result = crew.kickoff()
print(result)
The same shared scrape and search instances can be handed to any number of agents — the API key, base URL, and any caching you add live in one place.
Why fastCRW Under CrewAI
A research crew is only as good as the pages its agents actually read. On Firecrawl's own public 1,000-URL scrape-content-dataset-v1, scored by the open diagnose_3way.py harness on 2026-05-08, fastCRW recovered the labeled content on 63.74% of 819 labeled URLs — ahead of Crawl4AI (59.95%) and Firecrawl (56.04%). Median latency was 1914 ms (p50), in the same band as Crawl4AI (1916 ms) and Firecrawl (2305 ms); fastCRW publishes the full p50/p90/p99 split, including a p90 of 14157 ms that is the honest cost of the chrome-stealth fallback recovering hard pages. The full table and a one-command repro are on the 1,000-URL benchmark page.
Limits + Gotchas
- CrewAI tools run synchronously by default. fastCRW crawl is async and returns a job ID — prefer
scrapeandsearchfor in-loop work, and run crawls as a separate step. - Tool errors raised inside
_runpropagate up and can abort the crew. Catch HTTP errors and return an explanatory string so the agent can recover, as the scrape tool above does. - The
descriptionfield on each tool is load-bearing — CrewAI agents pick tools by reading it. State precisely when fastCRW should be used. - CrewAI does not deduplicate tool calls. If a crew scrapes the same URLs repeatedly, add a cache at the application layer (a dict keyed by URL is enough to start).
- Structured extraction (
formats: ["json"]with ajsonSchema) costs 5 credits. Use plain Markdown scraping unless you genuinely need schema-shaped output.
Related
Continue exploring
More from Integrations
Migrate from Firecrawl to fastCRW in 10 Minutes — Drop-In Guide
Migrate from Jina Reader to fastCRW — URL-to-Markdown Upgrade Guide
Migrate from Tavily to fastCRW — Search API Migration Guide
Migrate from Tavily search API to fastCRW POST /v1/search. fastCRW search averaged 880 ms across a 100-query benchmark, and adds scrape, crawl, and map. Param mapping table, before/after code, and honest gaps (answer mode: managed on paid plans or bring-your-own-key; no domain filters).
Cursor Web Scraping Integration — fastCRW [Firecrawl-Compatible]
Add fastCRW as an MCP server in Cursor IDE. Configure ~/.cursor/mcp.json, then scrape, search, crawl, and map web pages from within your agent prompts. A single static Rust binary, local-first.
Claude Code Web Scraping Integration — fastCRW [MCP Server]
Add fastCRW as a Claude Code MCP server. One npx command registers scrape, search, crawl, map, and crawl-status tools. A single static Rust binary, local-first, self-host free under AGPL-3.0.
Related hubs
