
LangChain Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Wire LangChain document loaders into fastCRW with a single api_url override. Same Firecrawl-compatible API, 6.6 MB RAM runtime, 92% coverage on the 1,000-URL benchmark.

Published: April 29, 2026

Updated: April 29, 2026

Why LangChain + fastCRW

LangChain is the dominant orchestration layer for retrieval pipelines and agent tools. fastCRW slots underneath as the scraping primitive, keeping latency and memory use low for the rest of the stack. The LangChain community already standardized around the Firecrawl document loader interface, and fastCRW is Firecrawl-compatible by design, so plugging fastCRW into a LangChain project is a one-line change. You keep every chain, retriever, and agent loop you already wrote, and you replace the heavy Firecrawl runtime with one that uses 6.6 MB of RAM and hits 92% coverage at 833 ms average latency on our 1,000-URL benchmark.

Setup

Install LangChain, the community loaders package, and the firecrawl-py client that the Firecrawl loader imports.
Sign up at fastcrw.com and grab an API key from the dashboard.
Export the key as FASTCRW_API_KEY in your shell or .env file.
Point the existing Firecrawl loader at the fastCRW base URL via the api_url argument.

pip install -U langchain langchain-community firecrawl-py
export FASTCRW_API_KEY="fcrw_..."

You do not need a separate fastCRW LangChain package. The standard FireCrawlLoader already accepts a custom api_url, and the fastCRW endpoints are wire-compatible with Firecrawl, so the loader works unchanged.
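A small guard can fail fast when the key from the setup steps is missing, rather than surfacing later as an authentication error. This is only a sketch: the helper name `get_fastcrw_key` is hypothetical and not part of fastCRW or LangChain, and it reads nothing beyond the FASTCRW_API_KEY variable described above.

```python
import os


def get_fastcrw_key(env=None):
    """Return the fastCRW API key, or raise a readable error if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("FASTCRW_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FASTCRW_API_KEY is not set. Export it in your shell "
            "or load your .env file first."
        )
    return key
```

Calling `get_fastcrw_key()` once at startup and passing the result as `api_key` keeps the failure close to its cause.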

Code Example

import os
# The class is spelled FireCrawlLoader (capital C) in langchain_community.
from langchain_community.document_loaders import FireCrawlLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# fastCRW is Firecrawl-compatible. Override api_url and the rest is identical.
loader = FireCrawlLoader(
    api_key=os.environ["FASTCRW_API_KEY"],
    api_url="https://api.fastcrw.com",
    url="https://example.com/blog",
    mode="scrape",  # or "crawl"
)

docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

print(f"Loaded {len(docs)} document(s) from fastCRW")
print(f"Split into {len(chunks)} chunks for the LangChain vector store")

For a LangChain agent tool that calls fastCRW for ad-hoc scraping inside a reasoning loop:

import os

import requests
from langchain_core.tools import tool

@tool
def fastcrw_scrape(url: str) -> str:
    """Scrape a URL via fastCRW and return the Markdown."""
    response = requests.post(
        "https://api.fastcrw.com/v1/scrape",
        headers={"Authorization": f"Bearer {os.environ['FASTCRW_API_KEY']}"},
        json={"url": url, "formats": ["markdown"]},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["data"]["markdown"]
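The tool above indexes straight into `["data"]["markdown"]`, which raises a KeyError if the response carries no Markdown. A more defensive variant might look like the sketch below. It assumes the Firecrawl v1 response shape (a `success` flag, a `data` object, and an optional `error` string); that fastCRW mirrors these exact fields is an assumption here, and `extract_markdown` is a hypothetical helper name.

```python
def extract_markdown(payload: dict) -> str:
    """Pull Markdown out of a scrape response, or return a short error string.

    Returning a string instead of raising lets the agent read the failure
    and decide what to do next.
    """
    if not isinstance(payload, dict):
        return "Scrape failed: unexpected response type"
    if payload.get("success") is False:
        # Assumed field: Firecrawl-style responses report failures under "error".
        return f"Scrape failed: {payload.get('error', 'unknown error')}"
    data = payload.get("data") or {}
    markdown = data.get("markdown") if isinstance(data, dict) else None
    if not markdown:
        return "Scrape returned no Markdown for this URL"
    return markdown
```

Inside `fastcrw_scrape`, the final line would then become `return extract_markdown(response.json())`.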

When to Use This

RAG ingestion — feed LangChain vector stores from live URLs without standing up a separate scraping service.
LangChain agents that browse — give an agent a fastcrw_scrape tool so it can fetch arbitrary pages mid-reasoning.
Document loaders for evals — run fastCRW inside LangChain pipelines to build evaluation datasets from real web content.
Migrating from Firecrawl — keep the LangChain code unchanged and swap only the API base URL to cut runtime cost.
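For the migration case, one way to keep the swap reversible is to choose the base URL and key from a small lookup table. This is a sketch under stated assumptions: `PROVIDERS` and `loader_kwargs` are hypothetical names, the fastCRW values come from this page, and the Firecrawl entries use its hosted endpoint and the conventional FIRECRAWL_API_KEY variable.

```python
import os

# Base URL and key variable per provider.
PROVIDERS = {
    "fastcrw": ("https://api.fastcrw.com", "FASTCRW_API_KEY"),
    "firecrawl": ("https://api.firecrawl.dev", "FIRECRAWL_API_KEY"),
}


def loader_kwargs(url, provider="fastcrw", mode="scrape", env=None):
    """Build keyword arguments for the Firecrawl loader for the chosen provider."""
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {provider!r}")
    env = os.environ if env is None else env
    api_url, key_var = PROVIDERS[provider]
    # A missing key raises KeyError here, before any network call is made.
    return {"url": url, "api_url": api_url, "api_key": env[key_var], "mode": mode}
```

Unpacking the result into the loader constructor with `**loader_kwargs("https://example.com/blog")` leaves the rest of the pipeline untouched, and switching back is a one-word change to `provider`.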

Limits + Gotchas

The FireCrawlLoader mode argument supports "scrape" and "crawl". For deep-crawl jobs, prefer fastCRW crawl with explicit maxDepth to keep token spend bounded.
LangChain document metadata is derived from the fastCRW response. If you depend on a specific Firecrawl metadata field that we have not yet shipped, file an issue.
LangChain JS uses the @langchain/community package. The same apiUrl override applies, but the field names follow camelCase.
Long-running crawls inside a LangChain agent loop can blow the agent's iteration budget. Run crawls outside the agent and pass results back through context.
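The last point, running the crawl outside the agent and handing results back through context, can be sketched as a helper that packs already-fetched pages into a bounded string. `build_context` and `max_chars` are hypothetical names, pages are assumed to arrive as (url, markdown) pairs, and character count is used only as a rough stand-in for token count.

```python
def build_context(pages, max_chars=8000):
    """Join (url, markdown) pairs into one string no longer than max_chars.

    Pages are added in order; the last page that fits is truncated, and
    anything after it is dropped.
    """
    parts = []
    remaining = max_chars
    for url, markdown in pages:
        header = f"## Source: {url}\n"
        # Reserve one character for the newline that ends each block.
        room = remaining - len(header) - 1
        if room <= 0:
            break  # not enough budget left for another page
        block = header + markdown[:room] + "\n"
        parts.append(block)
        remaining -= len(block)
    return "".join(parts)
```

The resulting string can go into the agent's prompt or system message, so the agent reasons over crawl output without spending iterations waiting on the crawl itself.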

Sources

LangChain FireCrawlLoader docs: https://python.langchain.com/docs/integrations/document_loaders/firecrawl

fastCRW scrape docs: /docs/scrape

FAQ

Does fastCRW work with the existing LangChain Firecrawl loader?

Yes. Pass api_url='https://api.fastcrw.com' to FireCrawlLoader and the loader calls fastCRW instead of Firecrawl with no other changes.

Can fastCRW return Markdown for LangChain text splitters?

Yes. fastCRW returns clean Markdown by default, which feeds directly into RecursiveCharacterTextSplitter and embedding chains.

How do I use fastCRW as a LangChain agent tool?

Wrap the fastCRW scrape or search call with the @tool decorator and let the LangChain agent invoke it like any other tool.

