
Vercel AI SDK Web Scraping Integration — fastCRW [Firecrawl-Compatible]

Register fastCRW as a tool in the Vercel AI SDK so generateText and streamText can scrape live web pages. It is a drop-in alternative to Firecrawl with a 6.6 MB RAM runtime and 833 ms average latency on a 1,000-URL benchmark.

Published: May 12, 2026
Updated: May 12, 2026
Category: integrations
Verdict

Register fastCRW as a tool in Vercel AI SDK with a two-minute setup. Both generateText and streamText can invoke scrape, crawl, and search operations — the same way they call any LLM tool. fastCRW outputs clean Markdown ready for LLM context windows.

  • Register fastCRW scrape/crawl/search as native Vercel AI SDK tools via the tool() helper
  • Works identically with generateText and streamText for both blocking and streaming flows
  • Zero SDK changes needed: pure REST API integration
  • 6.6 MB RAM fastCRW binary, 833 ms average latency on production benchmark

Why Vercel AI SDK + fastCRW

Vercel AI SDK is the TypeScript-first toolchain for building AI applications with Next.js, Svelte, and plain Node.js. The tool-calling feature lets models invoke arbitrary functions — perfect for web scraping. fastCRW integrates as a REST API tool, giving your LLM the ability to fetch and understand live web pages without standing up a separate scraping service.

The standard pattern: the model decides it needs to research a topic, calls your fastCRW tool, receives Markdown, and reasons about it inline. Unlike Firecrawl, fastCRW runs as a 6.6 MB binary that deploys anywhere: your laptop, a serverless function, a container, or the hosted service at fastcrw.com. The Vercel AI SDK doesn't care where fastCRW lives; it just makes HTTP calls.
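Because the SDK only makes HTTP calls, the fastCRW host can be treated as a configuration value. The following is a minimal sketch, assuming a self-hosted instance serves the same /v1/... paths as the hosted API used in this guide. The helper name fastcrwEndpoint and the FASTCRW_BASE_URL variable are illustrative assumptions, not part of fastCRW or the AI SDK.

```typescript
// Hosted API root used by the examples in this guide.
const DEFAULT_BASE_URL = "https://fastcrw.com/api";

// Build a fastCRW endpoint URL from a configurable base.
// Assumption: a self-hosted instance serves the same /v1/... paths as the hosted API.
function fastcrwEndpoint(path: "scrape" | "search", baseUrl?: string): string {
  // Fall back to the hosted API when no base is configured; drop trailing slashes.
  const base = (baseUrl || DEFAULT_BASE_URL).replace(/\/+$/, "");
  return `${base}/v1/${path}`;
}

// Hypothetical usage inside a tool's execute function:
//   fetch(fastcrwEndpoint("scrape", process.env.FASTCRW_BASE_URL), { ... })
```

With no argument the helper resolves to the hosted endpoint, so existing tool definitions keep working and only deployments that set the override change target.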

Setup

  1. Install the Vercel AI SDK, a model provider package, and zod in your Next.js or Node.js project.
  2. Sign up at fastcrw.com and grab an API key.
  3. Set FASTCRW_API_KEY in your .env.local file, alongside your model provider's key (OPENAI_API_KEY for the examples below).
  4. Define your fastCRW tools using the Vercel AI SDK tool() helper.
  5. Pass tools to generateText or streamText.

npm install ai @ai-sdk/openai zod

# .env.local
FASTCRW_API_KEY="fcrw_..."

Code Example: Scrape Tool Registration

Define the fastCRW scrape and search tools in your API route, then pass both to generateText:

import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Register the fastCRW scrape tool
const fastcrwScrape = tool({
  description:
    "Scrape a single URL via fastCRW and return Markdown content",
  parameters: z.object({
    url: z.string().url("Must be a valid URL"),
    formats: z
      .array(z.enum(["markdown", "html", "json"]))
      .optional()
      .default(["markdown"]),
  }),
  execute: async ({ url, formats }) => {
    const response = await fetch("https://fastcrw.com/api/v1/scrape", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.FASTCRW_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        url,
        formats,
      }),
    });

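    if (!response.ok) {
      // statusText is often empty over HTTP/2, so include the numeric status
      throw new Error(
        `fastCRW scrape failed: ${response.status} ${response.statusText}`
      );
    }

    const result = await response.json();
    // Prefer Markdown, then HTML; fall back to the raw payload (e.g. formats: ["json"])
    return (
      result.data.markdown || result.data.html || JSON.stringify(result.data)
    );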
  },
});

// Register the fastCRW search tool
const fastcrwSearch = tool({
  description: "Search the web via fastCRW and return top results",
  parameters: z.object({
    query: z.string(),
    limit: z.number().min(1).max(10).optional().default(5),
  }),
  execute: async ({ query, limit }) => {
    const response = await fetch("https://fastcrw.com/api/v1/search", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.FASTCRW_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        query,
        limit,
      }),
    });

    if (!response.ok) {
      throw new Error(
        `fastCRW search failed: ${response.status} ${response.statusText}`
      );
    }

    const result = await response.json();
    return JSON.stringify(result.data.results, null, 2);
  },
});

// Use both tools with generateText
export async function POST(request: Request) {
  const { userMessage } = await request.json();

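  const result = await generateText({
    model: openai("gpt-4o-mini"),
    // Allow follow-up steps so the model can read tool results and write an answer.
    // With the AI SDK 4.x default of a single step, result.text is empty
    // whenever the model's first step is a tool call.
    maxSteps: 5,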
    tools: {
      scrape: fastcrwScrape,
      search: fastcrwSearch,
    },
    system:
      "You are a research assistant. Use fastCRW tools to fetch live web content and answer questions based on current information.",
    messages: [
      {
        role: "user",
        content: userMessage,
      },
    ],
  });

  return Response.json({
    content: result.text,
  });
}

Streaming Example with streamText

For real-time streaming responses where the model calls fastCRW mid-stream:

import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const fastcrwScrape = tool({
  description: "Scrape a URL and return clean Markdown",
  parameters: z.object({
    url: z.string().url(),
  }),
  execute: async ({ url }) => {
    const response = await fetch("https://fastcrw.com/api/v1/scrape", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.FASTCRW_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url, formats: ["markdown"] }),
    });

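    // Fail loudly on blocked sites, timeouts, or auth errors
    if (!response.ok) {
      throw new Error(`fastCRW scrape failed: ${response.status}`);
    }

    const data = await response.json();
    return data.data.markdown;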
  },
});

export async function POST(request: Request) {
  const { userMessage } = await request.json();

  const stream = streamText({
    model: openai("gpt-4o-mini"),
    // Let the model continue after a tool call so the stream includes its answer
    maxSteps: 5,
    tools: {
      scrape: fastcrwScrape,
    },
    system:
      "Fetch live web content using fastCRW when needed to answer questions accurately.",
    messages: [
      {
        role: "user",
        content: userMessage,
      },
    ],
  });

  return stream.toDataStreamResponse();
}

API Route Example (Next.js App Router)

Create app/api/research/route.ts:

import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const scrapeUrl = tool({
  description: "Fetch and parse a web page",
  parameters: z.object({
    url: z.string().url(),
  }),
  execute: async ({ url }) => {
    const res = await fetch("https://fastcrw.com/api/v1/scrape", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.FASTCRW_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url }),
    });
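    // Fail loudly on blocked sites, timeouts, or auth errors
    if (!res.ok) {
      throw new Error(`fastCRW scrape failed: ${res.status}`);
    }
    const data = await res.json();
    return data.data.markdown;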
  },
});

export async function POST(request: Request) {
  const { topic } = await request.json();

  const response = await generateText({
    model: openai("gpt-4o-mini"),
    // Let the model continue after a tool call so response.text holds the summary
    maxSteps: 5,
    tools: { scrapeUrl },
    system:
      "Research topics by scraping relevant web pages. Be thorough and cite sources.",
    messages: [
      {
        role: "user",
        content: `Research this topic and summarize findings: ${topic}`,
      },
    ],
  });

  return Response.json({ result: response.text });
}
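
Any client can call this route with a plain POST. The following is a minimal sketch of a caller, assuming the route is mounted at /api/research and returns the { result } shape shown above. The researchTopic helper and its fetchImpl parameter are illustrative names, not part of fastCRW or the AI SDK.

```typescript
// Call the research route and return the summary text.
// fetchImpl is injectable so the helper can be exercised without a running server.
async function researchTopic(
  topic: string,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchImpl("/api/research", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ topic }),
  });

  // Surface server-side failures (for example a thrown fastCRW error) to the caller.
  if (!res.ok) {
    throw new Error(`Research request failed: ${res.status}`);
  }

  const { result } = (await res.json()) as { result: string };
  return result;
}
```

The relative URL works from browser code served by the same Next.js app; a server-side caller would need an absolute URL. The fastCRW API key never leaves the route, which matches the backend-only guidance in this guide.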

When to Use This

  • AI-powered research assistants: let the model fetch articles, documentation, and news in real time.
  • Summarization pipelines: scrape long-form content (blog posts, API docs) and summarize.
  • Q&A bots over live websites: the model scrapes your documentation site and answers questions.
  • Market research agents: scrape product pages, pricing, reviews, and synthesize competitive analysis.
  • Vercel deployments: the tool code is plain fetch calls, so it runs in Vercel serverless or Edge Functions, while fastCRW itself runs as the hosted API or in a container you operate.
  • Multimodal chat: combine fastCRW scrape with vision models to understand pages as text + images.

Limits + Gotchas

  • Rate limiting: fastCRW enforces per-minute and per-day rate limits. If the model calls scrape too frequently, throttle requests inside the tool's execute function.
  • Context window: scraped Markdown can be large. Summarize or truncate it before returning it to the model so you don't blow the context budget.
  • Error handling: fastCRW returns non-2xx status codes for blocked sites or timeouts. Wrap tool execution in try/catch and surface errors gracefully to the model.
  • API key exposure: never call fastCRW from the browser. Always route through a Next.js API route or edge function.
  • Streaming latency: streamText awaits each tool call before it continues, so the stream pauses while fastCRW responds. For slow sites, consider caching or pre-fetching.
  • CORS: the fastCRW API is backend-only. If you need to trigger scrapes from the frontend, host a proxy endpoint.
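
Two of these gotchas, context size and error handling, can be handled by small wrappers around a tool's execute body. The following is a minimal sketch. truncateMarkdown and safeExecute are hypothetical helper names, and the 20,000-character budget is an arbitrary assumption to tune per model.

```typescript
// Cap scraped Markdown so a single page cannot exhaust the context budget.
// maxChars is a rough character budget, not a token count.
function truncateMarkdown(markdown: string, maxChars = 20000): string {
  if (markdown.length <= maxChars) return markdown;
  return markdown.slice(0, maxChars) + "\n\n[content truncated]";
}

// Run a tool body and turn any failure into a short message the model can read,
// instead of letting the exception propagate out of the tool call.
async function safeExecute(run: () => Promise<string>): Promise<string> {
  try {
    return truncateMarkdown(await run());
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return `Scrape failed: ${reason}. Try a different URL or answer without this page.`;
  }
}
```

Inside a tool definition this would look like execute: ({ url }) => safeExecute(() => scrapePage(url)), where scrapePage stands in for the fetch logic shown earlier. Returning the failure as text lets the model decide whether to retry with another URL; rethrowing is still the better choice if the caller should see a hard error.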

Performance Notes

  • Average latency: 833 ms for HTTP scraping on the Firecrawl benchmark.
  • JS rendering: LightPanda adds ~2s, Chrome rendering adds ~4–6s.
  • Parallelism: The Vercel AI SDK can register multiple tools; the model decides which to call. Parallel fastCRW calls are subject to rate limits.
  • Caching: Implement URL-based caching in your API route to avoid redundant scrapes.
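
The caching note can be sketched as a small in-memory wrapper keyed by URL. This is an illustration under stated assumptions rather than a recommended production design: createScrapeCache is a hypothetical name, the five-minute TTL is arbitrary, and the cache only helps within one long-lived process, since serverless instances do not share memory.

```typescript
// One cache entry per URL; the stored promise may still be in flight.
type CacheEntry = { value: Promise<string>; expiresAt: number };

// Wrap a scrape function with a TTL cache keyed by URL.
// `now` is injectable so expiry can be tested without real timers.
function createScrapeCache(
  scrape: (url: string) => Promise<string>,
  ttlMs = 5 * 60 * 1000,
  now: () => number = Date.now,
) {
  const entries = new Map<string, CacheEntry>();

  return function cachedScrape(url: string): Promise<string> {
    const hit = entries.get(url);
    if (hit && hit.expiresAt > now()) return hit.value;

    // Store the promise itself so concurrent calls for one URL share a request.
    const value = scrape(url);
    entries.set(url, { value, expiresAt: now() + ttlMs });

    // Do not keep failed scrapes in the cache.
    value.catch(() => {
      if (entries.get(url)?.value === value) entries.delete(url);
    });
    return value;
  };
}
```

Because the promise is cached rather than the resolved text, parallel tool calls for the same URL issue a single fastCRW request, which also reduces pressure on rate limits.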
