Docs/Docs / Changelog

Changelog

Release notes and recent changes to the fastCRW API and platform.

Published
March 11, 2026
Updated
March 11, 2026
Category
docs
Track every releaseBreaking changes flaggedEngine and platform updates

2026-03-11 — Engine v0.0.8

This release focused on two themes:

  • making extraction behavior more reliable on real-world content,
  • and making the product surface easier to understand through clearer docs and validation.

Engine (CRW)

  • Wikipedia / MediaWiki onlyMainContent fixonlyMainContent: true now correctly extracts article text from Wikipedia pages (~49% size reduction). Previously the noise handler matched "toc" as a substring inside "vector-toc-available" on the <html> element, removing the entire page.
  • 3-tier noise pattern matching — noise class/id matching now uses substring (long patterns), exact-token (short/ambiguous: toc, share, social, comment, related), and prefix (ad-, ads-) matching to avoid false positives on real content.
  • Structural element guard — noise handler never removes <html>, <head>, <body>, or <main> elements.
  • Re-clean after readability — readability output is re-cleaned to strip residual noise (infobox, navbox, catlinks) inside broad containers.
  • Wikipedia-aware readability — added .mw-parser-output, #mw-content-text, #bodyContent to scored selectors; selectors wrapping >90% of body are skipped.
  • BYOK LLM extraction — per-request llmApiKey, llmProvider, llmModel fields for bring-your-own-key structured extraction without server config.
  • JSON format validationformats: ["json"] without jsonSchema now returns a 400 error instead of a warning.
  • Block detection skip — pages >50 KB skip interstitial/block detection (no more false "blocked by anti-bot" on Wikipedia).
  • Null byte protection — URLs containing %00 or null bytes are rejected at the validation layer.
  • Request timeout — default bumped from 60s to 120s.
  • Dockerfile fix — corrected cargo build flags, added config.docker.toml.

Platform

  • BYOK docs — added BYOK (llmApiKey/llmProvider/llmModel) documentation to scrape and extract docs.
  • Free tier 500 credits — free tier increased from 50 to 500 credits.
  • Pricing clarity — "Regular $XX/mo" strikethrough compare-at-price, launch end date (June 1, 2026), rate lock guarantee.
  • About page — new /about page with mission, open-source philosophy, and contact info.
  • Trust section — stats section on landing page (AGPL-3.0, 6.6 MB, 99.9%, 500 credits).
  • Validation errors — upstream 422 errors now include specific guidance about valid format names and jsonSchema requirements.
  • Header cleanup — removed Via header from responses.
  • Documentation — added docs for error codes, rate limits, credit costs, JS rendering, formats, SDK examples, MCP integration, compatibility matrix, and self-hosting hardening.

Upgrade notes

  • Re-test any extraction workflow that depends on Wikipedia or MediaWiki-style content because onlyMainContent behavior is now more aggressive and more accurate.
  • If you were relying on permissive json requests without a schema, update the client now; those requests return a 400 error in this release.
  • If you self-host, pull the latest container image so the Dockerfile and config changes land together.

2026-03-10

Initial Release

  • Scrape, crawl, and map endpoints — Firecrawl-compatible API shape.
  • Markdown-first extraction with readability scoring.
  • CSS/XPath selectors, tag include/exclude filtering.
  • BM25 and cosine similarity chunk filtering.
  • LLM-based structured extraction with JSON Schema validation.
  • JS rendering via LightPanda CDP.
  • Stealth mode with browser-realistic UA rotation.
  • Credit-based billing with Stripe integration.
  • Self-hosting support with single-binary deployment.

Release framing

The first release established the core product surface: a Firecrawl-compatible scrape, crawl, and map API with markdown-first extraction, optional browser rendering, and a path to self-hosting. Later releases should be read as refinements on top of that baseline, not a new product direction.