Documentation.
Interface protocols for autonomous agents and data-driven systems.
Benchmarked against Firecrawl scrape-content-dataset-v1 (1,000 real-world URLs).
Authentication
Include your secret key in the Authorization header. All production endpoints require a valid Bearer token.
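As a minimal sketch of the header described above, the helper below builds the Authorization header in Python. The function name `auth_headers` and the key value in the usage line are hypothetical; only the Bearer scheme comes from this page.

```python
def auth_headers(secret_key: str) -> dict:
    """Return HTTP headers carrying the secret key as a Bearer token."""
    # Reject empty keys and keys with stray whitespace, which usually
    # indicate a copy-paste mistake rather than a valid credential.
    if not secret_key or secret_key != secret_key.strip():
        raise ValueError("secret key must be non-empty with no surrounding whitespace")
    return {"Authorization": f"Bearer {secret_key}"}
```

Usage: `auth_headers("YOUR_SECRET_KEY")` returns a dict that can be passed as the headers of any HTTP client.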
Power Terminal
The Playground is a live debugging interface.
Authenticated_Sync
Logged-in users execute real-time extractions. The terminal handles Job IDs and polling automatically, displaying live progress.
Credit_System
Successful playground executions deduct credits from your account. If your balance is insufficient, execution is blocked until you upgrade.
Endpoints
/v1/scrape
Instantly extract content from a single URL. Returns the final data payload immediately.
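A minimal Python sketch of a single-URL call, using only the standard library. The host is taken from the polling example elsewhere on this page, and the `success`/`data` response fields from the output sample in this section. The POST method, the JSON content type, and the helper names `build_scrape_request` and `parse_scrape_response` are assumptions, not confirmed API details.

```python
import json
import urllib.request

BASE_URL = "https://api.fastcrw.com"  # host as shown in the polling example


def build_scrape_request(secret_key, url, formats=("markdown",)):
    """Build (but do not send) a request for /v1/scrape.

    Assumption: the endpoint accepts a JSON body via POST.
    """
    body = json.dumps({"url": url, "formats": list(formats)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/scrape",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {secret_key}",
            "Content-Type": "application/json",
        },
    )


def parse_scrape_response(raw):
    """Decode a response body and return its `data` payload."""
    payload = json.loads(raw)
    if not payload.get("success"):
        raise RuntimeError(f"scrape failed: {payload}")
    return payload["data"]
```

To send the request, pass it to `urllib.request.urlopen` and hand the bytes from `.read()` to `parse_scrape_response`.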
Request_Sample
{
"url": "https://example.com",
"formats": ["markdown"],
"onlyMainContent": true,
// Optional: target specific elements
"cssSelector": "article",
"xpath": "//h1",
// Optional: chunking for RAG
"chunkStrategy": { "type": "topic" },
"query": "memory safety",
"filterMode": "bm25",
"topK": 5,
// Optional: stealth & proxy
"stealth": true,
"proxy": "http://user:pass@host:8080"
}
Verified_Output
{
"success": true,
"data": { "markdown": "...", "metadata": { ... } }
}
/v1/map
Build complete sitemaps. For larger domains, this endpoint returns an id for asynchronous polling.
/v1/crawl
Deep crawling engine. Always returns a Job ID immediately.
Polling_Request
curl https://api.fastcrw.com/v1/crawl/JOB_ID
Job_Response
{
"success": true,
"id": "589a4d6c-0e7a..."
}
Job Polling
Extractions such as Crawl and Map may take up to 2 minutes. Clients should poll the status endpoint every 2-5 seconds until status is "completed" or "failed".
Job_Lifecycle
- active: Job queued or running
- completed: Data payload ready
- failed: Upstream error encountered
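The lifecycle above can be sketched as a small polling loop. This is an illustration under stated assumptions: `fetch_status` is a hypothetical caller-supplied function that performs one GET on the status endpoint and returns the decoded JSON body, and the defaults (3-second interval, 120-second timeout) are simply picked from the ranges given on this page.

```python
import time


def poll_job(fetch_status, interval_s=3.0, timeout_s=120.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reports "completed"; raise on "failed" or timeout.

    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        body = fetch_status()
        status = body.get("status")
        if status == "completed":
            return body
        if status == "failed":
            raise RuntimeError(f"job failed: {body}")
        # Any other status (for example "active") means keep waiting.
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

Passing the fetch function in, rather than hard-coding an HTTP client, keeps the loop independent of how the status request is authenticated and sent.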
Advanced
CSS Selector & XPath
Extract a specific part of the page before conversion. Pass cssSelector or xpath to narrow the DOM — readability scoring is bypassed automatically.
CSS_Selector
{
"url": "https://news.ycombinator.com",
"formats": ["markdown"],
"cssSelector": "td.title",
"onlyMainContent": false
}
XPath
{
"url": "https://news.ycombinator.com",
"formats": ["markdown"],
"xpath": "//span[@class='titleline']/a"
}
Chunking & Filtering
Split scraped Markdown into chunks for vector databases and RAG pipelines. Results appear in data.chunks. Combine with query + filterMode to rank by relevance.
Topic_Chunks
{
"url": "https://en.wikipedia.org/wiki/Rust",
"chunkStrategy": { "type": "topic" },
"query": "memory safety ownership",
"filterMode": "bm25",
"topK": 5
}
Sentence_Chunks
{
"chunkStrategy": {
"type": "sentence",
"maxChars": 500
}
}
Strategy_Reference
- topic: Split on # headings
- sentence: Split on .!? boundaries
- regex: Custom delimiter pattern
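To make the three strategies concrete, here is a local Python sketch of what each split rule could look like. It is an illustration only, not the engine's implementation: the function name `chunk_markdown` is hypothetical, the exact boundary rules are assumptions, and options such as maxChars are ignored.

```python
import re


def chunk_markdown(text, strategy, pattern=None):
    """Split Markdown text into chunks using one of three illustrative rules."""
    if strategy == "topic":
        # Start a new chunk at every line that begins with 1-6 '#' characters.
        parts = re.split(r"(?m)^(?=#{1,6}\s)", text)
    elif strategy == "sentence":
        # Split at whitespace that follows '.', '!' or '?'.
        parts = re.split(r"(?<=[.!?])\s+", text)
    elif strategy == "regex":
        if pattern is None:
            raise ValueError("the regex strategy needs a delimiter pattern")
        parts = re.split(pattern, text)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Drop empty pieces and trim surrounding whitespace.
    return [part.strip() for part in parts if part.strip()]
```

For example, `chunk_markdown(markdown, "topic")` keeps each heading together with the text beneath it, which is usually the unit you want to embed for retrieval.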
Filter_Mode
- bm25: Keyword relevance
- cosine: TF-IDF similarity
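For intuition about the bm25 mode, the sketch below ranks chunks against a query with the standard Okapi BM25 formula and keeps the best top_k. It is a self-contained illustration, not the service's scoring code: the tokenizer, the parameter values k1 = 1.5 and b = 0.75, and the function names are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_top_k(chunks, query, top_k=5, k1=1.5, b=0.75):
    """Return up to top_k chunks ordered by descending BM25 score."""
    docs = [tokenize(chunk) for chunk in chunks]
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n
    # Document frequency: number of chunks that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    query_terms = set(tokenize(query))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization penalizes chunks longer than average.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [chunks[i] for i in ranked[:top_k]]
```

Chunks that share no term with the query score zero and sort last, so a small top_k effectively filters them out.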
Stealth & Proxy
Reduce bot-detection fingerprinting. When stealth is true, CRW rotates the User-Agent from a pool of real Chrome, Firefox, and Safari strings and injects 12 browser-like headers.
Stealth_Mode
{
"url": "https://example.com",
"stealth": true
}
Per_Request_Proxy
{
"url": "https://example.com",
"proxy": "http://user:pass@host:8080"
}
Self-Hosting (AGPL-3.0)
Run the FASTCRW engine on your own hardware, with no monthly fees and concurrency limited only by your own resources.