Supported values

Format	Meaning
`markdown`	Clean markdown output (default)
`html`	Cleaned HTML after extraction pipeline
`rawHtml`	Raw fetched HTML
`plainText`	Plain text view
`links`	Extracted absolute links
`json`	Structured extraction result when `jsonSchema` is provided
`extract`	Alias for `json` — accepted for Firecrawl compatibility

You can request multiple formats in a single call: formats: ["markdown", "html", "links"].

Which Format Should You Choose?

The practical rule is simple:

choose markdown when the output is headed into search, RAG, summarization, or LLM prompts,
choose html when you still want cleaned structure,
choose rawHtml only when you truly need the original source,
choose links when discovery matters as much as page content,
and choose json when the end result needs to be schema-shaped.

For most product and retrieval workflows, markdown is the best default because it is compact, readable, and easier to inspect than raw markup.

Common Format Combinations

Combination	Good for
`["markdown"]`	Default page extraction
`["markdown", "links"]`	Content plus local link discovery
`["html", "rawHtml"]`	Debugging the extraction pipeline
`["json"]`	Structured extraction only
`["markdown", "json"]`	Human-readable content plus structured fields

Response shape

Each format populates a corresponding field in the response data object:

Format	Response field	Type
`markdown`	`markdown`	`string`
`html`	`html`	`string`
`rawHtml`	`rawHtml`	`string`
`plainText`	`plainText`	`string`
`links`	`links`	`string[]`
`json` / `extract`	`json`	`object`

Full response schema

Every API response follows this envelope:

{
  "success": true,           // false if the request or target failed
  "data": { ... },           // present on success (scrape/crawl data)
  "error": "...",            // present on failure — human-readable message
  "warning": "..."           // present when something non-fatal happened
}

The exact shape of data depends on what you requested. Do not assume every field is always present.

`data` object (scrape)

Field	Type	Present when
`markdown`	`string \| null`	`formats` includes `markdown` or `json`
`html`	`string \| null`	`formats` includes `html`
`rawHtml`	`string \| null`	`formats` includes `rawHtml`
`plainText`	`string \| null`	`formats` includes `plainText`
`links`	`string[] \| null`	`formats` includes `links`
`json`	`object \| null`	`formats` includes `json` AND `jsonSchema` provided AND LLM configured
`chunks`	`ChunkResult[] \| null`	`chunkStrategy` provided
`warning`	`string \| null`	Target returned error status, anti-bot detected, etc.
`metadata`	`object`	Always

`metadata` object

Field	Type	Description
`title`	`string \| null`	Page `<title>`
`description`	`string \| null`	Meta description
`ogTitle`	`string \| null`	Open Graph title
`ogDescription`	`string \| null`	Open Graph description
`ogImage`	`string \| null`	Open Graph image URL
`canonicalUrl`	`string \| null`	Canonical link
`sourceURL`	`string`	Final URL after redirects
`language`	`string \| null`	`<html lang>` value
`statusCode`	`number`	Target HTTP status code
`renderedWith`	`string \| null`	`"cdp"`, `"http_only"`, or `"http_only_fallback"`
`elapsedMs`	`number`	Total processing time in ms

`ChunkResult` object

Field	Type	Description
`content`	`string`	Chunk text
`score`	`number \| null`	Relevance score (present when `query` + `filterMode` set)
`index`	`number`	Original chunk position

Format aliases

"extract" and "llm-extract" are accepted as aliases for "json". The canonical name is json. All three behave identically — they require jsonSchema for structured extraction.

Implementation Guidance

Three habits keep format usage sane in production:

request only the formats you really consume,
keep metadata with the stored output so later debugging is easier,
and validate data.json in your own application before trusting it as final truth.

If you are debugging extraction quality, request both markdown and json for a while. That makes it easy to compare the page text against the structured output.

Not supported in this release

screenshot — not implemented. Requesting it will return a 422 error.
actions — click/scroll/wait actions are not yet supported. Sending actions will return a 400 error with a message suggesting cssSelector or xpath as alternatives.

If your workload depends on screenshot capture or browser actions, do not assume they exist in the managed cloud.

Formats Reference