
Self-Hosted Search API — A DevOps Guide (2026)

Self-host search for AI agents: data residency, hardening, threat model, and operational concerns. Compares fastCRW, raw SearXNG, OrioSearch, agent-search, Vane.

Published: May 9, 2026 · Updated: May 9, 2026 · Category: alternatives
Verdict

If you need a search API in your perimeter — for data residency, vendor risk, regulatory, or cost — fastCRW is the smallest credible OSS path. Raw SearXNG works if you bring the auth and rate-limiting yourself.

  • Honest comparison of OSS self-hosted search APIs — fastCRW, SearXNG-direct, OrioSearch, agent-search
  • Threat model + hardening checklist (read-only rootfs, dropped caps, mem limits, image pinning)
  • Operational concerns: upstream rate limits, captchas, scraping rate limits, observability

Why self-host search at all

Self-hosting a search API is rarely fun. The reasons people do it anyway:

  1. Data residency. User queries are PII in regulated industries. Shipping them to a third-party US cloud can violate GDPR or HIPAA obligations and complicate SOC 2 audits in ways your security team won't sign off on.
  2. Vendor risk. Tavily was acquired by Nebius in February 2026; the deal hadn't closed by May. Healthy companies still get acquired, sunset products, change pricing. Self-hosted means you control the timeline.
  3. Cost at scale. The break-even between API credits and a self-hosted server lands somewhere around 5K–10K req/mo. Below that, paid APIs win. Above that, the math compounds in your favor (see the illustrative arithmetic after this list).
  4. Regulatory. Some workloads (gov, defense, finance) literally cannot ship outbound queries.
  5. Operational simplicity. One fewer API key, one fewer rate-limit page, one fewer external SLA in your dependency graph.
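
A rough sanity check on that break-even, with illustrative numbers. The prices below are assumptions picked for the arithmetic, not quotes from any vendor:

# Assumed: hosted API ≈ $4 per 1K requests; small VPS ≈ $25/mo
echo $(( 25 * 1000 / 4 ))   # 6250 req/mo break-even, inside the 5K–10K band above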

If none of these apply, don't self-host search. Use Tavily, Serper, or SerpAPI and move on. The ops cost is real; this page is for teams where the trade-off is favorable.

OSS self-hosted search APIs — comparison matrix

| Project | License | Architecture | Auth/Limits | Content extraction | MCP | Hardening config |
| --- | --- | --- | --- | --- | --- | --- |
| fastCRW | AGPL-3.0 | Rust + bundled SearXNG | Bearer token (optional self-host) | Yes (/v1/scrape) | Yes | Read-only rootfs, dropped caps, mem limits, pinned image |
| SearXNG (raw) | AGPL-3.0 | Python aggregator | None | No | No | You build it |
| OrioSearch | MIT | Python + SearXNG + Redis | Bearer token, timing-safe | Yes (trafilatura/readability) | No | Compose default |
| agent-search | MIT | FastAPI + SearXNG | Bearer token | Yes (9-strategy fallback chain) | Yes | Compose default + optional Tor |
| Vane (was Perplexica) | MIT | Chat UI + SearXNG | UI auth | Built-in | No | Chat product, not API |

The matrix highlights the gap: SearXNG-direct gives you search aggregation but ships none of the wrapper concerns. Each of the other projects is a different opinion on what wrapper to build.

Hardening — what fastCRW's compose stack actually does

# Excerpt from docker-compose.yml — full file in the repo
services:
  searxng:
    image: searxng/searxng:2026.4.27-... # pinned tag, NOT latest
    read_only: true                      # rootfs read-only
    cap_drop:
      - ALL                              # drop all Linux caps
    security_opt:
      - no-new-privileges:true           # no privilege escalation
    mem_limit: 512m                      # memory cap
    pids_limit: 256                      # PID cap (fork bomb mitigation)
    tmpfs:
      - /tmp:size=64m                    # writable tmp on tmpfs
    volumes:
      - ./config/searxng/settings.yml:/etc/searxng/settings.yml:ro  # config RO

What this gets you:

  • Compromised SearXNG can't write to its own filesystem (read-only rootfs).
  • Compromised SearXNG can't gain privileges (no-new-privileges, dropped caps).
  • Compromised SearXNG can't fork-bomb the host (pids_limit).
  • Memory pressure is bounded (mem_limit).
  • Image pinning prevents supply-chain drift — no auto-updating to a :latest tag.
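
You can spot-check that these flags actually landed on the running container. A minimal sketch; the container name is a guess (compose usually names it <project>-searxng-1, so check docker ps first):

docker inspect -f 'rootfs_ro={{.HostConfig.ReadonlyRootfs}} caps={{.HostConfig.CapDrop}} mem={{.HostConfig.Memory}}' \
  crw-searxng-1   # expect rootfs_ro=true caps=[ALL] mem=536870912 (512m in bytes)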

What it does NOT cover:

  • Application-level vulnerabilities in SearXNG or fastCRW (those need patching via image bumps).
  • Prompt injection in scraped content — that's an application concern, not infrastructure.
  • DDoS at the edge — put fastCRW behind a CDN or load balancer with rate limiting.

Threat model

User query → fastCRW HTTP layer → SearXNG sidecar → upstream engines
                ↓                         ↓
       (auth, rate limit)         (no internet egress
       (input validation)          beyond search engines)

| Threat | Mitigation |
| --- | --- |
| Prompt injection in scraped content reaching the LLM | Content sanitization on /v1/scrape (agent-search calls this "prompt-injection scrubbing"; fastCRW does its own version) |
| SSRF — /v1/scrape accepting http://localhost, http://169.254.169.254/, etc. | URL validation: reject loopback/link-local/private ranges; configurable allowlist |
| Resource exhaustion — a 10 GB page download | Response size cap (fastCRW caps at 10 MB by default) |
| Compromised sidecar attempts host escape | cap_drop, no-new-privileges, read-only rootfs |
| Vulnerable SearXNG version | Pinned image tag forces deliberate version bumps; subscribe to the upstream security feed |
| API key leakage | Bearer auth optional in self-host (intentional — many self-hosters run inside a private network) |
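
A quick negative test for the SSRF row. The request shape assumes /v1/scrape takes a JSON url field (adjust to the actual schema), but the expected behavior is fixed: link-local metadata addresses should come back as a 4xx, never as a fetched body:

curl -s -o /dev/null -w "%{http_code}\n" \
  -X POST http://localhost:8080/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "http://169.254.169.254/latest/meta-data/"}'
# Expect 400/403; a response containing cloud metadata is a failed deployment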

If your environment requires more, layers above this stack (network policies, mTLS, egress firewalls) are where to add them — none are blocked by the compose default.

How to deploy fastCRW (the 2-minute path)

# 1. Clone
git clone https://github.com/us/crw && cd crw

# 2. Configure
cp .env.example .env
# Optional: set CRW_API_TOKEN if you want bearer auth on
vim .env

# 3. Boot
docker compose up --build
# Stack: fastCRW (:8080) + SearXNG sidecar + Redis

# 4. Smoke test
curl -X POST http://localhost:8080/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query": "site:nist.gov password rotation guidance", "limit": 5}'

# 5. Verify health
curl http://localhost:8080/health
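
If you set CRW_API_TOKEN in step 2, requests need the token. A sketch assuming a standard Authorization: Bearer header; check the repo docs for the exact header fastCRW expects:

curl -X POST http://localhost:8080/v1/search \
  -H "Authorization: Bearer $CRW_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "test", "limit": 3}'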

For production: put it behind your existing reverse proxy / WAF / CDN. The fastCRW binary speaks plain HTTP on :8080 by design — TLS termination is your edge layer's job.

Operational concerns

Upstream rate limits

SearXNG queries Google/Bing/DuckDuckGo/Brave directly from your server's IP. At meaningful QPS, two things happen:

  1. Captchas. Google in particular fingerprints repeat-querier traffic. Mitigation: enable engine rotation in settings.yml, configure Brave Search API for a paid lane.
  2. Soft bans. Some engines simply rate-limit known IPs. Mitigation: rotate egress IPs (residential proxies, multiple VPS), or accept lower throughput.

This is the operational tax. fastCRW Cloud absorbs it; self-host means you own it.
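
What engine rotation looks like in practice: a minimal settings.yml sketch using stock SearXNG keys. Which engines to disable or weight is illustrative, not a recommendation:

# config/searxng/settings.yml
use_default_settings: true
engines:
  - name: google
    disabled: true     # drop the engine most prone to captchas
  - name: duckduckgo
    weight: 1
  - name: brave
    weight: 2          # prefer the engine with a paid lane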

Observability

The compose stack ships:

  • /health endpoint (open, JSON)
  • /tool-schema endpoint (open, JSON, lists MCP tool surface)
  • structured tracing via OpenTelemetry env vars (set OTEL_EXPORTER_OTLP_ENDPOINT)

If you have a Grafana stack, point it at the OTLP endpoint and you get latency, error rate, and per-engine breakdown out of the box.
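
Wiring that up is two lines in .env. OTEL_EXPORTER_OTLP_ENDPOINT is the standard OpenTelemetry variable; the collector address and service name below are placeholders:

# .env
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
OTEL_SERVICE_NAME=fastcrw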

Backup and disaster recovery

Stateless. fastCRW, SearXNG, and Redis hold no durable state within a deployment; there is no database to back up. Configuration lives in your .env and config/searxng/settings.yml, both of which should live in your config-management repo. Disaster recovery is a re-deploy of the compose stack on a new host.
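
The full DR runbook is four lines. A sketch assuming your config lives in a separate repo (the copy paths are hypothetical):

git clone https://github.com/us/crw && cd crw
cp /path/to/config-repo/crw/.env .env                      # hypothetical path
cp /path/to/config-repo/crw/settings.yml config/searxng/   # hypothetical path
docker compose up -d --build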

Where each option fails

  • fastCRW: at high QPS, the upstream rate limits described above kick in. At very low resource budgets, the Rust runtime is overkill — but at ~8 MB image and ~6.6 MB idle RAM, that's not a constraint anyone actually hits.
  • SearXNG-direct: zero auth, zero rate limit, zero extraction. You build all of it. That's not a flaw — it's the explicit shape — but plan for it.
  • OrioSearch: small project (~22 stars). Maintenance risk. If your team has Python expertise and the project's API shape matches your needs, it's a real option; budget for forking it later.
  • agent-search: ~25 stars, similar maintenance concern. The Tor stack adds operational surface area.
  • Vane: not an API. Chat UI. Different shape entirely.

When to pay for hosted instead

Self-hosting is the right call when one of the five reasons at the top of this page applies. If none does, paying for Tavily, Serper, SerpAPI, or fastCRW Cloud is cheaper than your team's time. The ops tax of self-hosting is real:

  • ~30 minutes of attention per week (engine rotation, image bumps, captcha investigation),
  • on-call rotation if it's user-facing,
  • one engineer who actually understands the stack when it breaks.

Most teams underestimate this until they're three months in. Plan for it honestly.
