SearxNG — Operator and Agent User Guide

This guide covers how operators and LLM agents interact with SearxNG: querying directly, using haiven-mcp tools, reading engine health, and troubleshooting.

Primary Usage Path: haiven-mcp

The preferred way to use SearxNG from agent context is through the haiven-mcp search/web and search/and_fetch tools. These handle engine preset selection, error handling, retry logic, and return structured results.

search/web (JSON-RPC via haiven-mcp)

{
  "method": "tools/call",
  "params": {
    "name": "search/web",
    "arguments": {
      "query": "RAG retrieval augmented generation survey",
      "source_type": "academic",
      "num_results": 10
    }
  }
}
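A minimal Python sketch of building this payload, for agents that assemble the call themselves. The helper name `build_search_call` is hypothetical, and the `jsonrpc` and `id` fields are assumptions taken from the standard JSON-RPC 2.0 envelope, which the example above omits. How the request is transported to haiven-mcp is left out because this guide does not specify it.

```python
import json
from itertools import count

# source_type values documented in this guide
SOURCE_TYPES = {"academic", "news", "code", "social", "primary", "general"}
TOOLS = {"search/web", "search/and_fetch"}

_request_ids = count(1)  # simple increasing request id (assumption, see below)


def build_search_call(query, source_type="general", num_results=10, tool="search/web"):
    """Return a tools/call payload for search/web or search/and_fetch."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source_type: {source_type}")
    if not query.strip():
        raise ValueError("query must not be empty")
    return {
        "jsonrpc": "2.0",          # assumption: standard JSON-RPC 2.0 envelope
        "id": next(_request_ids),  # assumption: caller-chosen request id
        "method": "tools/call",
        "params": {
            "name": tool,
            "arguments": {
                "query": query,
                "source_type": source_type,
                "num_results": num_results,
            },
        },
    }


payload = build_search_call("RAG retrieval augmented generation survey", "academic")
print(json.dumps(payload, indent=2))
```

Validating `source_type` before sending catches typos locally instead of relying on the tool's error handling.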

source_type presets (post-Phase-2 hardening — only healthy engines):

| source_type | Engines used |
|---|---|
| academic | google scholar, crossref, arxiv, openalex |
| news | google news, bing news |
| code | github, gitlab, stackexchange, sourcehut |
| social | reddit, mastodon, lemmy |
| primary | wikipedia, wikidata |
| general | All default engines |

Notes:
- arxiv routes through Tor (searxng-tor sidecar) — was 100% failing before hardening, now 0% errors.
- pubmed, semantic scholar, reuters, yahoo news, stackoverflow, wikiquote are intentionally absent from all presets.
- brave and duckduckgo are in SearxNG's default engine pool but excluded from all mcp presets — they remain unresponsive (Cloudflare-blocked).
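The preset table and exclusion notes can be mirrored as data. This is useful when a direct HTTP caller wants the same engine selection as the mcp tools. The sketch below copies the engine names from this guide. The names `PRESETS`, `EXCLUDED`, `engines_for`, and `preset_violations` are hypothetical, and the real haiven-mcp implementation may store its presets differently.

```python
# Engine presets as listed in this guide; None means "all default engines".
PRESETS = {
    "academic": ["google scholar", "crossref", "arxiv", "openalex"],
    "news": ["google news", "bing news"],
    "code": ["github", "gitlab", "stackexchange", "sourcehut"],
    "social": ["reddit", "mastodon", "lemmy"],
    "primary": ["wikipedia", "wikidata"],
    "general": None,
}

# Engines this guide says are absent or excluded from every preset.
EXCLUDED = {
    "pubmed", "semantic scholar", "reuters", "yahoo news",
    "stackoverflow", "wikiquote", "brave", "duckduckgo",
}


def engines_for(source_type):
    """Return a comma-separated engines value, or None for the default pool."""
    engines = PRESETS[source_type]  # raises KeyError for unknown presets
    return None if engines is None else ",".join(engines)


def preset_violations():
    """Return {preset: [excluded engines found in it]} for any bad preset."""
    bad = {}
    for name, engines in PRESETS.items():
        overlap = sorted(set(engines or []) & EXCLUDED)
        if overlap:
            bad[name] = overlap
    return bad


print(engines_for("academic"))
print(preset_violations())
```

Running `preset_violations()` after editing a preset is a cheap guard against reintroducing an engine that was deliberately removed.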

search/and_fetch

Same parameters as search/web, but additionally fetches and parses the content of the top result URLs. Use this when you need page content, not just snippets.

Direct HTTP Query (internal callers)

From any container on the backend network (10.10.1.0/24), SearxNG is reachable at http://searxng:8080. No authentication required (internal path bypasses Traefik and Authentik).

# --data-urlencode keeps spaces and special characters in the query intact
curl -s -X POST "http://searxng:8080/search" \
  --data-urlencode "q=kubernetes ingress controller" \
  -d "format=json" | jq '.results[:3]'

With engine selection

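# Restrict the search to specific engines (comma-separated names)
curl -s -X POST "http://searxng:8080/search" \
  --data-urlencode "q=arxiv transformer architecture" \
  --data-urlencode "engines=arxiv,google scholar" \
  -d "format=json" \
  | jq '.results[:5] | .[] | {title, url, engine}'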

With category and time filter

curl -s -X POST "http://searxng:8080/search" \
  --data-urlencode "q=AI safety" \
  -d "format=json&categories=news&time_range=week" \
  | jq '.results[:5] | .[] | .title'

Query parameters

| Parameter | Description | Example |
|---|---|---|
| q | Search query (required) | docker networking |
| format | Response format | json (agents), csv, rss |
| categories | SearxNG categories | general, news, it, science |
| engines | Comma-separated engine names | arxiv,crossref,openalex |
| language | Language code | en-US |
| pageno | Page number | 1, 2 |
| time_range | Recency filter | day, week, month, year |
| safesearch | Safety level | 0 (off), 1 (moderate), 2 (strict) |
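For callers that build requests in Python rather than curl, the parameters above map onto a form-encoded body. The sketch below uses only the standard library. The function names `build_search_body` and `search` are hypothetical, and the validation rules simply restate the values listed in the table. `search` performs network I/O and is defined but not called here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SEARCH_URL = "http://searxng:8080/search"  # internal address from this guide
TIME_RANGES = {"day", "week", "month", "year"}


def build_search_body(q, *, categories=None, engines=None, language=None,
                      pageno=None, time_range=None, safesearch=None, fmt="json"):
    """Return a form-encoded request body for the /search endpoint."""
    if not q or not q.strip():
        raise ValueError("q is required")
    if time_range is not None and time_range not in TIME_RANGES:
        raise ValueError(f"time_range must be one of {sorted(TIME_RANGES)}")
    if safesearch is not None and safesearch not in (0, 1, 2):
        raise ValueError("safesearch must be 0, 1, or 2")

    params = {"q": q, "format": fmt}
    if categories:
        params["categories"] = ",".join(categories)
    if engines:
        params["engines"] = ",".join(engines)
    if language:
        params["language"] = language
    if pageno is not None:
        params["pageno"] = pageno
    if time_range is not None:
        params["time_range"] = time_range
    if safesearch is not None:
        params["safesearch"] = safesearch
    # urlencode percent-encodes values, so spaces and commas survive transport
    return urlencode(params)


def search(q, timeout=10.0, **options):
    """POST the query and return the decoded JSON response (network I/O)."""
    body = build_search_body(q, **options).encode("ascii")
    request = Request(SEARCH_URL, data=body, method="POST")
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


print(build_search_body("AI safety", categories=["news"], time_range="week"))
# prints: q=AI+safety&format=json&categories=news&time_range=week
```

From a container on the backend network, `search("docker networking", engines=["arxiv", "crossref"])` would issue the same request as the curl examples.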

Response structure

{
  "query": "your query",
  "number_of_results": 42,
  "results": [
    {
      "title": "Result Title",
      "url": "https://example.com/...",
      "content": "Snippet text...",
      "engine": "google scholar",
      "score": 0.9,
      "category": "science"
    }
  ],
  "suggestions": [],
  "answers": [],
  "infoboxes": []
}
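A short sketch of consuming this structure: it keeps one entry per URL (the highest-scoring one), sorts by score, and trims to a limit. The helper name `top_results` is hypothetical. SearxNG may already merge duplicate URLs, so the de-duplication here is defensive. Fields missing from a result are treated as empty instead of raising.

```python
def top_results(response, limit=5):
    """Return up to `limit` results, de-duplicated by URL, best score first."""
    best = {}
    for item in response.get("results", []):
        url = item.get("url")
        if not url:
            continue  # skip malformed entries without a URL
        score = item.get("score") or 0.0
        if url not in best or score > best[url]["score"]:
            best[url] = {
                "title": item.get("title", ""),
                "url": url,
                "snippet": item.get("content", ""),
                "engine": item.get("engine", ""),
                "score": score,
            }
    ranked = sorted(best.values(), key=lambda r: r["score"], reverse=True)
    return ranked[:limit]


# Hypothetical response in the shape shown above
sample = {
    "query": "your query",
    "results": [
        {"title": "A", "url": "https://example.com/a", "content": "first",
         "engine": "arxiv", "score": 0.4},
        {"title": "B", "url": "https://example.com/b", "content": "second",
         "engine": "crossref", "score": 0.9},
    ],
}
for result in top_results(sample, limit=2):
    print(result["score"], result["engine"], result["url"])
```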

Reading Engine Health

From inside the container (most reliable — bypasses limiter)

docker exec searxng wget -qO- http://localhost:8080/stats/errors | jq .

The output is a JSON object keyed by engine name. Each entry contains:
- error: exception class (e.g. SearxEngineTooManyRequestsException)
- count: error count since last reset
- percentage: error rate 0–100

Engines absent from the output are healthy (0% errors).
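The rule above (listed engines have errors, absent engines are healthy) can be applied programmatically once the JSON is loaded. The sketch assumes the field names described in this section (`error`, `count`, `percentage`). The exact payload may differ between SearxNG versions, so check it against real output first. The helper name `failing_engines` and the 50% default threshold are hypothetical choices. An engine's entry is accepted either as a single object or as a list of objects.

```python
def failing_engines(stats, threshold=50.0):
    """Return {engine: worst error percentage} for engines at or above threshold."""
    failing = {}
    for engine, entry in stats.items():
        # Accept one object per engine or a list of error objects
        entries = entry if isinstance(entry, list) else [entry]
        worst = max((e.get("percentage", 0) for e in entries), default=0)
        if worst >= threshold:
            failing[engine] = worst
    return failing


# Hypothetical payload in the shape described above
sample = {
    "duckduckgo": [{"error": "SearxEngineCaptchaException", "count": 12, "percentage": 100}],
    "brave": [{"error": "SearxEngineTooManyRequestsException", "count": 5, "percentage": 50}],
}
for engine, pct in sorted(failing_engines(sample).items()):
    print(f"{engine}: {pct}% errors")
```

The input can come from the `docker exec` command above, for example by piping its output into `json.load(sys.stdin)`.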

From Prometheus metrics

# All engine error percentages
curl -s http://searxng-metrics:9109/metrics | grep searxng_engine_error_percentage

# Total failing engine count (alert input)
curl -s http://searxng-metrics:9109/metrics | grep searxng_engines_with_errors_count

# Exporter health
curl -s http://searxng-metrics:9109/metrics | grep searxng_scrape_success
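When grep is not enough, the exporter output can be parsed as Prometheus text exposition. The sketch below is a simplified parser that handles plain samples and quoted label values. It ignores HELP/TYPE lines and does not cover every corner of the format, such as a `}` inside a label value. The metric names come from this guide. The label name `engine` in the sample is an assumption, and `parse_metrics` is a hypothetical helper.

```python
import re

_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text, metric):
    """Return [(labels_dict, value)] for every sample of `metric` in `text`."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        rows.append((labels, float(match.group("value"))))
    return rows


# Hypothetical exporter output; the "engine" label name is an assumption
sample = """\
# TYPE searxng_engine_error_percentage gauge
searxng_engine_error_percentage{engine="brave"} 50
searxng_engines_with_errors_count 4
searxng_scrape_success 1
"""
print(parse_metrics(sample, "searxng_engine_error_percentage"))
print(parse_metrics(sample, "searxng_scrape_success"))
```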

From Grafana

Dashboard URL: https://grafana.haiven.site → browse to 09 - AI folder → SearxNG Engine Health
Dashboard UID: searxng-engine-health

Panels: current engines-with-errors count, scrape health indicator, error percentage per engine over time, error class breakdown, currently-failing engines table.

Expected Steady-State Engine Health

After Phase 3b hardening (2026-05-02):

| Engine | Expected state | Reason |
|---|---|---|
| arxiv | Healthy (0%) | Routes through Tor |
| google scholar | Healthy | Direct |
| crossref | Healthy | Direct |
| openalex | Healthy | Direct |
| startpage | Healthy | Direct (was broken via Tor in Phase 3a, restored) |
| wikipedia | Healthy | Direct |
| wikidata | Healthy | Direct |
| duckduckgo | Failing (~100%) | Cloudflare CAPTCHA, not in any preset |
| brave | Failing (~50%) | Cloudflare rate limit, not in any preset |
| karmasearch | Failing (100%) | Access denied, not in any preset |

searxng_engines_with_errors_count at steady state: approximately 4 (duckduckgo, brave, karmasearch, karmasearch videos; the last is not listed separately in the table above). The SearxNGEnginesDegraded alert threshold is >= 4 for 10m, so the steady-state count alone already satisfies the alert condition. If the alert fires continuously at steady state, raise the threshold to >= 5.
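To separate expected failures from new ones, the current failing set can be compared against the steady-state baseline. The sketch below hard-codes the four engines and the threshold of 4 from this section. The names `EXPECTED_FAILING` and `compare_to_baseline` are hypothetical, and the 10-minute hold time of the alert is not modelled.

```python
# Engines this guide expects to be failing at steady state
EXPECTED_FAILING = frozenset(
    {"duckduckgo", "brave", "karmasearch", "karmasearch videos"}
)
ALERT_THRESHOLD = 4  # SearxNGEnginesDegraded condition: count >= 4


def compare_to_baseline(failing):
    """Classify the currently failing engines against the expected baseline."""
    failing = set(failing)
    return {
        "unexpected": sorted(failing - EXPECTED_FAILING),  # needs attention
        "recovered": sorted(EXPECTED_FAILING - failing),   # baseline may be stale
        "alert_condition_met": len(failing) >= ALERT_THRESHOLD,
    }


print(compare_to_baseline(EXPECTED_FAILING))
print(compare_to_baseline({"duckduckgo", "brave", "arxiv"}))
```

The first call shows why the alert can fire with nothing new broken: the baseline alone meets the threshold.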

Troubleshooting

Rate-limited responses (HTTP 429) from internal callers

Symptom: Internal MCP calls or direct curl requests return 429 Too Many Requests.

Cause: The config/limiter.toml pass_ip list does not include the caller's subnet.

Check: What IP does the caller appear as?

# From the host, search the searxng container logs for the 429-producing IP
# (2>&1 is needed because docker logs replays the container's stderr on stderr)
docker logs --tail 100 searxng 2>&1 | grep 429

Fix: Add the caller's subnet to pass_ip in config/limiter.toml:

# pass_ip belongs to the ip_lists table in SearxNG's limiter.toml schema
[botdetection.ip_lists]
pass_ip = [
  "10.10.0.0/24",   # web network
  "10.10.1.0/24",   # backend network (mcp-server, etc.)
  # add new subnet here
]

Then restart SearxNG.
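Before editing the file, it can help to confirm that the caller's address really falls outside the listed subnets. A minimal check with the standard `ipaddress` module follows. The two subnets are the ones shown above, while `is_passed` and the sample addresses are hypothetical.

```python
import ipaddress

# Subnets currently listed in pass_ip (from this guide)
PASS_IP = ["10.10.0.0/24", "10.10.1.0/24"]


def is_passed(caller_ip, pass_ip=PASS_IP):
    """Return True if caller_ip falls inside any pass_ip subnet."""
    address = ipaddress.ip_address(caller_ip)
    return any(
        address in ipaddress.ip_network(subnet, strict=False)
        for subnet in pass_ip
    )


# Hypothetical caller addresses taken from the 429 log lines
for ip in ("10.10.1.17", "172.18.0.5"):
    print(ip, "passes limiter" if is_passed(ip) else "is rate-limited")
```

An address reported as rate-limited here is the one whose subnet needs to be added.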

Tor sidecar not routing (arxiv returning TooManyRequests)

Symptom: docker exec searxng wget -qO- http://localhost:8080/stats/errors shows arxiv with errors.

Check: Is searxng-tor healthy?

docker ps --filter "name=searxng-tor" --format "{{.Status}}"
# Expected: Up (healthy)
docker logs --tail 30 searxng-tor
# Expected: "Bootstrapped 100% (done)" near the bottom

Fix — rotate circuit (try a different exit node):

docker exec searxng-tor pkill -HUP tor
# SIGHUP makes Tor reload its configuration; if the exit IP does not change, use the restart below
# Wait 10–15 seconds, then check logs again
docker logs --tail 20 searxng-tor

Fix — restart Tor sidecar:

docker compose -f /mnt/apps/docker/searxng/docker-compose.yml restart tor

Verify Tor exit IP is different from host:

# From inside searxng container using httpx_socks (same transport SearxNG uses)
docker exec searxng python3 -c "
import httpx
import httpx_socks
with httpx.Client(transport=httpx_socks.SyncProxyTransport.from_url('socks5h://tor:9050')) as c:
    print(c.get('https://api.ipify.org').text)
"
# Should print a Tor exit IP, NOT 99.148.230.21 (host IP)

Authentik returning 404 for external access

Symptom: curl -sI https://search.haiven.site returns HTTP 404 with x-powered-by: authentik.

Current status: This is expected behavior as of 2026-05-02. Authentik middleware is in the chain (confirmed working) but no application provider is configured for search.haiven.site. The service is protected — unauthenticated external users cannot reach SearxNG. A follow-up task is to configure the Authentik provider, which will change the 404 to a proper SSO redirect.

Internal callers are unaffected. They use http://searxng:8080 on the backend network and bypass Traefik entirely.
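For automated external checks, the distinction above can be encoded as a small classifier over the status code and headers returned by `curl -sI`. This is a sketch under assumptions: the function name and the result strings are hypothetical, and treating any 3xx with a `Location` header as the future SSO redirect is a guess at what the configured provider will return.

```python
def classify_external_response(status, headers):
    """Classify an unauthenticated response from the external URL."""
    lowered = {key.lower(): value for key, value in headers.items()}
    powered_by = lowered.get("x-powered-by", "").lower()
    if status == 404 and powered_by == "authentik":
        return "expected: authentik in chain, no provider configured"
    if 300 <= status < 400 and "location" in lowered:
        # Assumption: the SSO redirect will appear as a 3xx with Location
        return "redirect: possibly SSO once a provider is configured"
    if status == 200:
        return "unexpected: reachable without authentication"
    return "unknown: investigate"


print(classify_external_response(404, {"X-Powered-By": "authentik"}))
```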

Metrics exporter not scraping (searxng_scrape_success = 0)

Check: Is searxng-metrics running?

docker ps --filter "name=searxng-metrics" --format "{{.Status}}"
docker logs --tail 50 searxng-metrics

Check: Can the exporter reach searxng?

docker exec searxng-metrics curl -s http://searxng:8080/stats/errors | head -20

Check: Is Prometheus discovering the target?

# Query Prometheus targets API
curl -s http://prometheus:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job == "docker-auto-discovery") | select(.labels.instance | contains("searxng-metrics"))'

SearxNG won't start (settings.yml parse error)

Symptom: Container exits immediately after start.

Check:

docker logs searxng 2>&1 | grep -iE 'error|invalid|yaml|parse'

Common causes:
- YAML indentation error (tabs vs spaces)
- Boolean value as string (use true/false not "true"/"false")
- Invalid engine name in engines: block
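The first two causes can be spotted with a line-based check before restarting. The sketch below is a heuristic that uses only the standard library and is not a YAML parser. It flags tab characters in indentation and values written as quoted "true"/"false". A quoted boolean is occasionally intentional, so treat the findings as items to review. The helper name `lint_settings` is hypothetical, and engine names are not validated.

```python
import re

# Matches values such as: key: "true"   or   key: 'False'  # comment
_QUOTED_BOOL = re.compile(r""":\s*(['"])(true|false)\1\s*(#.*)?$""", re.IGNORECASE)


def lint_settings(text):
    """Return [(line_number, problem)] for suspicious lines in settings.yml."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append((lineno, "tab character in indentation"))
        if _QUOTED_BOOL.search(line):
            problems.append((lineno, "boolean written as a quoted string"))
    return problems


# Hypothetical fragment containing both mistakes
sample = 'server:\n\tport: 8080\n  limiter: "true"\n  public_instance: false\n'
for lineno, problem in lint_settings(sample):
    print(f"line {lineno}: {problem}")
```

To check the real file, pass in the contents of the host copy, for example `lint_settings(open(path).read())` with `path` pointing at settings.yml.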

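Fix: Edit the host copy of the file, then restart. settings.yml is owned by uid 977 inside the container, so the corrected file is written through the container. Note that docker exec only works while the container is running.

# Write the corrected file through the container (tee's echo of the file is discarded)
docker exec -i searxng tee /etc/searxng/settings.yml < /mnt/apps/docker/searxng/config/settings.yml > /dev/null
docker compose -f /mnt/apps/docker/searxng/docker-compose.yml restart searxng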

Checking Valkey/Redis cache

docker exec searxng-redis redis-cli ping
# Expected: PONG

docker exec searxng-redis redis-cli info server | grep redis_version

Additional Resources