Autonomous web research agent with iterative LLM synthesis, work-hub task integration, and knowledge base auto-ingest.
| Interface | URL | Purpose |
|---|---|---|
| Web UI | https://research.haiven.site | Interactive research interface |
| API | http://localhost:8010 | Direct REST access |
| API Docs | https://research.haiven.site/docs | Swagger UI |
| WebSocket | wss://research.haiven.site/ws/research/{session_id} | Live progress stream |
| Langfuse | https://ai-ops.haiven.site | Traces and observability |
curl -X POST http://localhost:8010/research \
  -H "Content-Type: application/json" \
  -d '{
    "query": "RAGAS evaluation framework comparison 2025",
    "max_iterations": 5
  }'
Response:
{
  "session_id": "abc123-def456",
  "status": "pending",
  "ws_url": "/ws/research/abc123-def456",
  "message": "Research session started"
}
curl -X POST http://localhost:8010/research \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Kubernetes autoscaling best practices",
    "max_iterations": 3,
    "model": "qwen3-30b-a3b-q8-abl",
    "auto_approve": true,
    "domains": {
      "exclude": ["pinterest.com", "quora.com"]
    },
    "force": true
  }'
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Research question (3–1000 chars) |
| max_iterations | int | 5 | Max research rounds (1–10) |
| model | string | null | LiteLLM model name (null = default) |
| auto_approve | bool | false | Auto-approve follow-up questions |
| domains.exclude | array | [] | Domains to exclude from search results |
| force | bool | false | Skip deduplication check |
| task_id | string | null | Work-hub task UUID for artifact write-back |
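The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: `buildResearchRequest` is a hypothetical helper, and leaving out `model` and `task_id` when they are null is an assumption about what the API accepts.

```javascript
// Hypothetical client-side helper: applies the documented defaults and
// rejects values outside the documented ranges before calling POST /research.
function buildResearchRequest(options) {
  const {
    query,
    max_iterations = 5,
    model = null,
    auto_approve = false,
    exclude = [],
    force = false,
    task_id = null,
  } = options;

  if (typeof query !== 'string' || query.length < 3 || query.length > 1000) {
    throw new RangeError('query must be a string of 3 to 1000 characters');
  }
  if (!Number.isInteger(max_iterations) || max_iterations < 1 || max_iterations > 10) {
    throw new RangeError('max_iterations must be an integer from 1 to 10');
  }

  const body = { query, max_iterations, auto_approve, domains: { exclude }, force };
  // Assumption: null-valued optional fields can simply be omitted.
  if (model !== null) body.model = model;
  if (task_id !== null) body.task_id = task_id;
  return body;
}
```

The returned object can be passed through `JSON.stringify` and used as the request body in place of the hand-written JSON in the curl examples.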
When task_id is provided, research outputs are automatically written back to the work-hub task on completion.
# 1. Create a work-hub task
TASK_ID=$(curl -s -X POST http://localhost:8030/api/v1/tasks \
  -H "Content-Type: application/json" \
  -d '{"title": "Research: RAGAS evaluation frameworks", "status": "queued"}' | jq -r '.id')
# 2. Start research tied to the task
curl -X POST http://localhost:8010/research \
  -H "Content-Type: application/json" \
  -d "{
    \"query\": \"RAGAS evaluation frameworks for RAG pipelines\",
    \"task_id\": \"${TASK_ID}\",
    \"auto_approve\": true,
    \"max_iterations\": 3
  }"
# 3. After completion, verify artifact in work-hub
curl -s "http://localhost:8030/api/v1/tasks/${TASK_ID}" | jq '.context'
Work-hub tasks with "Research:" in the title and status=queued are automatically dispatched to the research agent by the work-hub dispatcher (polls every 60 seconds). No manual API call needed.
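To check in advance whether a task would be picked up, the dispatch rule can be written as a predicate. This is only a sketch of the rule as stated: `isAutoDispatched` is a hypothetical name, and the case-sensitive substring match is an assumption, since the dispatcher's exact matching is not described here.

```javascript
// Hypothetical predicate mirroring the documented rule:
// title contains "Research:" and status is "queued".
// Assumption: case-sensitive substring match anywhere in the title.
function isAutoDispatched(task) {
  return typeof task.title === 'string'
    && task.title.includes('Research:')
    && task.status === 'queued';
}
```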
When a session reaches COMPLETED, the following artifact is written to the work-hub task:
{
  "type": "research_output",
  "version": 1,
  "session_id": "<uuid>",
  "query": "<original query>",
  "sources": ["<url1>", "<url2>", "..."],
  "summary": "<final synthesized answer>",
  "created_at": "<ISO8601>"
}
curl http://localhost:8010/research/{session_id}
Response:
{
  "session_id": "abc123",
  "query": "RAGAS evaluation frameworks",
  "status": "synthesizing",
  "iteration": 2,
  "max_iterations": 5,
  "open_questions": null,
  "results": null,
  "created_at": "2026-02-25T10:00:00Z",
  "updated_at": "2026-02-25T10:02:30Z"
}
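Where a WebSocket connection is not practical, the status endpoint can be polled until the session finishes. A minimal sketch, under assumptions: `waitForSession` and `isTerminal` are hypothetical helpers, the terminal status strings are assumed to be lowercase `completed` and `failed` (matching the lowercase values in the responses shown here), and the interval and attempt limits are arbitrary.

```javascript
// Assumed terminal statuses, after which a session no longer changes.
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

function isTerminal(status) {
  return TERMINAL_STATUSES.has(String(status).toLowerCase());
}

// Poll until the session is terminal or the attempts run out.
// getStatus is any async function that returns the session JSON, e.g.
//   () => fetch(`http://localhost:8010/research/${id}`).then((r) => r.json())
async function waitForSession(getStatus, { intervalMs = 5000, maxAttempts = 120 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const session = await getStatus();
    if (isTerminal(session.status)) return session;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`session still running after ${maxAttempts} polls`);
}
```

With auto_approve=false a session can sit waiting for approval, so a caller would likely also want to inspect `open_questions` on each poll rather than wait for the attempt limit.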
Connect for real-time events:
const ws = new WebSocket('wss://research.haiven.site/ws/research/' + sessionId);
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  // Event types:
  //   status_change:    pipeline state changed
  //   search_complete:  web search finished
  //   crawl_progress:   crawling X of Y URLs
  //   synthesis_update: partial synthesis available
  //   questions_ready:  open questions need approval
  //   completed:        research finished
  //   error:            something went wrong
  console.log(data.event, data);
};
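In practice each event type usually gets its own handler rather than a single `console.log`. One way to route them is sketched here; `dispatchEvent` and the `fallback` key are hypothetical, and the only assumption about the payload is that the type arrives in `data.event`, as in the snippet above.

```javascript
// Route a parsed message to the handler registered for its event type.
// Returns true when a specific handler ran, false otherwise.
function dispatchEvent(data, handlers) {
  const hasOwn = Object.prototype.hasOwnProperty;
  if (hasOwn.call(handlers, data.event) && typeof handlers[data.event] === 'function') {
    handlers[data.event](data);
    return true;
  }
  // Optional catch-all for event types without a dedicated handler.
  if (typeof handlers.fallback === 'function') handlers.fallback(data);
  return false;
}

// Usage with the socket from the snippet above:
//   ws.onmessage = (event) => dispatchEvent(JSON.parse(event.data), {
//     questions_ready: (d) => console.log('approval needed', d),
//     completed: (d) => console.log('done', d),
//     error: (d) => console.error('failed', d),
//   });
```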
PENDING
-> SEARCHING (querying SearXNG for URLs)
-> CRAWLING (extracting content via Crawl4AI)
-> CLEANING (deduplication and content processing)
-> SYNTHESIZING (LLM generating answer from sources)
-> VALIDATING (checking if more iterations needed)
-> [AWAITING_APPROVAL] (if auto_approve=false)
-> COMPLETED (research finished, artifacts written)
-> FAILED (unrecoverable error)
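For a progress indicator, the stage order above can be turned into a rough fraction. This is a sketch only: `stageProgress` is a hypothetical helper, the stage names are taken from the listing, and treating the pipeline as a single linear pass is an assumption (a session running several iterations presumably revisits the earlier stages, so the value can move backwards).

```javascript
// Stage order as listed above; AWAITING_APPROVAL and FAILED are left out
// because they are not points along the normal path.
const PIPELINE_STAGES = [
  'PENDING', 'SEARCHING', 'CRAWLING', 'CLEANING',
  'SYNTHESIZING', 'VALIDATING', 'COMPLETED',
];

// Returns a value from 0 to 1, or null for a status that is not on the main path.
function stageProgress(status) {
  const index = PIPELINE_STAGES.indexOf(String(status).toUpperCase());
  if (index === -1) return null;
  return index / (PIPELINE_STAGES.length - 1);
}
```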
When validation determines that more information is needed and auto_approve=false, the session waits in AWAITING_APPROVAL. Approve the follow-up questions to continue:
curl -X POST http://localhost:8010/research/{session_id}/approve \
  -H "Content-Type: application/json" \
  -d '{
    "approved_questions": [
      "What chunk sizes work best for RAG?",
      "How does RAGAS handle multi-hop reasoning?"
    ]
  }'
The CLI runs inside the container:
# Add alias for convenience
alias research='docker exec -it research-agent research-cli'
# Basic query (auto-approve, 3 iterations)
research query "Vector database comparison 2025" --auto-approve --max-iterations 3
# Full options
research query "RAG best practices" \
  --max-iterations 5 \
  --model qwen3-30b-a3b-q8-abl \
  --auto-approve \
  --output markdown \
  --exclude pinterest.com
# List sessions
research list --status completed --limit 10
# Get a specific session
research get <session_id>
# Find similar past research
research similar "How does RAG work?"
# Cleanup old sessions
research cleanup --days 30 --status failed
# All sessions (default: last 20, sorted by newest)
curl http://localhost:8010/research
# Filter by status
curl "http://localhost:8010/research?status=completed&limit=10"
# With pagination
curl "http://localhost:8010/research?offset=20&limit=20"
Before starting new research, check if similar research exists:
curl -X POST http://localhost:8010/research/similar \
-H "Content-Type: application/json" \
-d '{"query": "RAGAS evaluation frameworks", "limit": 5}'
If similarity > 0.85 (the deduplication threshold), the POST /research endpoint returns existing sessions instead of starting new research. Use force: true to override.
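The threshold check can be illustrated with cosine similarity over query embeddings. This is an assumption made for illustration: the text here does not say which similarity measure or embedding model the service uses, and `cosineSimilarity` and `isDuplicate` are hypothetical names.

```javascript
const DEDUP_THRESHOLD = 0.85; // the documented deduplication threshold

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error('vectors must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A new query counts as a duplicate only above the threshold and without force.
function isDuplicate(newEmbedding, existingEmbedding, force = false) {
  return !force && cosineSimilarity(newEmbedding, existingEmbedding) > DEDUP_THRESHOLD;
}
```

The comparison is strict, so a similarity of exactly 0.85 would not count as a duplicate under this sketch.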
# Recent completed sessions
curl "http://localhost:8010/history?status=completed&limit=20"
# Search by query text
curl "http://localhost:8010/history?query_text=RAGAS"
# Get details
curl http://localhost:8010/history/{session_id}
# Dry run — see what would be deleted
curl -X POST "http://localhost:8010/history/cleanup?days=30&dry_run=true"
# Delete sessions older than 30 days
curl -X POST "http://localhost:8010/history/cleanup?days=30&dry_run=false"
# Delete only failed sessions older than 7 days
curl -X POST "http://localhost:8010/history/cleanup?days=7&status=failed&dry_run=false"
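Because cleanup deletes data, a script that builds these URLs is safer if it defaults to a dry run. A minimal sketch: `cleanupUrl` is a hypothetical helper, the query parameters are the ones shown in the commands above, and the positive-integer check on `days` is the helper's own guard rather than a documented API rule.

```javascript
// Build a cleanup URL; deletion only happens when dryRun is explicitly false.
function cleanupUrl({ days, status, dryRun = true }, base = 'http://localhost:8010') {
  if (!Number.isInteger(days) || days < 1) {
    throw new RangeError('days must be a positive integer');
  }
  const params = new URLSearchParams({ days: String(days) });
  if (status) params.set('status', status);
  params.set('dry_run', String(dryRun));
  return `${base}/history/cleanup?${params}`;
}
```

The result can be passed to `fetch(url, { method: 'POST' })` or pasted into a curl command.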
Store credentials for sites requiring login to enable authenticated crawling.
# Store credentials
curl -X POST http://localhost:8010/credentials \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "example.com",
    "username": "user@example.com",
    "password": "secret123",
    "credential_type": "basic"
  }'
# List domains with stored credentials
curl http://localhost:8010/credentials
# Check if credentials exist
curl http://localhost:8010/credentials/example.com/check
# Delete credentials
curl -X DELETE http://localhost:8010/credentials/example.com
Credentials are encrypted with Fernet (AES-128-CBC with HMAC-SHA256 authentication) and stored in Redis with a 24-hour TTL. Passwords are never written to logs or to Langfuse traces.
All completed research outputs are automatically ingested into haiven-knowledge under source_application=research_agent. Search them:
# Search all research outputs
curl -X POST http://localhost:8022/v1/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "RAGAS evaluation",
    "source_application": "research_agent",
    "limit": 5
  }'
Research outputs appear in the knowledge base within seconds of the COMPLETED transition.
Effective queries are specific and research-oriented:
| Good | Less Effective |
|---|---|
| "Best practices for RAG chunking strategies with pgvector 2025" | "Tell me about AI" |
| "Compare KEDA vs HPA for Kubernetes autoscaling workloads" | "Kubernetes scaling" |
| "How to implement semantic caching with Redis for LLM APIs" | "LLM caching" |
| Use Case | max_iterations | auto_approve |
|---|---|---|
| Quick overview | 1–2 | true |
| Thorough research | 5 | false |
| Deep dive | 10 | false |
| Automated pipeline / work-hub | 3–5 | true |
Common exclusions for cleaner results:
{
  "domains": {
    "exclude": ["pinterest.com", "quora.com", "twitter.com", "facebook.com"]
  }
}
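To preview which URLs an exclusion list would drop, a hostname check like the following can be used. It is a sketch under an assumption: that an excluded domain also covers its subdomains (so `www.pinterest.com` matches `pinterest.com`). How the service itself matches domains is not stated here, and `isExcluded` is a hypothetical name.

```javascript
// True when the URL's host equals an excluded domain or is a subdomain of one.
function isExcluded(url, excludedDomains) {
  const host = new URL(url).hostname.toLowerCase();
  return excludedDomains.some((domain) => {
    const d = domain.toLowerCase();
    return host === d || host.endsWith(`.${d}`);
  });
}
```

Matching on a leading dot keeps unrelated hosts such as `notpinterest.com` from being caught by a plain suffix comparison.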
The service detected an existing session with > 85% semantic similarity. Either use the existing result or force new research:
curl -X POST http://localhost:8010/research \
-H "Content-Type: application/json" \
-d '{"query": "...", "force": true}'
# Check SearXNG health
docker exec research-agent curl -sf http://searxng:8080/healthz
# Check network connectivity
docker exec research-agent ping -c 2 searxng
# View logs
docker logs research-agent --tail 50
# Check Crawl4AI
docker exec research-agent curl -sf http://crawl4ai:11235/health
# Target sites may rate-limit Crawl4AI; this is normal at high iteration counts
Use a faster model or reduce iteration count:
curl -X POST http://localhost:8010/research \
-H "Content-Type: application/json" \
-d '{"query": "...", "model": "qwen3-30b-a3b-q8-abl", "max_iterations": 2}'
# Verify work-hub is reachable from research-agent container
docker exec research-agent curl -sf http://work-hub:8030/health
# Check research-agent logs for artifact write-back errors
docker logs research-agent --tail 30 | grep -i "workhub\|task_id\|artifact"
curl http://localhost:8010/health
# Expected: {"status": "healthy", "checks": {"database": "ok", "data_dir": "ok", "cache_dir": "ok"}}
# Readiness probe
curl http://localhost:8010/health/ready
# Liveness probe
curl http://localhost:8010/health/live
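A monitoring script can reduce the health response to a pass/fail result plus the names of any failing checks. A minimal sketch based on the expected response shown above; `failingChecks` and `isHealthy` are hypothetical helpers, and treating any check value other than "ok" as a failure is an assumption.

```javascript
// Names of checks whose value is anything other than "ok".
function failingChecks(health) {
  const checks = health.checks || {};
  return Object.keys(checks).filter((name) => checks[name] !== 'ok');
}

// Healthy means the overall status is "healthy" and no individual check fails.
function isHealthy(health) {
  return health.status === 'healthy' && failingChecks(health).length === 0;
}

// Usage (fetch is global in browsers and Node 18+):
//   const health = await fetch('http://localhost:8010/health').then((r) => r.json());
//   if (!isHealthy(health)) console.error('failing checks:', failingChecks(health));
```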
View full trace trees at https://ai-ops.haiven.site:
1. Navigate to Traces
2. Filter by service tag: research-agent
3. Use task_id tag to correlate with work-hub tasks