Integration-first workspace connecting meeting transcription, email ingestion, document import, task management, AI-assisted drafting, calendar management, and automated research dispatch into a unified pipeline. Central task hub for all Haiven agents.
| Property | Value |
|---|---|
| Domain | work.haiven.site |
| Backend Port | 8030 |
| Frontend Port | 3025 |
| Source | /mnt/apps/src/work-hub/ |
| Docker | /mnt/apps/docker/ai/work-hub/ |
| API Docs | https://work.haiven.site/api/docs |
| Category | AI — Workflow & Productivity |
| Tier | 2 (FastAPI with auto-generated OpenAPI spec) |
Work Hub closes the loop between meetings, emails, deliverables, calendar, and automated research:
- Automated research dispatch for any task that is queued and has "Research:" in its title

work.haiven.site (Traefik HTTPS)
│
├── work-hub-frontend (React 19 SPA, nginx, port 3025)
│   └── /api → reverse proxy to backend
│
└── work-hub (FastAPI, port 8030)
    ├── work-hub-db (PostgreSQL 16, port 5437)
    ├── qdrant:6333 (documents collection, 2560d, Cosine)
    ├── litellm:4000 (qwen3-embedding-4b + glm-4-7-flash)
    ├── haiven-knowledge:8022 (email send KB logging)
    ├── meeting-scribe:5010 (webhook source)
    └── research-agent:8000 (auto-dispatch via ResearchDispatcher)
Three containers: work-hub (FastAPI backend), work-hub-frontend (React SPA via nginx), work-hub-db (PostgreSQL 16)
Networks: web (frontend via Traefik), backend (all three containers)
GPU: None — embedding and LLM calls route through LiteLLM
- Task status FSM: open → in_progress → done/blocked/archived, queued → context_ready → in_progress, blocked → in_progress.
- Change history: each task keeps a history JSONB array with timestamp, actor, and from/to values.
- Voice instructions: appended via PATCH /api/v1/tasks/{id}/voice-instructions; each entry carries a processed: false flag for agent workflows.
- Artifacts: appended via PATCH /api/v1/tasks/{id}/artifacts.
- Research auto-dispatch: a background dispatcher (ResearchDispatcher) checks every 60 seconds for tasks with status=queued and "research" in the title. It automatically transitions queued → context_ready → in_progress and POSTs to research-agent with the task ID.
- AI drafting: glm-4-7-flash generates a versioned draft.
- Email send: replies are threaded via thread_id. Each send is logged to the haiven-knowledge KB. An optional EMAIL_SIGNATURE is appended automatically.
- Health: the /health endpoint checks PostgreSQL, Qdrant, and LiteLLM connectivity.

| Format | Extractor | Chunking Strategy |
|---|---|---|
| PDF | pypdf (page markers + PDF metadata for title); Tesseract OCR fallback for scanned pages | Heading-based (##) or paragraph fallback |
| DOCX | python-docx (heading styles → markdown markers) | Heading-based (##) or paragraph fallback |
| EML | stdlib email parser (From/To/Cc/Subject/Date + 14 metadata fields) | Header chunk + body chunks |
| HTML | html2text (markdown conversion, preserves headings) | Heading-based (##) or paragraph fallback |
| CSV | stdlib csv (column headers + rows) | Row groups of 50, headers repeated per chunk |
| ICS | icalendar (event summary, description, attendees, dates) | One chunk per calendar event |
| Markdown | — | Split on ## headers (max 3000 chars/chunk) |
| Plain text | — | Paragraph-based (\n\n), sentence-level fallback |
| PST | readpst (extracts all contained .eml and .ics files) | Per-message, same as EML/ICS |
| Audio/Video | haiven-transcribe (Canary/Parakeet/Whisper Turbo tri-engine) | Meeting-style chunking post-transcription |
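
The CSV row in the table above (row groups of 50, headers repeated per chunk) can be sketched as follows. This is a minimal illustration using only the standard library, not Work Hub's actual extractor; the function name `chunk_csv` and the choice to emit each chunk as CSV text are assumptions.

```python
import csv
import io


def chunk_csv(text: str, rows_per_chunk: int = 50) -> list[str]:
    """Split CSV text into chunks of at most rows_per_chunk data rows.

    The header row is repeated at the top of every chunk so that each
    chunk stays interpretable on its own. Hypothetical helper.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    chunks = []
    for start in range(0, len(data), rows_per_chunk):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)  # repeat headers per chunk
        writer.writerows(data[start:start + rows_per_chunk])
        chunks.append(buf.getvalue())
    return chunks
```

A 120-row file would yield three chunks (50, 50, and 20 data rows), each beginning with the header line.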
| Service | URL | Purpose |
|---|---|---|
| PostgreSQL 16 | work-hub-db:5432 | Task, meeting, draft, and email sync state storage |
| Qdrant | http://qdrant:6333 | Vector search (documents collection) |
| LiteLLM | http://litellm:4000 | Embeddings (qwen3-embedding-4b) + drafting (glm-4-7-flash) |
| Meeting Scribe | http://meeting-scribe:5010 | Optional: sends approved meetings via webhook |
| haiven-transcribe | http://haiven-transcribe:8000 | Optional: audio/video transcription |
| research-agent | http://research-agent:8000 | Optional: receives auto-dispatched research tasks |
| haiven-knowledge | http://haiven-knowledge:8022 | Optional: email send KB logging |
All environment variables use the WH_ prefix, except Microsoft Graph and email signature vars.
| Variable | Default | Description |
|---|---|---|
| WH_DATABASE_URL | — | postgresql+asyncpg://... async PostgreSQL URL |
| WH_QDRANT_URL | http://qdrant:6333 | Qdrant server URL |
| WH_QDRANT_COLLECTION | documents | Qdrant collection name |
| WH_LITELLM_URL | http://litellm:4000 | LiteLLM proxy URL |
| WH_LITELLM_API_KEY | — | LiteLLM API key |
| WH_EMBEDDING_MODEL | qwen3-embedding-4b | 2560-dimension embedding model |
| WH_EMBEDDING_DIMENSIONS | 2560 | Vector dimensions |
| WH_DRAFT_MODEL | hermes-4.3-36b | LLM model for AI draft generation |
| WH_SCRIBE_URL | http://meeting-scribe:5010 | Meeting Scribe service URL |
| WH_WEBHOOK_SECRET | — | HMAC-SHA256 secret for webhook verification |
| Variable | Default | Description |
|---|---|---|
| WH_TRANSCRIBE_URL | http://haiven-transcribe:8000 | Transcription service URL |
| WH_TRANSCRIBE_TIMEOUT | 600 | Transcription timeout in seconds |
| Variable | Default | Description |
|---|---|---|
| WH_RESEARCH_URL | http://research-agent:8000 | Research agent base URL |
| WH_RESEARCH_API_KEY | — | Bearer token for research-agent (optional) |
The ResearchDispatcher polls every 60 seconds for status=queued tasks with "research" in the title. It transitions the task queued → context_ready → in_progress and POSTs to {WH_RESEARCH_URL}/research with {"query": "<title without prefix>", "task_id": "<uuid>", "auto_approve": true}.
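
The selection rule and request body described above can be sketched in a few lines. This is an illustration only: `is_dispatchable` and `build_research_payload` are hypothetical names, and the exact prefix-stripping rule (here, a leading "Research:" matched case-insensitively) is an assumption rather than something the source specifies.

```python
import re

# Assumption: the prefix is the word "Research" followed by a colon at the
# start of the title. Work Hub's real stripping rule may differ.
_PREFIX = re.compile(r"^\s*research\s*:\s*", re.IGNORECASE)


def is_dispatchable(status: str, title: str) -> bool:
    """True for tasks the dispatcher would pick up on its next poll."""
    return status == "queued" and "research" in title.lower()


def build_research_payload(title: str, task_id: str) -> dict:
    """Build the JSON body POSTed to {WH_RESEARCH_URL}/research."""
    query = _PREFIX.sub("", title).strip()
    return {"query": query, "task_id": task_id, "auto_approve": True}
```

Titles that contain "research" without the prefix pass through unchanged as the query.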
| Variable | Default | Description |
|---|---|---|
| WH_IMAP_ENABLED | false | Enable IMAP email connector (feature flag) |
| WH_IMAP_HOST | — | IMAP server hostname (e.g. imap.gmail.com) |
| WH_IMAP_PORT | 993 | IMAP port (993=SSL, 143=STARTTLS) |
| WH_IMAP_USERNAME | — | IMAP login username (usually email address) |
| WH_IMAP_PASSWORD | — | IMAP password (SecretStr, masked in logs and API responses) |
| WH_IMAP_USE_SSL | true | Use IMAP4_SSL (true) or IMAP4 with STARTTLS (false) |
| WH_IMAP_FOLDERS | INBOX | Comma-separated folder names to sync |
| WH_IMAP_POLL_INTERVAL | 300 | Seconds between incremental syncs |
| WH_IMAP_BATCH_SIZE | 50 | Max UIDs per IMAP FETCH batch |
| WH_IMAP_MAX_BACKFILL_DAYS | 30 | Maximum days per backfill request |
Both email send and calendar management share one Azure app registration. The OAuth2 refresh token flow is used (no interactive login required at runtime).
| Variable | Default | Description |
|---|---|---|
| CONN_EMAIL_OAUTH2_CLIENT_ID | — | Azure AD app client ID |
| CONN_EMAIL_OAUTH2_CLIENT_SECRET | — | Azure AD app client secret |
| CONN_EMAIL_OAUTH2_TENANT_ID | — | Azure AD tenant ID |
| CONN_EMAIL_OAUTH2_REFRESH_TOKEN | — | Long-lived OAuth2 refresh token (Mail.Send + Calendars.ReadWrite scopes) |
| EMAIL_SIGNATURE | — | Optional plain-text signature appended to every outbound email |
Required Azure AD scopes: https://graph.microsoft.com/Mail.Send, https://graph.microsoft.com/Calendars.ReadWrite, offline_access
When these variables are absent, the email send and calendar endpoints return 503 Service Unavailable.
Secrets are stored in /mnt/apps/docker/ai/work-hub/.env.
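
For orientation, the refresh-token exchange mentioned above looks roughly like this against the Microsoft identity platform v2.0 token endpoint. This is a sketch, not Work Hub's code: `graph_configured` and `build_token_request` are hypothetical helpers, and the request is only constructed here, never sent.

```python
from urllib.parse import urlencode

REQUIRED_VARS = (
    "CONN_EMAIL_OAUTH2_CLIENT_ID",
    "CONN_EMAIL_OAUTH2_CLIENT_SECRET",
    "CONN_EMAIL_OAUTH2_TENANT_ID",
    "CONN_EMAIL_OAUTH2_REFRESH_TOKEN",
)

GRAPH_SCOPES = (
    "https://graph.microsoft.com/Mail.Send "
    "https://graph.microsoft.com/Calendars.ReadWrite "
    "offline_access"
)


def graph_configured(env: dict) -> bool:
    """False when any variable is absent (the case that yields a 503)."""
    return all(env.get(name) for name in REQUIRED_VARS)


def build_token_request(env: dict) -> tuple[str, bytes]:
    """Return (url, form-encoded body) for a refresh_token grant."""
    tenant = env["CONN_EMAIL_OAUTH2_TENANT_ID"]
    url = f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
    form = {
        "grant_type": "refresh_token",
        "client_id": env["CONN_EMAIL_OAUTH2_CLIENT_ID"],
        "client_secret": env["CONN_EMAIL_OAUTH2_CLIENT_SECRET"],
        "refresh_token": env["CONN_EMAIL_OAUTH2_REFRESH_TOKEN"],
        "scope": GRAPH_SCOPES,
    }
    return url, urlencode(form).encode("ascii")
```

The body would be POSTed with `Content-Type: application/x-www-form-urlencoded`; the response carries a short-lived access token used as a Bearer token on Graph calls.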
Nine PostgreSQL tables (all primary keys are UUIDs, all timestamps are timezone-aware):
| Table | Purpose |
|---|---|
| companies | Client taxonomy (name unique, domain, ai_discovered flag) |
| projects | Projects per company (company FK, name, status) |
| tags | Classification tags (name unique, source) |
| tasks | Work items (title, assignee, status, priority, source, source_application, company/project FKs, context, due_date, voice_instructions JSONB[], artifacts JSONB[], history JSONB[]) |
| task_tags | Many-to-many task/tag junction |
| meetings | Meeting records (scribe_job_id unique, title, attendees JSON, notes_md, qdrant_document_id) |
| documents | Imported doc metadata (title, doc_type, source, source_file, embedding_status) |
| drafts | AI-generated drafts (task FK, content, model, context_chunks JSON, version) |
| email_sync_state | Per-folder IMAP sync state (account, folder, last_uid, uidvalidity, last_sync_at) |
Valid status transitions enforced by validate_status_transition():
open → in_progress → done
open → in_progress → blocked → in_progress
open → archived
queued → context_ready → in_progress
in_progress → done
in_progress → archived
done → archived
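
The transition list above can be expressed as a lookup table. The following is a minimal sketch of that idea; `ALLOWED_TRANSITIONS` and `is_valid_transition` are illustrative names, and neither the signature nor the behavior of Work Hub's own validate_status_transition() is claimed here. Treating archived as terminal is an assumption based on the list showing no transitions out of it.

```python
# Each key maps a current status to the statuses it may move to,
# derived edge by edge from the transition list above.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "open": {"in_progress", "archived"},
    "queued": {"context_ready"},
    "context_ready": {"in_progress"},
    "in_progress": {"done", "blocked", "archived"},
    "blocked": {"in_progress"},
    "done": {"archived"},
    "archived": set(),  # assumption: terminal state
}


def is_valid_transition(current: str, new: str) -> bool:
    """Return True if moving from current to new is a listed edge."""
    return new in ALLOWED_TRANSITIONS.get(current, set())
```

For example, `is_valid_transition("open", "done")` is False because a task has to pass through in_progress first.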
| Property | Value |
|---|---|
| Collection | documents |
| Dimensions | 2560 |
| Distance | Cosine |
| Quantization | INT8 scalar (quantile=0.99, always_ram=true) |
Payload fields: document_id, doc_type, source, company, project, topics[], tags[], title, content, attendees[], meeting_type, scribe_job_id, source_file, created_at, ingested_at, chunk_index, total_chunks, email_from, email_to, email_cc, email_subject, email_date, email_message_id, email_in_reply_to, email_references, email_folder, email_importance, email_has_attachments, email_attachment_count, email_attachment_names, email_list_unsubscribe, email_account, parent_email_message_id, calendar_uid, calendar_summary, calendar_start, calendar_event_count, calendar_attendees
Indexed for filtering: doc_type, source, company, project, topics, tags, meeting_type, scribe_job_id (keyword); created_at (datetime)
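
As a rough guide, the collection settings and payload indexes above correspond to the following Qdrant REST request bodies (PUT /collections/documents and PUT /collections/documents/index). This is a sketch built from the table values; the helper names are hypothetical and the requests are only constructed, not sent.

```python
KEYWORD_FIELDS = (
    "doc_type", "source", "company", "project",
    "topics", "tags", "meeting_type", "scribe_job_id",
)


def collection_config(dimensions: int = 2560) -> dict:
    """Body for creating the collection: Cosine vectors + INT8 scalar quantization."""
    return {
        "vectors": {"size": dimensions, "distance": "Cosine"},
        "quantization_config": {
            "scalar": {"type": "int8", "quantile": 0.99, "always_ram": True}
        },
    }


def index_requests() -> list[dict]:
    """One payload-index body per filterable field."""
    bodies = [{"field_name": f, "field_schema": "keyword"} for f in KEYWORD_FIELDS]
    bodies.append({"field_name": "created_at", "field_schema": "datetime"})
    return bodies
```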
43 endpoints across 9 groups:
| Group | Count | Description |
|---|---|---|
| Tasks | 10 | CRUD + voice-instructions + artifacts + history + AI draft generation + draft history |
| Meetings | 3 | List, detail, semantic search |
| Taxonomy | 13 | Full CRUD for companies, projects, tags |
| Import | 5 | Single document + file upload + audio transcription + directory batch + PST archive |
| Backfill | 1 | Ingest historical Meeting Scribe notes |
| Webhooks | 1 | Receive approved meetings (HMAC-verified) |
| Health | 1 | Dependency health checks |
| Email | 6 | IMAP sync, backfill, status, config, folder list + Graph send |
| Calendar | 3 | List events, create event, delete event (Microsoft Graph) |
Key endpoints:
GET /health # Service health (postgres, qdrant, litellm)
# Tasks
GET /api/v1/tasks # List tasks (status, priority, company_id, project_id, assignee, source, source_application)
POST /api/v1/tasks # Create task
GET /api/v1/tasks/{id} # Task detail
PATCH /api/v1/tasks/{id} # Update task (FSM-validated status transitions)
DELETE /api/v1/tasks/{id} # Hard-delete task
PATCH /api/v1/tasks/{id}/voice-instructions # Append voice instruction
PATCH /api/v1/tasks/{id}/artifacts # Append artifact reference
GET /api/v1/tasks/{id}/history # Full change history array
POST /api/v1/tasks/{id}/draft # Generate AI draft (RAG + glm-4-7-flash)
GET /api/v1/tasks/{id}/drafts # Draft history
# Meetings
GET /api/v1/meetings # List meetings
GET /api/v1/meetings/{id} # Meeting detail
POST /api/v1/meetings/search # Semantic search over meeting notes
# Taxonomy
GET /api/v1/companies # List companies
POST /api/v1/companies # Create company
GET /api/v1/companies/{id} # Company detail
PATCH /api/v1/companies/{id} # Update company
DELETE /api/v1/companies/{id} # Delete company
GET /api/v1/projects # List projects (filter by company_id, status)
POST /api/v1/projects # Create project
GET /api/v1/projects/{id} # Project detail
PATCH /api/v1/projects/{id} # Update project
DELETE /api/v1/projects/{id} # Delete project
GET /api/v1/tags # List tags
POST /api/v1/tags # Create tag
DELETE /api/v1/tags/{id} # Delete tag
# Import
POST /api/v1/import/document # Import single document (text/markdown)
POST /api/v1/import/upload # Multipart file upload (PDF, DOCX, EML, HTML, CSV, ICS, MD, TXT)
POST /api/v1/import/audio # Audio/video transcription import
POST /api/v1/import/directory # Import directory batch
POST /api/v1/import/pst # Outlook PST archive import
# Backfill
POST /api/v1/backfill/scribe-notes # Backfill Meeting Scribe notes
# Webhooks
POST /api/webhooks/scribe # Scribe webhook receiver (HMAC-verified)
# Email — IMAP connector
POST /api/v1/email/sync # Trigger immediate incremental IMAP sync
POST /api/v1/email/backfill # Date-range email backfill
GET /api/v1/email/status # Per-folder IMAP sync state
GET /api/v1/email/config # IMAP config with password masked
GET /api/v1/email/folders # Live IMAP folder list
POST /api/v1/email/send # Send email via Microsoft Graph (20/hr rate limit)
# Calendar — Microsoft Graph
GET /api/v1/calendar/events # List events in date range
POST /api/v1/calendar/events # Create calendar event (409 on conflict)
DELETE /api/v1/calendar/events/{id} # Delete calendar event
Full interactive docs at https://work.haiven.site/api/docs.
GET /api/v1/tasks accepts these query parameters for filtering:
| Parameter | Type | Description |
|---|---|---|
| status | string | Filter by task status (open, in_progress, done, queued, etc.) |
| priority | string | Filter by priority (low, medium, high, critical) |
| company_id | UUID | Filter to tasks for a specific company |
| project_id | UUID | Filter to tasks for a specific project |
| assignee | string | Filter by assignee name |
| source | string | Filter by source field (manual, agent, email, etc.) |
| source_application | string | Filter by source_application (briefing, research_agent, etc.) |
| page | int | Page number (default: 1) |
| page_size | int | Items per page (default: 20, max: 100) |
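
A client might assemble these filters as shown below. This is a small sketch: `tasks_url` is a hypothetical helper, and clamping page_size on the client side is a convenience assumption (the server's behavior for out-of-range values is not documented here).

```python
from urllib.parse import urlencode

BASE_URL = "https://work.haiven.site/api/v1/tasks"
FILTERS = {
    "status", "priority", "company_id", "project_id",
    "assignee", "source", "source_application",
}


def tasks_url(page: int = 1, page_size: int = 20, **filters: str) -> str:
    """Build a GET /api/v1/tasks URL from the documented query parameters."""
    unknown = set(filters) - FILTERS
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    params = {k: v for k, v in filters.items() if v is not None}
    params["page"] = max(1, page)
    params["page_size"] = min(max(1, page_size), 100)  # documented max is 100
    return f"{BASE_URL}?{urlencode(params)}"
```

For instance, `tasks_url(status="queued", source_application="research_agent")` lists queued tasks created by the research agent.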
For agent-to-work-hub write-back:
# Append a voice instruction to a task
PATCH /api/v1/tasks/{id}/voice-instructions
{"instruction": "Make the summary shorter and focus on action items"}
# Append an artifact (research output, briefing draft, etc.)
PATCH /api/v1/tasks/{id}/artifacts
{"type": "research_output", "path": "<session_id>"}
# Update task status and context (research agent write-back)
PATCH /api/v1/tasks/{id}
{"status": "done", "context": "<JSON summary from research>"}
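
The same write-back calls can be issued from an agent written in Python. The sketch below only builds the requests with the standard library; `patch_request`, `add_artifact`, and `mark_done` are hypothetical helper names, and it assumes no authentication header is needed, which the source does not confirm.

```python
import json
import urllib.request

BASE = "https://work.haiven.site/api/v1/tasks"


def patch_request(task_id: str, suffix: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON PATCH request for a task."""
    return urllib.request.Request(
        f"{BASE}/{task_id}{suffix}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


def add_artifact(task_id: str, session_id: str) -> urllib.request.Request:
    """Append a research_output artifact pointing at a session ID."""
    return patch_request(
        task_id, "/artifacts", {"type": "research_output", "path": session_id}
    )


def mark_done(task_id: str, summary: dict) -> urllib.request.Request:
    """Set status=done and store a JSON summary in the context field."""
    return patch_request(
        task_id, "", {"status": "done", "context": json.dumps(summary)}
    )
```

Passing the returned object to `urllib.request.urlopen()` would send it. Note that the status change is still subject to the FSM rules, so the task has to be in_progress before it can be marked done.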
# Create .env file
cat > /mnt/apps/docker/ai/work-hub/.env <<EOF
POSTGRES_USER=workhub
POSTGRES_PASSWORD=<strong-password>
POSTGRES_DB=workhub
WH_LITELLM_API_KEY=<litellm-api-key>
WH_WEBHOOK_SECRET=<random-32-char-hex>
# Research auto-dispatch
WH_RESEARCH_URL=http://research-agent:8000
# Optional: Microsoft Graph (email send + calendar)
# CONN_EMAIL_OAUTH2_CLIENT_ID=<azure-app-client-id>
# CONN_EMAIL_OAUTH2_CLIENT_SECRET=<azure-app-client-secret>
# CONN_EMAIL_OAUTH2_TENANT_ID=<azure-tenant-id>
# CONN_EMAIL_OAUTH2_REFRESH_TOKEN=<long-lived-refresh-token>
# EMAIL_SIGNATURE=Your Name | Title
# Optional: IMAP email connector
# WH_IMAP_ENABLED=true
# WH_IMAP_HOST=imap.gmail.com
# WH_IMAP_USERNAME=you@example.com
# WH_IMAP_PASSWORD=<app-password>
# WH_IMAP_FOLDERS=INBOX,Sent
EOF
# Start all three containers
cd /mnt/apps/docker/ai/work-hub
docker compose up -d
# Wait for DB to initialize, then backfill historical notes
sleep 10
curl -X POST https://work.haiven.site/api/v1/backfill/scribe-notes
cd /mnt/apps/docker/ai/work-hub
docker compose up -d # Start all containers
docker compose down # Stop (preserves DB volume)
docker compose restart work-hub # Restart backend only
cd /mnt/apps/docker/ai/work-hub
docker compose build work-hub --no-cache
docker compose up -d work-hub
docker logs -f work-hub # Backend
docker logs -f work-hub-frontend # Frontend (nginx)
docker logs -f work-hub-db # PostgreSQL
curl https://work.haiven.site/api/health
# Returns: {"status": "healthy", "postgres": true, "qdrant": true, "litellm": true, "version": "1.0.0"}
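
For scripted monitoring, the health payload shown above can be reduced to a list of failing dependencies. A minimal sketch, assuming the response keeps the three boolean keys from the example; `failed_dependencies` is a hypothetical name.

```python
import json

DEPENDENCIES = ("postgres", "qdrant", "litellm")


def failed_dependencies(payload: str) -> list[str]:
    """Return the names of dependencies reported as false or missing."""
    data = json.loads(payload)
    return [name for name in DEPENDENCIES if not data.get(name, False)]
```

An empty list means every dependency check passed.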
| Container | Memory Limit | Memory Reservation | CPU Limit |
|---|---|---|---|
| work-hub | 4G | 256M | 4 cores |
| work-hub-frontend | 2G | — | 1 core |
| work-hub-db | 1G | 512M | 2 cores |
Prometheus labels are set on the work-hub container:
prometheus.io/scrape: "true"
prometheus.io/port: "8030"
prometheus.io/path: "/metrics"
The /metrics endpoint exposes standard FastAPI/uvicorn process metrics.
Meeting Scribe sends approved meeting notes to Work Hub via HTTP webhook:
POST /api/webhooks/scribe
X-Hub-Signature-256: sha256=<hmac>
Content-Type: application/json
{
"version": 1,
"job_id": "uuid",
"title": "Meeting Title",
"notes_md": "## Agenda\n...",
"tasks": [],
"decisions": [],
"metadata": {}
}
The webhook handler verifies HMAC-SHA256 using WH_WEBHOOK_SECRET, then chunks and embeds the notes into Qdrant and records the meeting in PostgreSQL.
The ResearchDispatcher runs as a background asyncio task inside the work-hub process:
1. Every 60 seconds, selects tasks with status=queued and "research" in the title (case-insensitive).
2. Transitions each task queued → context_ready → in_progress (via direct DB update, respecting FSM rules).
3. POSTs the following JSON to {WH_RESEARCH_URL}/research:

   {
     "query": "RAGAS framework comparison",
     "task_id": "uuid",
     "auto_approve": true
   }

4. research-agent writes its results back via PATCH /api/v1/tasks/{id}/artifacts.

To trigger a research task manually:
curl -X POST https://work.haiven.site/api/v1/tasks \
-H "Content-Type: application/json" \
-d '{
"title": "Research: RAGAS framework comparison",
"status": "queued",
"source": "agent",
"source_application": "manual"
}'
# Within 60 seconds the task transitions to in_progress and research-agent begins
Work Hub integrates with Microsoft 365 via the Graph API for two capabilities: sending email and managing calendar events. Both use the same Azure AD app registration and the OAuth2 refresh token flow — no interactive login is required at runtime.
- Endpoint: POST /api/v1/email/send
- Threading: pass thread_id (a Message-ID) to set In-Reply-To and References headers
- Signature: configured via the EMAIL_SIGNATURE env var; appended to every outbound message automatically
- KB logging: each send is logged to haiven-knowledge (http://haiven-knowledge:8022/v1/ingest/text) as a fire-and-forget background task

curl -X POST https://work.haiven.site/api/v1/email/send \
-H "Content-Type: application/json" \
-d '{
"to": "colleague@example.com",
"subject": "Follow-up from today'\''s meeting",
"body": "Hi,\n\nJust following up on the action items..."
}'
- GET /api/v1/calendar/events?start=2026-03-01T00:00:00Z&end=2026-03-08T00:00:00Z: returns up to 50 events ordered by start time
- POST /api/v1/calendar/events: creates event with attendees; returns 409 Conflict if a scheduling conflict is detected
- DELETE /api/v1/calendar/events/{event_id}

curl -X POST https://work.haiven.site/api/v1/calendar/events \
-H "Content-Type: application/json" \
-d '{
"summary": "Sprint Planning",
"start": "2026-03-10T14:00:00Z",
"end": "2026-03-10T15:00:00Z",
"attendees": ["alice@example.com", "bob@example.com"],
"description": "Q1 sprint kickoff"
}'
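
How Work Hub decides that a scheduling conflict exists is not specified here. One plausible reading is a simple time-overlap test against existing events, sketched below under that assumption; `has_conflict` and `parse_utc` are hypothetical names.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z, e.g. 2026-03-10T14:00:00Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def has_conflict(start: str, end: str, existing: list[tuple[str, str]]) -> bool:
    """True if [start, end) overlaps any existing (start, end) pair.

    Assumption: back-to-back events (one ends exactly when the next
    begins) do not count as a conflict.
    """
    s, e = parse_utc(start), parse_utc(end)
    if e <= s:
        raise ValueError("end must be after start")
    return any(parse_utc(a) < e and s < parse_utc(b) for a, b in existing)
```

Under this reading, a client could list events for the target window first and run the check locally before attempting the POST.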
When WH_IMAP_ENABLED=true, Work Hub polls the configured IMAP mailbox in the background:
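- Incremental sync runs every WH_IMAP_POLL_INTERVAL seconds (default 5 minutes)
- Messages are deduplicated on Document.source_file, so re-syncing never creates duplicates
- Documents derived from a message reference it through parent_email_message_id

Email metadata endpoints return 503 Service Unavailable when WH_IMAP_ENABLED=false.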
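
An incremental sync built on the last_uid and uidvalidity columns of email_sync_state could plan its fetches as sketched below. This is an assumption about the approach, not a description of Work Hub's connector: the names are hypothetical, and the reset-on-UIDVALIDITY-change rule follows general IMAP semantics (UIDs are only meaningful within one UIDVALIDITY value).

```python
from dataclasses import dataclass


@dataclass
class FolderState:
    """Mirror of the per-folder sync columns (hypothetical shape)."""
    last_uid: int
    uidvalidity: int


def plan_fetch_batches(
    state: FolderState,
    server_uidvalidity: int,
    server_uids: list[int],
    batch_size: int = 50,  # WH_IMAP_BATCH_SIZE default
) -> list[list[int]]:
    """Return UID batches that still need fetching, oldest first."""
    # A changed UIDVALIDITY invalidates stored UIDs, so start from zero.
    floor = state.last_uid if state.uidvalidity == server_uidvalidity else 0
    new_uids = sorted(uid for uid in server_uids if uid > floor)
    return [
        new_uids[i:i + batch_size] for i in range(0, len(new_uids), batch_size)
    ]
```

After each batch is ingested, last_uid would advance to the highest UID in that batch so that an interrupted sync can resume where it stopped.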
# Trigger immediate incremental sync
curl -X POST https://work.haiven.site/api/v1/email/sync
# Backfill emails from the past 7 days
curl -X POST "https://work.haiven.site/api/v1/email/backfill?since_date=2026-02-12&before_date=2026-02-19"
# Check sync state per folder
curl https://work.haiven.site/api/v1/email/status
# List available folders on the IMAP server
curl https://work.haiven.site/api/v1/email/folders
# Check database is healthy
docker ps --filter name=work-hub-db --format "{{.Status}}"
# Check logs for startup errors
docker logs work-hub --tail 50
# Verify .env is present and has required secrets
grep -v PASSWORD /mnt/apps/docker/ai/work-hub/.env
The AI draft agent searches Qdrant for relevant content. If no context is found:
# Check collection has data
curl http://localhost:6333/collections/documents
# Verify embedding model is available in LiteLLM
curl http://localhost:4000/v1/models | grep qwen3-embedding
# Check dispatcher is running in logs
docker logs work-hub --tail 50 | grep -i "dispatcher"
# Verify task has status=queued and "research" in title
curl "https://work.haiven.site/api/v1/tasks?status=queued" | python3 -m json.tool
# Check WH_RESEARCH_URL is set
docker inspect work-hub | grep -A1 WH_RESEARCH
# Verify Graph OAuth2 credentials are set
docker inspect work-hub | grep -i "CONN_EMAIL"
# Check rate limiter — max 20 sends per hour
docker logs work-hub --tail 50 | grep -i "rate limit\|email"
# Verify the same CONN_EMAIL_OAUTH2_* vars are set (shared with email send)
docker inspect work-hub | grep -i "CONN_EMAIL_OAUTH2"
# Check file size (50 MB limit for documents, 500 MB for audio)
ls -lh /path/to/file
# Verify supported format
# Documents: .pdf .docx .eml .html .htm .csv .ics .md .txt
# Audio: .mp3 .m4a .wav .ogg .webm .mp4 .flac
# Check backend logs for extraction errors
docker logs work-hub --tail 50 | grep -i "upload\|extract"
# Verify readpst is installed inside the container
docker exec work-hub which readpst
# Check logs for extraction errors
docker logs work-hub --tail 50 | grep -i "pst\|readpst"
# Verify feature flag is enabled
grep WH_IMAP_ENABLED /mnt/apps/docker/ai/work-hub/.env
# Check sync state for errors
curl https://work.haiven.site/api/v1/email/status | python3 -m json.tool
# Check backend logs for IMAP errors
docker logs work-hub --tail 100 | grep -i "imap\|email"
Verify WH_WEBHOOK_SECRET matches the secret configured in Meeting Scribe. The signature is HMAC-SHA256 of the raw request body.
# Verify both services are on backend network
docker network inspect backend | grep -E "work-hub|qdrant"
# Test connectivity from inside container
docker exec work-hub curl -sf http://qdrant:6333/healthz
The nginx frontend proxies /api to http://work-hub:8030. Verify the backend container is healthy:
docker ps --filter name=work-hub --format "{{.Status}}"
curl http://localhost:8030/health
All phases complete. Production-ready with 3 healthy containers. Integrated with research-agent (auto-dispatch), agent-briefing (task read/write), haiven-notification-hub, and Microsoft 365 (email send + calendar).
- Qdrant: vector store (documents collection)
- LiteLLM: embeddings (qwen3-embedding-4b) and LLM proxy (glm-4-7-flash)