haiven-agent-briefing

Core AI agent service for the Haiven platform. Provides 9 scopes covering daily intelligence briefings, document drafting, email composition, content publishing, and opportunity detection. All scopes are accessible through a single POST /briefing endpoint and a standardized Agent Protocol endpoint.


Scopes at a Glance

| Scope | Purpose | Model | Scheduled |
| --- | --- | --- | --- |
| daily | Morning briefing: tasks + KB | GLM-4.7-Flash | Mon–Fri 07:30 |
| end_of_day | EOD summary: completions + carry-over | GLM-4.7-Flash | Mon–Fri 17:30 |
| review | Revise a task artifact with voice instructions | GLM-4.7-Flash | On-demand |
| context_assembly | 4-facet parallel KB search, writes context to task | GLM-4.7-Flash | On-demand |
| pre_meeting | Meeting prep with KB context, 15-min APScheduler poll | GLM-4.7-Flash | APScheduler |
| document | 7-step structured document draft via qwen3.5-27b | qwen3.5-27b | On-demand |
| email | GLM tone classification + qwen3.5-27b body composition | GLM + qwen3.5-27b | On-demand |
| opportunity | Daily KB scan for blog ideas, patterns, follow-ups | GLM-4.7-Flash | APScheduler |
| content_pipeline | Blog publish (qwen3.5-27b -> WordPress) + social snippets | qwen3.5-27b + GLM | On-demand |

API Endpoints

POST /briefing

Generate a scope-specific briefing. All scopes use this single endpoint.

Request body:

{
  "scope": "daily",
  "template": "briefing-daily",
  "notify": false,
  "task_id": null,
  "voice_instructions": null,
  "extract_tasks": false
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| scope | string | "daily" | One of the 9 scopes listed above |
| template | string | "briefing-daily" | Langfuse prompt name; falls back to hardcoded template if unavailable |
| notify | bool | false | Push result to notification-hub when true |
| task_id | string or null | null | Required for review, context_assembly, document, content_pipeline; optional for email |
| voice_instructions | string or null | null | Scope-dependent; see per-scope detail below |
| extract_tasks | bool | false | After generating, make a second LLM call to extract actionable tasks and POST them to work-hub (daily scope only) |

Response:

{
  "scope": "daily",
  "briefing": "Your morning briefing text...",
  "data_sources": {
    "open_tasks": 12,
    "completed_tasks": 3,
    "kb_results": 8,
    "template": "briefing-daily",
    "model": "glm-4-7-flash"
  },
  "notified": false,
  "created_task_ids": []
}

Error responses:

| Status | Condition |
| --- | --- |
| 400 | Unknown scope, or required fields missing for the chosen scope |
| 401 | Bearer token missing or invalid (when BRIEFING_API_KEY is set) |
| 404 | task_id not found in work-hub |
| 502 | Document or email generation failed in upstream LLM |
| 500 | Upstream service unreachable (LiteLLM, work-hub) |
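
A client can catch the most common 400 case before sending anything. The following is a minimal sketch that assumes only the field table above; the helper name `build_briefing_request` and the client-side checks are illustrative and do not reflect how the service itself validates requests.

```python
SCOPES = {
    "daily", "end_of_day", "review", "context_assembly", "pre_meeting",
    "document", "email", "opportunity", "content_pipeline",
}
# Scopes for which the field table marks task_id as required.
TASK_ID_REQUIRED = {"review", "context_assembly", "document", "content_pipeline"}


def build_briefing_request(scope="daily", template=None, notify=False,
                           task_id=None, voice_instructions=None,
                           extract_tasks=False):
    """Return a JSON-serializable body for POST /briefing (hypothetical helper)."""
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    if scope in TASK_ID_REQUIRED and not task_id:
        raise ValueError(f"scope {scope!r} requires task_id")
    body = {
        "scope": scope,
        "notify": notify,
        "task_id": task_id,
        "voice_instructions": voice_instructions,
        "extract_tasks": extract_tasks,
    }
    # Omit template unless the caller chose one, so the server default applies.
    if template is not None:
        body["template"] = template
    return body
```

The returned dict can be passed to `json.dumps` and sent with any HTTP client, adding an `Authorization: Bearer ...` header when BRIEFING_API_KEY is set.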

POST /v1/agent

Standardized Agent Protocol endpoint. Maps intents to briefing scopes.

Request:

{
  "user_message": "Give me my morning briefing",
  "intent": "briefing.daily",
  "entities": {"instruction": "focus on high-priority items"},
  "task_id": null
}

Intent map:

| Intent | Scope |
| --- | --- |
| briefing.daily | daily |
| briefing.weekly | end_of_day |
| scheduling.query | pre_meeting |
| research.topic | context_assembly |
| review_feedback | review |
| draft | document |
| opportunity.scan | opportunity |
| email.compose | email |
| content.publish | content_pipeline |
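
The intent map amounts to a dictionary lookup. The sketch below copies the table into a dict; the function name `scope_for_intent` is hypothetical, and raising on an unknown intent is an assumption, since the service's behavior in that case is not documented here.

```python
# Copied from the intent map table.
INTENT_TO_SCOPE = {
    "briefing.daily": "daily",
    "briefing.weekly": "end_of_day",
    "scheduling.query": "pre_meeting",
    "research.topic": "context_assembly",
    "review_feedback": "review",
    "draft": "document",
    "opportunity.scan": "opportunity",
    "email.compose": "email",
    "content.publish": "content_pipeline",
}


def scope_for_intent(intent: str) -> str:
    """Resolve an Agent Protocol intent to a briefing scope (hypothetical helper)."""
    try:
        return INTENT_TO_SCOPE[intent]
    except KeyError:
        # Assumption: unknown intents are rejected rather than given a default scope.
        raise ValueError(f"unsupported intent: {intent!r}") from None
```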

Response:

{
  "content": "## Morning Briefing...",
  "sources": [{"type": "briefing", "scope": "daily", "open_tasks": 12}],
  "actions_taken": ["Generated daily briefing"],
  "model_used": "glm-4-7-flash",
  "latency_ms": 2341
}

GET /health

Liveness probe. Returns 200 when the process is up.

{"status": "ok", "service": "haiven-agent-briefing"}

GET /metrics

Prometheus metrics endpoint. Auto-discovered via Docker labels (prometheus.scrape=true, prometheus.port=8000, prometheus.path=/metrics).


Scope Internals

daily

  1. GET /api/v1/tasks?status=open&limit=20 from work-hub
  2. GET /api/v1/tasks?status=done&limit=10 from work-hub
  3. POST /v1/search with {"query": "recent updates", "limit": 10} to haiven-knowledge
  4. Load template from Langfuse (briefing-daily); fall back to bundled hardcoded template
  5. Fill {{open_tasks}}, {{completed_tasks}}, {{recent_kb}} variables
  6. Call LiteLLM /v1/chat/completions with GLM-4.7-Flash
  7. If notify=true, POST to notification-hub
  8. If extract_tasks=true, make second LLM call for task JSON, POST each to work-hub with dedup check

voice_instructions is injected as a system message before the filled template.
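
Step 5 is plain `{{variable}}` substitution. The sketch below is a stand-in for illustration, not the Langfuse SDK or the code in app/templates.py; the function name `fill_template` and the choice to leave unknown placeholders untouched are assumptions.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template: str, variables: dict) -> str:
    """Replace each {{name}} with variables[name]; unknown names are left as-is."""
    def replace(match):
        name = match.group(1)
        # Keeping the original placeholder makes a missing variable easy to spot.
        return str(variables.get(name, match.group(0)))

    return _PLACEHOLDER.sub(replace, template)
```

For example, `fill_template("Open: {{open_tasks}}", {"open_tasks": "3 tasks"})` yields `"Open: 3 tasks"`.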

end_of_day

Same structure as daily, with three differences: it uses the briefing-eod template, queries completed tasks (limit 20) and open tasks (limit 10), and skips the KB search. Template variable: {{task_counts}}.

review

  1. GET /api/v1/tasks/{task_id} from work-hub
  2. POST /v1/search with voice_instructions as query, limit 5
  3. Construct prompt: current artifact + voice instructions + KB snippets
  4. Call LiteLLM
  5. PATCH /api/v1/tasks/{task_id} with context field containing revised artifact

Requires both task_id and voice_instructions.

context_assembly

  1. GET /api/v1/tasks/{task_id} from work-hub
  2. Run 4 parallel KB queries via asyncio.gather:
    - "meeting notes about {topic}"
    - "research on {topic}"
    - "reference for {topic}"
    - "AI conversations about {topic}"
  3. Deduplicate results by point_id
  4. Build structured assembled_context JSON
  5. PATCH /api/v1/tasks/{task_id} with context and status = "context_ready"
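
Steps 2 and 3 can be sketched as follows, assuming each KB hit is a dict carrying a `point_id`. The `search` callable stands in for the haiven-knowledge client, and `fake_search` exists only to make the example runnable; both names are hypothetical.

```python
import asyncio

# The four facet queries listed in step 2.
FACETS = (
    "meeting notes about {topic}",
    "research on {topic}",
    "reference for {topic}",
    "AI conversations about {topic}",
)


async def assemble(topic, search):
    """Run one KB query per facet concurrently, then merge hits unique by point_id."""
    queries = [facet.format(topic=topic) for facet in FACETS]
    # gather() returns results in the same order as the queries.
    result_lists = await asyncio.gather(*(search(query) for query in queries))
    seen = set()
    merged = []
    for results in result_lists:
        for hit in results:
            if hit["point_id"] not in seen:
                seen.add(hit["point_id"])
                merged.append(hit)
    return merged


async def fake_search(query):
    # Hypothetical stand-in: every query returns one shared hit plus one of its own.
    await asyncio.sleep(0)
    return [
        {"point_id": "shared", "text": "common note"},
        {"point_id": query, "text": f"hit for {query}"},
    ]
```

With `fake_search`, `asyncio.run(assemble("pricing", fake_search))` returns five hits: the shared one once, plus one per facet.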

pre_meeting

APScheduler polls every 15 minutes. On each tick:

  1. Search KB for "calendar events next 60 minutes" filtered to doc_type = "calendar-event"
  2. Skip already-notified event UIDs (in-memory TTL cache, 2-hour window; Redis for persistence)
  3. For each new event, run KB context search on "context for meeting: {title}"
  4. Build LLM prompt with event details + KB context per meeting
  5. Generate briefing via GLM-4.7-Flash (thinking disabled for speed)
  6. Push to notification-hub
  7. Mark event UIDs as notified

Enabled by BRIEFING_PRE_MEETING_ENABLED=true. Can also be triggered manually via POST /briefing with scope=pre_meeting.
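
The in-memory half of the idempotency check in step 2 could look like the sketch below. The class name `NotifiedEvents` and its methods are hypothetical, the Redis persistence layer is omitted, and the injectable `clock` is there only so the expiry can be tested without waiting.

```python
import time


class NotifiedEvents:
    """Remember notified event UIDs for a fixed window (2 hours by default)."""

    def __init__(self, ttl_seconds=2 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires = {}  # event UID -> time at which the entry expires

    def _purge(self):
        # Drop entries whose window has elapsed.
        now = self._clock()
        for uid in [u for u, expiry in self._expires.items() if expiry <= now]:
            del self._expires[uid]

    def seen(self, uid):
        """Return True if uid was marked within the TTL window."""
        self._purge()
        return uid in self._expires

    def mark(self, uid):
        """Record uid as notified, starting a fresh TTL window."""
        self._expires[uid] = self._clock() + self._ttl
```

On each scheduler tick, events for which `seen(uid)` is true are skipped, and `mark(uid)` is called after the notification is pushed.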

document

7-step structured draft pipeline in draft_agent.py:

  1. Fetch task from work-hub
  2. Determine template (from template field or "default")
  3. Assemble KB context via assemble_context() (same engine as context_assembly)
  4. Load seed template from /mnt/storage/templates/
  5. Generate draft via qwen3.5-27b (thinking disabled)
  6. Save draft to work-hub task artifact
  7. Optionally notify

voice_instructions is parsed as pipe-separated key:value pairs:
- subject:My Document Topic
- client:Acme Corp

email

Two-model pipeline in email_composer.py:

  1. GLM-4.7-Flash classifies the email tone from subject + recipient context
  2. qwen3.5-27b composes the body using tone classification + KB context
  3. Draft saved to work-hub task artifact (if task_id provided)

voice_instructions is parsed as pipe-separated key:value pairs:
- to:recipient@example.com (required)
- subject:Email subject line (required)
- style:formal (optional, overrides GLM tone classification)
- client:Acme Corp (optional, for KB filtering)

opportunity

Daily KB scan in opportunity_agent.py, scheduled via APScheduler (schedulers/opportunity.py). Also callable on-demand.

Five detection types, each with 3 KB query strings:
- blog_idea — insights worth publishing
- productization_pattern — reusable workflows that could become products
- client_follow_up — open items from client conversations
- cross_pollination — ideas applicable across domains
- competitive_intel — market trends and competitor signals

Pipeline:
1. Load config from config/opportunity.yml (or safe defaults)
2. Run all detection-type queries in parallel
3. Deduplicate by point_id, score by relevance
4. Cache results to Redis for fast retrieval in subsequent /briefing calls
5. Format ranked opportunity list with urgency scores
6. Notify via notification-hub (if notify=true and opportunities found)

content_pipeline

Multi-branch publishing pipeline in content_pipeline.py:

Blog branch (qwen3.5-27b, thinking disabled):
1. Fetch source task from work-hub
2. Assemble KB context via assemble_context()
3. Generate 600+ word blog post in Markdown
4. Convert Markdown to HTML
5. Publish to WordPress as draft via REST API (Basic Auth)
6. Fallback to /mnt/storage/drafts/ if WP credentials absent or unreachable
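
Step 5 can be sketched with the standard library alone. The example assumes the standard WordPress REST route `/wp-json/wp/v2/posts` and an application password sent via Basic Auth; the function name `build_wp_draft_request` is hypothetical, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_wp_draft_request(site_url, user, token, title, html):
    """Build (without sending) a POST that creates a WordPress post in draft status."""
    credentials = base64.b64encode(f"{user}:{token}".encode("utf-8")).decode("ascii")
    body = json.dumps({"title": title, "content": html, "status": "draft"})
    return urllib.request.Request(
        url=site_url.rstrip("/") + "/wp-json/wp/v2/posts",
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Basic " + credentials,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; a caller implementing step 6 would catch `urllib.error.URLError` there and write the Markdown to /mnt/storage/drafts/ instead.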

Social branch (GLM-4.7-Flash, thinking disabled):
- LinkedIn: 200–300 word professional post, max 3 hashtags
- X/Twitter: max 280-character tweet, hard-truncated as safety net
- Written to /mnt/storage/social/

Podcast branch (chatterbox-tts):
- Strip Markdown from blog body to spoken text
- TTS via chatterbox-tts (rosie-perez voice)
- ffmpeg post-processing via process_episode.sh
- RSS feed regeneration via generate_feed.py
- Episode sidecar JSON written alongside MP3

Post folder assembly: All artifacts (blog.md, social-linkedin.txt, social-x.txt, podcast.mp3 symlink, featured-image.png 1200x630, social-image.png 1080x1080, manifest.json) written to /mnt/storage/content-output/{date}-{slug}/.

voice_instructions is parsed as pipe-separated key:value pairs:
- targets:blog,social_linkedin,social_x,podcast (comma-separated list)
- tone:professional (writing tone for LLM prompts)
- audience:general (target audience description)
- client:Acme Corp (optional, for KB filtering)


Configuration

All environment variables use the BRIEFING_ prefix (Pydantic env_prefix).

| Variable | Default | Description |
| --- | --- | --- |
| BRIEFING_LITELLM_URL | http://litellm:4000 | LiteLLM gateway URL |
| BRIEFING_LITELLM_API_KEY | "" | LiteLLM API key |
| BRIEFING_API_KEY | None | Optional Bearer token for this service's own API |
| BRIEFING_BRIEFING_MODEL | glm-4-7-flash | Default model for briefing LLM calls |
| BRIEFING_SEED_MODEL | seed-oss-36b | Model used for document drafts, blog body, email body |
| BRIEFING_GLM_MODEL | glm-4-7-flash | Model used for tone classification and social snippets |
| BRIEFING_WORKHUB_URL | http://work-hub:8030 | work-hub backend URL |
| BRIEFING_KNOWLEDGE_URL | http://haiven-knowledge:8022 | haiven-knowledge URL |
| BRIEFING_NOTIFICATION_HUB_URL | http://notification-hub:8000 | notification-hub URL |
| BRIEFING_REDIS_URL | redis://redis:6379 | Redis for pre-meeting idempotency + opportunity cache |
| BRIEFING_PRE_MEETING_ENABLED | true | Enable APScheduler for pre-meeting polling |
| BRIEFING_LANGFUSE_PUBLIC_KEY | "" | Langfuse public key (prompt management + tracing) |
| BRIEFING_LANGFUSE_SECRET_KEY | "" | Langfuse secret key |
| BRIEFING_LANGFUSE_HOST | http://langfuse-web:3000 | Langfuse host URL |
| BRIEFING_WP_SITE_URL | https://www.elijah.ai | WordPress site URL for blog publishing |
| BRIEFING_WP_API_USER | "" | WordPress API username (Basic Auth) |
| BRIEFING_WP_API_TOKEN | "" | WordPress application password |
| BRIEFING_TTS_URL | http://chatterbox-tts:8004 | chatterbox-tts URL for podcast branch |
| BRIEFING_PODCAST_INTRO_PATH | /mnt/storage/podcast/assets/intro.mp3 | Podcast intro asset (auto-generated via TTS if missing) |
| BRIEFING_PODCAST_OUTRO_PATH | /mnt/storage/podcast/assets/outro.mp3 | Podcast outro asset |
| BRIEFING_PODCAST_TTS_VOICE | rosie-perez | TTS voice for podcast and podcast asset generation |
| BRIEFING_CONTENT_OUTPUT_DIR | /mnt/storage/content-output | Root directory for post folder artifacts |
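
The prefix convention explains names such as BRIEFING_BRIEFING_MODEL: the prefix is prepended to the upper-cased field name. The sketch below imitates that behavior with the standard library only. It is a stand-in for illustration, not Pydantic and not the code in app/config.py, and the field names are inferred from the variable names in the table.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    # A subset of the table above; field names are inferred, not taken from config.py.
    litellm_url: str = "http://litellm:4000"
    briefing_model: str = "glm-4-7-flash"
    redis_url: str = "redis://redis:6379"
    pre_meeting_enabled: bool = True


def load_settings(environ=None, prefix="BRIEFING_"):
    """Build Settings from environment variables named prefix + FIELD_NAME."""
    environ = os.environ if environ is None else environ
    overrides = {}
    for field in fields(Settings):
        raw = environ.get(prefix + field.name.upper())
        if raw is None:
            continue  # keep the default from the dataclass
        if isinstance(field.default, bool):
            overrides[field.name] = raw.strip().lower() in {"1", "true", "yes", "on"}
        else:
            overrides[field.name] = raw
    return Settings(**overrides)
```

Passing a plain dict as `environ` makes the loader easy to exercise in tests without touching the real process environment.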

GLM Response Parsing

GLM-4.7-Flash has thinking mode on by default. In that mode the final answer arrives in reasoning_content and content is None. app/llm.py handles this automatically:

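# Sketch of the check; the function name and the non-GLM branch are illustrative.
def extract_answer(model: str, msg: dict) -> str:
    if "glm" in model.lower():
        content = msg.get("content")
        if content is not None:
            return content        # thinking OFF: answer in content
        return msg.get("reasoning_content") or ""  # thinking ON: answer in reasoning_content
    return msg.get("content") or ""  # assumption: other models answer in content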

For task extraction, pre-meeting, email tone classification, and social snippet generation, thinking is explicitly disabled via extra_body={"enable_thinking": False} for speed and JSON parsing reliability.

For qwen3.5-27b (document drafts, blog body), thinking is disabled via chat_template_kwargs={"thinking_budget": 0}.


Langfuse Templates

The service fetches prompt templates from Langfuse at runtime. If credentials are absent or the request fails, it silently falls back to hardcoded defaults in app/templates.py.

| Template Name | Scope | Variables |
| --- | --- | --- |
| briefing-daily | daily | {{open_tasks}}, {{completed_tasks}}, {{recent_kb}} |
| briefing-eod | end_of_day | {{task_counts}} |

Tracing spans for all LLM calls are emitted via app/tracing.py (fire-and-forget). There are 8 span types, including generate_body, wp_publish, social_generate, podcast_tts, podcast_ffmpeg, podcast_feed_regen, and context_assembly.


Scheduled Triggers

systemd timers (daily + EOD briefings)

| Timer | Schedule | Scope | Template | notify |
| --- | --- | --- | --- | --- |
| haiven-briefing-morning.timer | Mon–Fri 07:30 | daily | briefing-daily | true |
| haiven-briefing-eod.timer | Mon–Fri 17:30 | end_of_day | briefing-eod | true |

Both timers include a 2-minute randomized delay (RandomizedDelaySec=120).

APScheduler (in-process)

| Scheduler | Interval | Scope | Notes |
| --- | --- | --- | --- |
| pre_meeting | Every 15 min | pre_meeting | Polls KB for upcoming calendar events |
| opportunity | Daily | opportunity | Scans KB, caches results to Redis |

Dependencies

| Service | Purpose | URL |
| --- | --- | --- |
| work-hub (8030) | Fetch/create/patch tasks and artifacts | http://work-hub:8030 |
| haiven-knowledge (8022) | KB search for all scopes | http://haiven-knowledge:8022 |
| LiteLLM (4000) | All LLM calls (GLM + Seed-36B) | http://litellm:4000 |
| notification-hub (8000) | Push notifications when notify: true | http://notification-hub:8000 |
| Redis (6379) | Pre-meeting idempotency + opportunity cache | redis://redis:6379 |
| Langfuse | Prompt template management + trace spans | http://langfuse-web:3000 |
| chatterbox-tts (8004) | Podcast TTS audio generation | http://chatterbox-tts:8004 |
| WordPress (external) | Blog draft publishing | https://www.elijah.ai |

Deployment

# Build and start
cd /mnt/apps/docker/ai/agent-briefing
docker compose up -d --build

# View logs
docker logs -f agent-briefing

# Restart
docker compose restart agent-briefing

Manual trigger examples

# Daily briefing (no notification)
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "daily", "template": "briefing-daily", "notify": false}'

# EOD briefing with notification
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "end_of_day", "notify": true}'

# Review a task artifact
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "review", "task_id": "<uuid>", "voice_instructions": "Make it shorter and more actionable"}'

# Assemble context for a task
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "context_assembly", "task_id": "<uuid>"}'

# Generate a document draft
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "document", "task_id": "<uuid>", "voice_instructions": "subject:Q1 Strategy|client:Acme"}'

# Compose an email
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "email", "voice_instructions": "to:client@example.com|subject:Project Update|style:formal"}'

# Run opportunity scan
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "opportunity", "notify": true}'

# Publish blog + social from a task
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "content_pipeline", "task_id": "<uuid>", "voice_instructions": "targets:blog,social_linkedin|tone:professional|audience:technical"}'

# Daily briefing with task extraction
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "daily", "extract_tasks": true, "notify": false}'

Timer management

# Check timer status
systemctl status haiven-briefing-morning.timer
systemctl status haiven-briefing-eod.timer
systemctl list-timers 'haiven-briefing*'

# View timer logs
journalctl -u haiven-briefing-morning.service -n 50
journalctl -u haiven-briefing-eod.service -n 50

# Trigger manually
sudo systemctl start haiven-briefing-morning.service

Resource Limits

| Limit | Value |
| --- | --- |
| CPU (max) | 1 core |
| CPU (reserved) | 0.25 core |
| Memory (max) | 512 MB |
| Memory (reserved) | 128 MB |

Observability


Source Layout

src/haiven-agent-briefing/
├── app/
│   ├── main.py                  # FastAPI app, all 9 scope handlers, Agent Protocol endpoint
│   ├── config.py                # Pydantic Settings (BRIEFING_* env vars)
│   ├── llm.py                   # LiteLLM wrapper with GLM reasoning_content fix
│   ├── templates.py             # Langfuse template loader with hardcoded fallbacks
│   ├── notifier.py              # notification-hub client
│   ├── tracing.py               # Langfuse span helpers (fire-and-forget)
│   ├── draft_agent.py           # 7-step document draft pipeline (Seed-36B)
│   ├── email_composer.py        # GLM tone classification + Seed-36B body composition
│   ├── content_pipeline.py      # Blog + social + podcast publishing pipeline
│   ├── opportunity_agent.py     # KB scan + scoring for opportunity detection
│   ├── context_assembly.py      # Shared KB context assembly used by multiple scopes
│   ├── data_sources/
│   │   ├── workhub.py           # work-hub task API client
│   │   └── knowledge.py         # haiven-knowledge search client
│   └── schedulers/
│       ├── pre_meeting.py       # APScheduler 15-min polling for calendar events
│       └── opportunity.py       # APScheduler daily opportunity scan
├── config/
│   └── opportunity.yml          # Opportunity detection config (types, thresholds)
├── Dockerfile
└── requirements.txt

Storage paths

| Path | Purpose |
| --- | --- |
| /mnt/storage/templates/ | Seed document templates (4 templates) |
| /mnt/storage/drafts/ | Local Markdown drafts (WP fallback) |
| /mnt/storage/social/ | Social snippet text files |
| /mnt/storage/content-output/ | Unified post folders (blog + social + images + podcast) |
| /mnt/storage/podcast/episodes/ | Podcast MP3 episodes |
| /mnt/storage/podcast/assets/ | Intro/outro MP3 assets (auto-generated via TTS) |
| /mnt/storage/podcast/feed.xml | RSS feed (regenerated after each episode) |