haiven-agent-briefing

Core AI agent service for the Haiven platform. Provides 9 scopes covering daily intelligence briefings, document drafting, email composition, content publishing, and opportunity detection. All scopes are accessible through a single POST /briefing endpoint and a standardized Agent Protocol endpoint.

Port: 8035 (host) -> 8000 (container)
Domain: briefing.haiven.site
Health: GET /health
Metrics: GET /metrics (Prometheus auto-discovered)
Compose: /mnt/apps/docker/ai/agent-briefing/docker-compose.yml
Source: /mnt/apps/src/haiven-agent-briefing/

Scopes at a Glance

Scope	Purpose	Model	Scheduled
`daily`	Morning briefing: tasks + KB	GLM-4.7-Flash	Mon–Fri 07:30
`end_of_day`	EOD summary: completions + carry-over	GLM-4.7-Flash	Mon–Fri 17:30
`review`	Revise a task artifact with voice instructions	GLM-4.7-Flash	On-demand
`context_assembly`	4-facet parallel KB search, writes context to task	GLM-4.7-Flash	On-demand
`pre_meeting`	Meeting prep with KB context, 15-min APScheduler poll	GLM-4.7-Flash	APScheduler
`document`	7-step structured document draft via qwen3.5-27b	qwen3.5-27b	On-demand
`email`	GLM tone classification + qwen3.5-27b body composition	GLM + qwen3.5-27b	On-demand
`opportunity`	Daily KB scan for blog ideas, patterns, follow-ups	GLM-4.7-Flash	APScheduler
`content_pipeline`	Blog publish (qwen3.5-27b -> WordPress) + social snippets	qwen3.5-27b + GLM	On-demand

API Endpoints

POST /briefing

Generate a scope-specific briefing. All scopes use this single endpoint.

Request body:

{
  "scope": "daily",
  "template": "briefing-daily",
  "notify": false,
  "task_id": null,
  "voice_instructions": null,
  "extract_tasks": false
}

Field	Type	Default	Description
`scope`	string	`"daily"`	One of the 9 scopes listed above
`template`	string	`"briefing-daily"`	Langfuse prompt name; falls back to hardcoded template if unavailable
`notify`	bool	`false`	Push result to notification-hub when `true`
`task_id`	string\|null	`null`	Required for `review`, `context_assembly`, `document`, `content_pipeline`; optional for `email`
`voice_instructions`	string\|null	`null`	Scope-dependent — see per-scope detail below
`extract_tasks`	bool	`false`	After generating, make a second LLM call to extract actionable tasks and POST them to work-hub (daily scope only)

Response:

{
  "scope": "daily",
  "briefing": "Your morning briefing text...",
  "data_sources": {
    "open_tasks": 12,
    "completed_tasks": 3,
    "kb_results": 8,
    "template": "briefing-daily",
    "model": "glm-4-7-flash"
  },
  "notified": false,
  "created_task_ids": []
}

Error responses:

Status	Condition
400	Unknown scope, or required fields missing for the chosen scope
401	Bearer token missing or invalid (when `BRIEFING_API_KEY` is set)
404	`task_id` not found in work-hub
502	Document or email generation failed in upstream LLM
500	Upstream service unreachable (LiteLLM, work-hub)

POST /v1/agent

Standardized Agent Protocol endpoint. Maps intents to briefing scopes.

Request:

{
  "user_message": "Give me my morning briefing",
  "intent": "briefing.daily",
  "entities": {"instruction": "focus on high-priority items"},
  "task_id": null
}

Intent map:

Intent	Scope
`briefing.daily`	`daily`
`briefing.weekly`	`end_of_day`
`scheduling.query`	`pre_meeting`
`research.topic`	`context_assembly`
`review_feedback`	`review`
`draft`	`document`
`opportunity.scan`	`opportunity`
`email.compose`	`email`
`content.publish`	`content_pipeline`

Response:

{
  "content": "## Morning Briefing...",
  "sources": [{"type": "briefing", "scope": "daily", "open_tasks": 12}],
  "actions_taken": ["Generated daily briefing"],
  "model_used": "glm-4-7-flash",
  "latency_ms": 2341
}

GET /health

Liveness probe. Returns 200 when the process is up.

{"status": "ok", "service": "haiven-agent-briefing"}

GET /metrics

Prometheus metrics endpoint. Auto-discovered via Docker labels (prometheus.scrape=true, prometheus.port=8000, prometheus.path=/metrics).

Scope Internals

daily

GET /api/v1/tasks?status=open&limit=20 from work-hub
GET /api/v1/tasks?status=done&limit=10 from work-hub
POST /v1/search with {"query": "recent updates", "limit": 10} from haiven-knowledge
Load template from Langfuse (briefing-daily); fall back to bundled hardcoded template
Fill {{open_tasks}}, {{completed_tasks}}, {{recent_kb}} variables
Call LiteLLM /v1/chat/completions with GLM-4.7-Flash
If notify=true, POST to notification-hub
If extract_tasks=true, make second LLM call for task JSON, POST each to work-hub with dedup check

voice_instructions is injected as a system message before the filled template.

end_of_day

Same structure as daily but uses template briefing-eod, queries completed tasks (limit 20) + open tasks (limit 10), skips KB search. Template variable: {{task_counts}}.

review

GET /api/v1/tasks/{task_id} from work-hub
POST /v1/search with voice_instructions as query, limit 5
Construct prompt: current artifact + voice instructions + KB snippets
Call LiteLLM
PATCH /api/v1/tasks/{task_id} with context field containing revised artifact

Requires both task_id and voice_instructions.

context_assembly

GET /api/v1/tasks/{task_id} from work-hub
Run 4 parallel KB queries via asyncio.gather:
- "meeting notes about {topic}"
- "research on {topic}"
- "reference for {topic}"
- "AI conversations about {topic}"
Deduplicate results by point_id
Build structured assembled_context JSON
PATCH /api/v1/tasks/{task_id} with context and status = "context_ready"

pre_meeting

APScheduler polls every 15 minutes. On each tick:

Search KB for "calendar events next 60 minutes" filtered to doc_type = "calendar-event"
Skip already-notified event UIDs (in-memory TTL cache, 2-hour window; Redis for persistence)
For each new event, run KB context search on "context for meeting: {title}"
Build LLM prompt with event details + KB context per meeting
Generate briefing via GLM-4.7-Flash (thinking disabled for speed)
Push to notification-hub
Mark event UIDs as notified

Enabled by BRIEFING_PRE_MEETING_ENABLED=true. Can also be triggered manually via POST /briefing with scope=pre_meeting.

document

7-step structured draft pipeline in draft_agent.py:

Fetch task from work-hub
Determine template (from template field or "default")
Assemble KB context via assemble_context() (same engine as context_assembly)
Load seed template from /mnt/storage/templates/
Generate draft via qwen3.5-27b (thinking disabled)
Save draft to work-hub task artifact
Optionally notify

Parse voice_instructions with pipe-separated keys:
- subject:My Document Topic
- client:Acme Corp

email

Two-model pipeline in email_composer.py:

GLM-4.7-Flash classifies the email tone from subject + recipient context
qwen3.5-27b composes the body using tone classification + KB context
Draft saved to work-hub task artifact (if task_id provided)

Parse voice_instructions with pipe-separated keys:
- to:recipient@example.com (required)
- subject:Email subject line (required)
- style:formal (optional, overrides GLM tone classification)
- client:Acme Corp (optional, for KB filtering)

opportunity

Daily KB scan in opportunity_agent.py, scheduled via APScheduler (schedulers/opportunity.py). Also callable on-demand.

Five detection types, each with 3 KB query strings:
- blog_idea — insights worth publishing
- productization_pattern — reusable workflows that could become products
- client_follow_up — open items from client conversations
- cross_pollination — ideas applicable across domains
- competitive_intel — market trends and competitor signals

Pipeline:
1. Load config from config/opportunity.yml (or safe defaults)
2. Run all detection-type queries in parallel
3. Deduplicate by point_id, score by relevance
4. Cache results to Redis for fast retrieval in subsequent /briefing calls
5. Format ranked opportunity list with urgency scores
6. Notify via notification-hub (if notify=true and opportunities found)

content_pipeline

Multi-branch publishing pipeline in content_pipeline.py:

Blog branch (qwen3.5-27b, thinking disabled):
1. Fetch source task from work-hub
2. Assemble KB context via assemble_context()
3. Generate 600+ word blog post in Markdown
4. Convert Markdown to HTML
5. Publish to WordPress as draft via REST API (Basic Auth)
6. Fallback to /mnt/storage/drafts/ if WP credentials absent or unreachable

Social branch (GLM-4.7-Flash, thinking disabled):
- LinkedIn: 200–300 word professional post, max 3 hashtags
- X/Twitter: max 280-character tweet, hard-truncated as safety net
- Written to /mnt/storage/social/

Podcast branch (chatterbox-tts):
- Strip Markdown from blog body to spoken text
- TTS via chatterbox-tts (rosie-perez voice)
- ffmpeg post-processing via process_episode.sh
- RSS feed regeneration via generate_feed.py
- Episode sidecar JSON written alongside MP3

Post folder assembly: All artifacts (blog.md, social-linkedin.txt, social-x.txt, podcast.mp3 symlink, featured-image.png 1200x630, social-image.png 1080x1080, manifest.json) written to /mnt/storage/content-output/{date}-{slug}/.

Parse voice_instructions with pipe-separated keys:
- targets:blog,social_linkedin,social_x,podcast (comma-separated list)
- tone:professional (writing tone for LLM prompts)
- audience:general (target audience description)
- client:Acme Corp (optional, for KB filtering)

Configuration

All environment variables use the BRIEFING_ prefix (Pydantic env_prefix).

Variable	Default	Description
`BRIEFING_LITELLM_URL`	`http://litellm:4000`	LiteLLM gateway URL
`BRIEFING_LITELLM_API_KEY`	`""`	LiteLLM API key
`BRIEFING_API_KEY`	`None`	Optional Bearer token for this service's own API
`BRIEFING_BRIEFING_MODEL`	`glm-4-7-flash`	Default model for briefing LLM calls
`BRIEFING_SEED_MODEL`	`seed-oss-36b`	Model used for document drafts, blog body, email body
`BRIEFING_GLM_MODEL`	`glm-4-7-flash`	Model used for tone classification and social snippets
`BRIEFING_WORKHUB_URL`	`http://work-hub:8030`	work-hub backend URL
`BRIEFING_KNOWLEDGE_URL`	`http://haiven-knowledge:8022`	haiven-knowledge URL
`BRIEFING_NOTIFICATION_HUB_URL`	`http://notification-hub:8000`	notification-hub URL
`BRIEFING_REDIS_URL`	`redis://redis:6379`	Redis for pre-meeting idempotency + opportunity cache
`BRIEFING_PRE_MEETING_ENABLED`	`true`	Enable APScheduler for pre-meeting polling
`BRIEFING_LANGFUSE_PUBLIC_KEY`	`""`	Langfuse public key (prompt management + tracing)
`BRIEFING_LANGFUSE_SECRET_KEY`	`""`	Langfuse secret key
`BRIEFING_LANGFUSE_HOST`	`http://langfuse-web:3000`	Langfuse host URL
`BRIEFING_WP_SITE_URL`	`https://www.elijah.ai`	WordPress site URL for blog publishing
`BRIEFING_WP_API_USER`	`""`	WordPress API username (Basic Auth)
`BRIEFING_WP_API_TOKEN`	`""`	WordPress application password
`BRIEFING_TTS_URL`	`http://chatterbox-tts:8004`	chatterbox-tts URL for podcast branch
`BRIEFING_PODCAST_INTRO_PATH`	`/mnt/storage/podcast/assets/intro.mp3`	Podcast intro asset (auto-generated via TTS if missing)
`BRIEFING_PODCAST_OUTRO_PATH`	`/mnt/storage/podcast/assets/outro.mp3`	Podcast outro asset
`BRIEFING_PODCAST_TTS_VOICE`	`rosie-perez`	TTS voice for podcast and podcast asset generation
`BRIEFING_CONTENT_OUTPUT_DIR`	`/mnt/storage/content-output`	Root directory for post folder artifacts

GLM Response Parsing

GLM-4.7-Flash has thinking mode ON by default. The final answer lands in reasoning_content and content is None. Handled automatically in app/llm.py:

if "glm" in model.lower():
    content = msg.get("content")
    if content is not None:
        return content        # thinking OFF — answer in content
    return msg.get("reasoning_content") or ""  # thinking ON — answer in reasoning_content

For task extraction, pre-meeting, email tone classification, and social snippet generation, thinking is explicitly disabled via extra_body={"enable_thinking": False} for speed and JSON parsing reliability.

For qwen3.5-27b (document drafts, blog body), thinking is disabled via chat_template_kwargs={"thinking_budget": 0}.

Langfuse Templates

The service fetches prompt templates from Langfuse at runtime. If credentials are absent or the request fails, it silently falls back to hardcoded defaults in app/templates.py.

Template Name	Scope	Variables
`briefing-daily`	`daily`	`{{open_tasks}}`, `{{completed_tasks}}`, `{{recent_kb}}`
`briefing-eod`	`end_of_day`	`{{task_counts}}`

Tracing spans for all LLM calls are emitted via app/tracing.py (fire-and-forget). 8 span types including generate_body, wp_publish, social_generate, podcast_tts, podcast_ffmpeg, podcast_feed_regen, context_assembly.

Scheduled Triggers

systemd timers (daily + EOD briefings)

Timer	Schedule	Scope	Template	notify
`haiven-briefing-morning.timer`	Mon–Fri 07:30	`daily`	`briefing-daily`	`true`
`haiven-briefing-eod.timer`	Mon–Fri 17:30	`end_of_day`	`briefing-eod`	`true`

Both timers include a 2-minute randomized delay (RandomizedDelaySec=120).

APScheduler (in-process)

Scheduler	Interval	Scope	Notes
`pre_meeting`	Every 15 min	`pre_meeting`	Polls KB for upcoming calendar events
`opportunity`	Daily	`opportunity`	Scans KB, caches results to Redis

Dependencies

Service	Purpose	URL
work-hub (8030)	Fetch/create/patch tasks and artifacts	`http://work-hub:8030`
haiven-knowledge (8022)	KB search for all scopes	`http://haiven-knowledge:8022`
LiteLLM (4000)	All LLM calls (GLM + Seed-36B)	`http://litellm:4000`
notification-hub (8000)	Push notifications when `notify: true`	`http://notification-hub:8000`
Redis (6379)	Pre-meeting idempotency + opportunity cache	`redis://redis:6379`
Langfuse	Prompt template management + trace spans	`http://langfuse-web:3000`
chatterbox-tts (8004)	Podcast TTS audio generation	`http://chatterbox-tts:8004`
WordPress (external)	Blog draft publishing	`https://www.elijah.ai`

Deployment

# Build and start
cd /mnt/apps/docker/ai/agent-briefing
docker compose up -d --build

# View logs
docker logs -f agent-briefing

# Restart
docker compose restart agent-briefing

Manual trigger examples

# Daily briefing (no notification)
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "daily", "template": "briefing-daily", "notify": false}'

# EOD briefing with notification
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "end_of_day", "notify": true}'

# Review a task artifact
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "review", "task_id": "<uuid>", "voice_instructions": "Make it shorter and more actionable"}'

# Assemble context for a task
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "context_assembly", "task_id": "<uuid>"}'

# Generate a document draft
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "document", "task_id": "<uuid>", "voice_instructions": "subject:Q1 Strategy|client:Acme"}'

# Compose an email
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "email", "voice_instructions": "to:client@example.com|subject:Project Update|style:formal"}'

# Run opportunity scan
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "opportunity", "notify": true}'

# Publish blog + social from a task
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "content_pipeline", "task_id": "<uuid>", "voice_instructions": "targets:blog,social_linkedin|tone:professional|audience:technical"}'

# Daily briefing with task extraction
curl -sf -X POST http://localhost:8035/briefing \
  -H "Content-Type: application/json" \
  -d '{"scope": "daily", "extract_tasks": true, "notify": false}'

Timer management

# Check timer status
systemctl status haiven-briefing-morning.timer
systemctl status haiven-briefing-eod.timer
systemctl list-timers haiven-briefing*

# View timer logs
journalctl -u haiven-briefing-morning.service -n 50
journalctl -u haiven-briefing-eod.service -n 50

# Trigger manually
sudo systemctl start haiven-briefing-morning.service

Resource Limits

Limit	Value
CPU (max)	1 core
CPU (reserved)	0.25 core
Memory (max)	512 MB
Memory (reserved)	128 MB

Observability

Logs: JSON structured via structlog. docker logs agent-briefing or Loki via Grafana.
Health: GET http://localhost:8035/health — 200 = up
Prometheus: Labels on compose file enable auto-discovery. Metrics at GET http://localhost:8035/metrics
Langfuse: All LLM calls traced via LiteLLM's Langfuse integration. Span types: generate_body, wp_publish, social_generate, podcast_tts, podcast_ffmpeg, podcast_feed_regen, context_assembly. View at ai-ops.haiven.site.

Source Layout

src/haiven-agent-briefing/
├── app/
│   ├── main.py                  # FastAPI app, all 9 scope handlers, Agent Protocol endpoint
│   ├── config.py                # Pydantic Settings (BRIEFING_* env vars)
│   ├── llm.py                   # LiteLLM wrapper with GLM reasoning_content fix
│   ├── templates.py             # Langfuse template loader with hardcoded fallbacks
│   ├── notifier.py              # notification-hub client
│   ├── tracing.py               # Langfuse span helpers (fire-and-forget)
│   ├── draft_agent.py           # 7-step document draft pipeline (Seed-36B)
│   ├── email_composer.py        # GLM tone classification + Seed-36B body composition
│   ├── content_pipeline.py      # Blog + social + podcast publishing pipeline
│   ├── opportunity_agent.py     # KB scan + scoring for opportunity detection
│   ├── context_assembly.py      # Shared KB context assembly used by multiple scopes
│   ├── data_sources/
│   │   ├── workhub.py           # work-hub task API client
│   │   └── knowledge.py         # haiven-knowledge search client
│   └── schedulers/
│       ├── pre_meeting.py       # APScheduler 15-min polling for calendar events
│       └── opportunity.py       # APScheduler daily opportunity scan
├── config/
│   └── opportunity.yml          # Opportunity detection config (types, thresholds)
├── Dockerfile
└── requirements.txt

Storage paths

Path	Purpose
`/mnt/storage/templates/`	Seed document templates (4 templates)
`/mnt/storage/drafts/`	Local Markdown drafts (WP fallback)
`/mnt/storage/social/`	Social snippet text files
`/mnt/storage/content-output/`	Unified post folders (blog + social + images + podcast)
`/mnt/storage/podcast/episodes/`	Podcast MP3 episodes
`/mnt/storage/podcast/assets/`	Intro/outro MP3 assets (auto-generated via TTS)
`/mnt/storage/podcast/feed.xml`	RSS feed (regenerated after each episode)