Haiven API Documentation

flowise

AI workflow automation and orchestration

ai workflows automation

Interactive Docs OpenAPI Spec Open Service

README User Guide

upload-service

Web file manager for uploading AI models and browsing storage

files upload storage models ai

Interactive Docs OpenAPI Spec Open Service

README User Guide

mcp-server

MCP protocol server with embedded Whisper STT, 23 tools, and OpenAI-compatible audio API

ai mcp tools docker monitoring stt whisper audio

Interactive Docs OpenAPI Spec Open Service

README User Guide

litellm-mcp

MCP protocol server wrapping LiteLLM proxy for local LLM calls (GLM-4.7-Flash, Seed-36B) without Claude API quota

ai mcp llm tools litellm inference glm seed

Interactive Docs OpenAPI Spec Open Service

README

crawl4ai

AI-optimized web scraping with JavaScript rendering, LLM-friendly markdown, and RAG pipeline integration

ai web-scraping crawling markdown rag playwright

Interactive Docs OpenAPI Spec Open Service

README User Guide

llama-swap

On-demand GGUF model gateway with OpenAI-compatible chat/completions APIs.

llm ai chat openai completions gguf

Interactive Docs OpenAPI Spec Open Service

README User Guide

vllm-glm-flash

vLLM always-on GLM-4.7-Flash AWQ service with 200K context for fast mechanical and code-adjacent tasks.

llm ai chat openai vllm glm

Interactive Docs OpenAPI Spec Open Service

README User Guide

vllm-qwen35-35b

vLLM always-on Bravo service serving qwen3.6-35b-a3b with 262K context for general-purpose completion.

llm ai chat openai vllm qwen

Interactive Docs OpenAPI Spec Open Service

README

vllm-heretic

vLLM always-on Huihui-Qwen3.6-27B-abliterated-BF16 service (runtime FP8, 200K context) for creative writing and low-refusal prose.

llm ai chat openai vllm huihui creative-writing

Interactive Docs OpenAPI Spec Open Service

README

vllm-qwen3-embedding

vLLM always-on Qwen3-Embedding-4B BF16 service with 8K context for semantic embeddings. Embeddings-only — use /v1/embeddings, not /v1/chat/completions.

llm ai embeddings openai vllm qwen semantic-search

Interactive Docs OpenAPI Spec Open Service

README

vllm-qwen35-27b

vLLM always-on Qwen3.5-27B FP8 service on Delta GPU with 128K context, tool calling, and structured output. Thinking disabled — optimized for JSON generation and classification.

llm ai chat openai vllm qwen structured-output tool-calling

Interactive Docs OpenAPI Spec Open Service

README

vllm-medgemma

vLLM always-on MedGemma 27B Text IT FP8 service with 32K context for medical text comprehension, clinical reasoning, and biomedical QA. Text-only (no vision). Shared Delta GPU.

llm ai chat openai vllm medical clinical biomedical

Interactive Docs OpenAPI Spec Open Service

README

vllm-minimax-m25

vLLM on-demand MiniMax-M2.5 AWQ Q4 229B MoE service with 32K context, tensor-parallel across Bravo+Charlie GPUs, and reasoning token support. On-demand — does not auto-start on reboot.

llm ai chat openai vllm minimax moe reasoning on-demand

Interactive Docs OpenAPI Spec Open Service

README

qdrant

Vector database for embeddings, semantic retrieval, and RAG storage.

vector-database ai embeddings search qdrant rag

Interactive Docs OpenAPI Spec Open Service

README User Guide

piper-api

Fast CPU-based text-to-speech using Piper neural voices

tts ai audio speech

Interactive Docs OpenAPI Spec Open Service

README User Guide

styletts2

Advanced voice cloning and text-to-speech with StyleTTS2

tts ai audio voice-cloning

Interactive Docs OpenAPI Spec Open Service

README User Guide

f5-tts

Flow-matching text-to-speech with zero-shot voice cloning

tts ai audio voice-cloning flow-matching

Interactive Docs OpenAPI Spec Open Service

README User Guide

audio-converter

FFMPEG-based audio format conversion and processing

audio conversion ffmpeg

Interactive Docs OpenAPI Spec Open Service

README User Guide

comfyui

Stable Diffusion image generation with ComfyUI workflows (native systemd service)

images ai diffusion generation

Interactive Docs OpenAPI Spec Open Service

README User Guide

chat-export

Claude Code conversation export to LibreChat format

chat export claude librechat conversations

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-intelligence

Semantic search API for AI conversations with vector similarity and hybrid search

search ai vector semantic qdrant conversations

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-knowledge

Semantic knowledge base for infrastructure docs and lessons learned (744 points, 28 topics, Qdrant + Qwen3-Embed)

knowledge ai vector semantic qdrant search rag

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-ingest-docling

Document format conversion for the Haiven ingestion pipeline — converts PDF, DOCX, PPTX, XLSX, HTML, and images to Markdown/JSON using Docling (IBM Research). Primary consumer: haiven-knowledge IngestionRouter.

document-conversion ai ingestion pdf docx ocr rag docling

Interactive Docs OpenAPI Spec Open Service

README User Guide

litellm

OpenAI-compatible API gateway with unified model access, virtual keys, and Langfuse observability

llm ai openai gateway proxy chat

Interactive Docs OpenAPI Spec Open Service

README User Guide

sandbox-manager

Web-based orchestration platform for Claude Code containers with terminal access, MCP configuration, and LLM routing

claude containers terminal mcp sandbox orchestration

Interactive Docs OpenAPI Spec Open Service

README User Guide

research-agent

Autonomous 9-state web research pipeline with LLM synthesis, SearXNG search, Crawl4AI extraction, work-hub task integration, haiven-knowledge auto-ingest, and Langfuse tracing. Supports task_id for artifact write-back.

research ai llm search crawling synthesis pipeline knowledge work-hub

Interactive Docs OpenAPI Spec Open Service

README User Guide

audiobook-recommender

Personal audiobook recommendation engine with semantic search, weighted scoring, and Libation import

recommendations ai audiobooks embeddings vector-search lancedb

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-transcribe

Tri-engine speech-to-text with NVIDIA Canary-1b-v2, Parakeet-TDT-0.6B-v2, Whisper Large v3 Turbo, and pyannote speaker diarization

stt ai audio transcription translation diarization wyoming

Interactive Docs OpenAPI Spec Open Service

README User Guide

meeting-scribe

Automated meeting transcription and note-generation pipeline (7-stage: transcribe → clean → infer metadata → notes → validate → extract → deliver) with v2 edit mode, partial re-runs, and 31 configurable settings

transcription ai meetings notes pipeline llm

Interactive Docs OpenAPI Spec Open Service

README User Guide

vllm-gemma4-26b

Gemma 4 26B-A4B FP8 MoE inference via vLLM. Vision + tool calling + thinking mode. 256K context with tiny KV cache (5.2 GB at full context). Primary structured output model on Delta GPU.

llm ai vllm openai-api gpu vision tool-calling moe structured-output

Interactive Docs OpenAPI Spec Open Service

README User Guide

vllm-gemma4-e4b

Gemma 4 E4B BF16 inference via vLLM. Only Gemma 4 model with audio input (30s clips, 16 kHz). Also supports image and video. 128K context.

llm ai vllm openai-api gpu audio vision multimodal

Interactive Docs OpenAPI Spec Open Service

README User Guide

vllm-medgemma-4b

MedGemma 4B BF16 abliterated medical vision inference via vLLM. SigLIP encoder for radiology, dermatology, histopathology, ophthalmology. 128K context.

llm ai vllm openai-api gpu medical vision radiology

Interactive Docs OpenAPI Spec Open Service

README User Guide

meeting-assistant

AI-powered meeting assistant with SSE streaming chat, reasoning blocks, real-time transcription via haiven-transcribe, speaker diarization, knowledge base search, and chat export

meetings ai transcription chat sse diarization knowledge stt

Interactive Docs OpenAPI Spec Open Service

README User Guide

work-hub

Integration-first workspace — IMAP email, meeting transcription, document import (PDF/DOCX/EML/HTML/CSV), task management, and AI-assisted drafting. 37 endpoints across tasks, meetings, taxonomy, import, audio transcription, email connector, backfill, and webhooks.

tasks ai meetings rag drafting import email webhooks productivity imap

Interactive Docs OpenAPI Spec Open Service

README User Guide

content-factory

Voice-to-content pipeline with 6 content types (Entry, TIL, Link, Quote, Build, Site Page), 2 voice registers (Authority, Conversational), and a spoken feedback loop. Records voice notes, transcribes via haiven-transcribe, drafts via Seed-36B, saves as markdown with YAML frontmatter.

content ai voice pipeline transcription markdown drafting

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-ragas

RAGAS evaluation service for the haiven-knowledge RAG pipeline. Measures retrieval quality (Context Precision) against a 25-question golden dataset using GLM-4.7-Flash as judge. Quality gate: >= 0.65.

evaluation ai rag ragas quality knowledge

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-reranker

Cross-encoder reranking service for the Haiven knowledge pipeline. Scores (query, passage) pairs using Qwen3-Reranker-4B-seq-cls via sentence-transformers CrossEncoder on the Delta GPU (port 8460). Internal service — no public domain. OFFLINE / on-demand — stopped 2026-03-30.

reranking ai rag search cross-encoder gpu embeddings

Interactive Docs OpenAPI Spec Open Service

README User Guide

notification-hub

Multi-channel notification dispatcher for Haiven agent services. Routes notifications to email (SMTP via Mailpit), ntfy.sh push, or Home Assistant TTS based on a YAML routing table keyed by source_agent.

notifications ai email ntfy ha-tts smtp agents

Interactive Docs OpenAPI Spec Open Service

README User Guide

agent-briefing

Scope-aware briefing agent — generates daily summaries, end-of-day reports, task artifact reviews, and knowledge context assembly. Pulls from work-hub and haiven-knowledge, generates via GLM-4.7-Flash, delivers through notification-hub. Scheduled Mon–Fri via systemd timers.

briefing ai agents tasks knowledge scheduled notifications

Interactive Docs OpenAPI Spec Open Service

README User Guide

haiven-orchestrator

Central AI orchestrator — deployed intent classification via gemma4-26b, session management (Redis), and agent dispatch for 17 intents across briefing, email, scheduling, research, tasks, and knowledge domains

orchestration ai intent-classification agents llm session dispatch

Interactive Docs OpenAPI Spec Open Service

README User Guide

haven-voice-gateway

Full-duplex voice pipeline gateway — sequences STT (haiven-transcribe), intent classification (haiven-orchestrator), and TTS (haven-tts-gateway) into a single voice interaction. Includes confirm flow for voice-driven action approval.

voice ai audio stt tts pipeline gateway

Interactive Docs OpenAPI Spec Open Service

README User Guide

Echo (LibreChat)

AI chat frontend with multi-provider support

chat ai frontend librechat conversations

Interactive Docs OpenAPI Spec Open Service

README

flowise

upload-service

mcp-server

litellm-mcp

crawl4ai

llama-swap

vllm-glm-flash

vllm-qwen35-35b

vllm-heretic

vllm-qwen3-embedding

vllm-qwen35-27b

vllm-medgemma

vllm-minimax-m25

qdrant

piper-api

styletts2

f5-tts

audio-converter

comfyui

chat-export

haiven-intelligence

haiven-knowledge

haiven-ingest-docling

litellm

sandbox-manager

research-agent

audiobook-recommender

haiven-transcribe

meeting-scribe

vllm-gemma4-26b

vllm-gemma4-e4b

vllm-medgemma-4b

meeting-assistant

work-hub

content-factory

haiven-ragas

haiven-reranker

notification-hub

agent-briefing

haiven-orchestrator

haven-voice-gateway

Echo (LibreChat)

portainer

prometheus

loki

alertmanager

grafana

uptime-kuma

searxng

cronicle

remediation-approval

traefik

sonarr

radarr

lidarr

readarr

prowlarr

whisparr

bazarr

jellyfin

video-downloader

qbittorrent

vikunja

memos