Audiobook Recommendation Engine

Personal audiobook recommendation system with semantic search, weighted scoring, and external discovery

FastAPI
React
LanceDB

Access: https://audiobook-recommender.haiven.local/

Overview

The Audiobook Recommendation Engine is a self-hosted application that provides intelligent audiobook recommendations from your personal Libation library exports. It combines semantic embeddings (BAAI/bge-small-en-v1.5) with a multi-factor weighted scoring algorithm to help you decide what to listen to next.

Key Features

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   React/Vite    │────▶│   FastAPI       │────▶│   LanceDB       │
│   Frontend      │     │   Backend       │     │   (Embedded)    │
│   (Port 80)     │     │   (Port 8000)   │     │                 │
└─────────────────┘     └────────┬────────┘     └─────────────────┘
                                 │
                    ┌────────────┼────────────┐
                    ▼            ▼            ▼
              ┌──────────┐ ┌──────────┐ ┌──────────┐
              │ Hardcover│ │  Open    │ │  Ollama  │
              │   API    │ │ Library  │ │  (Opt.)  │
              └──────────┘ └──────────┘ └──────────┘

Quick Start

Prerequisites

Deploy

cd /mnt/apps/docker/ai/audiobook-recommender

# Configure environment (optional - for Hardcover API)
cp .env.example .env
# Edit .env to add HARDCOVER_API_KEY if desired

# Start services
docker compose up -d

Verify

# Check container status
docker compose ps

# Follow backend startup logs
docker logs -f audiobook-backend

# Test health endpoint
curl https://audiobook-recommender.haiven.local/api/health

# Check library stats
curl https://audiobook-recommender.haiven.local/api/stats

Access the application at: https://audiobook-recommender.haiven.local/

API Endpoints

Books & Library

Endpoint Method Description
/api/books GET List books with pagination and filters
/api/books/{id} GET Get single book details
/api/import POST Import Libation Excel export
/api/stats GET Library statistics

Recommendations

Endpoint Method Description
/api/recommend/backlog GET Weighted recommendations from unfinished books
/api/recommend/similar/{id} GET Find similar books via vector search
/api/recommend/chat POST RAG-based conversational recommendations (optional)
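The list endpoints accept query parameters for pagination and filtering. A minimal Python sketch of a URL-building helper for this API (the parameter names `page` and `page_size` are illustrative assumptions, not confirmed endpoint parameters):

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://audiobook-recommender.haiven.local/"

def build_url(path: str, **params) -> str:
    """Build a request URL for the recommender API; query params are optional."""
    url = urljoin(BASE_URL, path)
    if params:
        url += "?" + urlencode(params)
    return url

# Hypothetical pagination parameters -- check the backend for the real names.
print(build_url("/api/books", page=1, page_size=20))
print(build_url("/api/recommend/similar/42"))
```

Pair this with any HTTP client (curl, urllib.request, httpx) to fetch results.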

Discovery

Endpoint Method Description
/api/discover/{id} GET Discover related books from external APIs
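Discovery talks to Hardcover's GraphQL API. A rough sketch of how such a request payload might be assembled (the endpoint URL and query shape are assumptions for illustration; the actual implementation lives in hardcover_client.py):

```python
import json

# Assumed endpoint; the real value is configured in hardcover_client.py.
HARDCOVER_GRAPHQL_URL = "https://api.hardcover.app/v1/graphql"

def build_discovery_payload(title: str) -> bytes:
    """Assemble a JSON-encoded GraphQL payload searching books by title.

    The query below is an illustrative sketch, not Hardcover's actual schema.
    """
    query = """
    query BooksByTitle($title: String!) {
      books(where: {title: {_eq: $title}}) { title }
    }
    """
    return json.dumps({"query": query, "variables": {"title": title}}).encode()

payload = build_discovery_payload("Project Hail Mary")
print(json.loads(payload)["variables"]["title"])  # Project Hail Mary
```

The payload would be POSTed to the GraphQL endpoint with the HARDCOVER_API_KEY in an authorization header.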

Health & Metrics

Endpoint Method Description
/health GET Health check
/metrics GET Prometheus metrics

Recommendation Algorithm

The recommendation engine uses a weighted scoring system combining multiple factors:

Factor Weight Description
Content Match 30% Category and description similarity
Series Priority 25% Unfinished series completion
Rating Score 20% Community and personal ratings
Embedding Similarity 25% Semantic similarity via BGE vectors
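The four weights sum to 1.00, so the combined score stays on the same scale as the individual factors. A minimal sketch of how the factors might combine (the field names and the assumption that each factor is normalized to 0.0-1.0 are illustrative; the real logic lives in recommender.py):

```python
from dataclasses import dataclass

# Weights mirror the defaults documented above (they sum to 1.00).
WEIGHTS = {"content": 0.30, "series": 0.25, "rating": 0.20, "embedding": 0.25}

@dataclass
class FactorScores:
    """Per-book factor scores, each assumed normalized to the 0.0-1.0 range."""
    content: float
    series: float
    rating: float
    embedding: float

def weighted_score(f: FactorScores) -> float:
    """Combine the four factor scores into one recommendation score."""
    return (WEIGHTS["content"] * f.content
            + WEIGHTS["series"] * f.series
            + WEIGHTS["rating"] * f.rating
            + WEIGHTS["embedding"] * f.embedding)

# A book that strongly continues an unfinished series scores high overall:
score = weighted_score(FactorScores(content=0.6, series=1.0, rating=0.8, embedding=0.7))
print(round(score, 3))  # 0.765
```

Because the weights are environment-configurable, tuning WEIGHT_SERIES up biases results toward finishing started series.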

Series Completion Logic
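One plausible approach to series completion is to surface, for every series the listener has started, the lowest-numbered entry not yet finished. The sketch below illustrates that idea; the field names ('series', 'series_index', 'finished') and the exact rules are assumptions, with the actual behavior implemented in recommender.py:

```python
def next_in_series(books: list[dict]) -> list[dict]:
    """Return the next unfinished entry of each series the listener has started.

    Each book dict is assumed to carry 'series', 'series_index', and 'finished'
    keys -- illustrative field names, not the real schema.
    """
    candidates = []
    started = {b["series"] for b in books if b["series"] and b["finished"]}
    for name in started:
        unfinished = [b for b in books
                      if b["series"] == name and not b["finished"]]
        if unfinished:
            candidates.append(min(unfinished, key=lambda b: b["series_index"]))
    return candidates

library = [
    {"title": "Book 1", "series": "Saga", "series_index": 1, "finished": True},
    {"title": "Book 2", "series": "Saga", "series_index": 2, "finished": False},
    {"title": "Book 3", "series": "Saga", "series_index": 3, "finished": False},
    {"title": "Standalone", "series": None, "series_index": 0, "finished": False},
]
print([b["title"] for b in next_in_series(library)])  # ['Book 2']
```

Candidates produced this way would then feed the Series Priority factor (25%) in the weighted score.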

Configuration

Environment Variables

Variable Default Description
LANCEDB_PATH /data/lancedb Vector database storage path
EMBEDDING_MODEL BAAI/bge-small-en-v1.5 Sentence transformer model
EMBEDDING_DIMENSION 384 Vector dimensions
WEIGHT_CONTENT 0.30 Content matching weight
WEIGHT_SERIES 0.25 Series priority weight
WEIGHT_RATING 0.20 Rating score weight
WEIGHT_EMBEDDING 0.25 Embedding similarity weight
HARDCOVER_API_KEY (optional) Hardcover API token for discovery
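The scoring weights should sum to 1.00. A sketch of reading them with plain os.environ lookups and validating the sum (the backend's config.py may well use Pydantic settings instead, so treat this as illustrative):

```python
import os

def load_weights() -> dict[str, float]:
    """Read scoring weights from the environment, falling back to the documented defaults."""
    defaults = {
        "WEIGHT_CONTENT": 0.30,
        "WEIGHT_SERIES": 0.25,
        "WEIGHT_RATING": 0.20,
        "WEIGHT_EMBEDDING": 0.25,
    }
    weights = {k: float(os.environ.get(k, v)) for k, v in defaults.items()}
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"Scoring weights must sum to 1.0, got {total}")
    return weights

print(load_weights()["WEIGHT_CONTENT"])  # 0.3
```

Overriding one weight in .env without adjusting the others would trip the sum check in this sketch.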

Storage Locations

Path Purpose
/mnt/storage/audiobook-recommender/data LanceDB database, imports
/mnt/storage/audiobook-recommender/cache Model cache, API response cache
/mnt/storage/audiobook-recommender/logs Application logs

Project Structure

/mnt/apps/docker/ai/audiobook-recommender/
├── docker-compose.yml          # Container orchestration
├── .env.example                 # Environment template
├── .env                         # Local configuration
├── README.md                    # This file
└── USER_GUIDE.md               # End-user documentation

/mnt/apps/src/audiobook-recommender/
├── backend/
│   ├── main.py                  # FastAPI application
│   ├── config.py                # Configuration management
│   ├── models.py                # Pydantic models
│   ├── database.py              # LanceDB operations
│   ├── embeddings.py            # BGE model wrapper
│   ├── importer.py              # Libation Excel parser
│   ├── recommender.py           # Weighted scoring algorithm
│   ├── hardcover_client.py      # Hardcover GraphQL client
│   ├── openlibrary_client.py    # Open Library REST client
│   ├── Dockerfile               # Backend container
│   └── requirements.txt         # Python dependencies
└── frontend/
    ├── src/
    │   ├── App.tsx              # Main application
    │   ├── components/          # UI components
    │   ├── pages/               # Route pages
    │   └── lib/api.ts           # API client
    ├── Dockerfile               # Frontend container (nginx)
    ├── package.json             # Node dependencies
    └── vite.config.ts           # Build configuration

Docker Compose Services

audiobook-backend

FastAPI backend with LanceDB and BGE embeddings.

audiobook-frontend

React frontend served via nginx with API proxy.

Monitoring

Prometheus Metrics

The backend exposes Prometheus metrics at /metrics, including:

audiobook_books_total: total books in the library
audiobook_books_finished: books marked as finished
audiobook_recommendations_total: recommendation requests served
audiobook_import_duration_seconds: duration of the last import
audiobook_embedding_duration_seconds: time spent generating embeddings

Health Checks

Both containers have health checks configured:

# Backend health
curl http://localhost:8000/health

# Frontend health
curl http://localhost:80/health

Troubleshooting

Import Fails

# Check backend logs
docker logs audiobook-backend

# Verify Excel file format
# Must be Libation export with expected columns

Slow Startup

The backend requires ~120 seconds to load the BGE embedding model on first start. This is normal.

# Watch startup progress
docker logs -f audiobook-backend

Recommendations Not Working

# Check if books are imported
curl https://audiobook-recommender.haiven.local/api/stats

# Verify embeddings are generated
# Stats should show embedding count > 0

Discovery API Errors

# Check Hardcover API key (if configured)
docker exec audiobook-backend env | grep HARDCOVER

# Verify external API connectivity
docker exec audiobook-backend curl -s https://hardcover.app/api/

Memory Issues

If the backend runs out of memory during import:

# Increase memory limit in docker-compose.yml
# Current: 4G, may need 6G for large libraries

# Or import in smaller batches

Maintenance

Clear Database

# Stop containers
docker compose down

# Remove LanceDB data
rm -rf /mnt/storage/audiobook-recommender/data/lancedb

# Restart and re-import
docker compose up -d

Update Images

cd /mnt/apps/docker/ai/audiobook-recommender

# Rebuild from source
docker compose build --no-cache

# Restart
docker compose up -d

View Logs

# All containers
docker compose logs -f

# Backend only
docker logs -f audiobook-backend

# Frontend only
docker logs -f audiobook-frontend

Technology Stack

Infrastructure Integration

Uptime Kuma Monitor Configuration

Add the following monitor at https://status.haiven.local:

Setting Value
Monitor Type HTTP(s)
Friendly Name Audiobook Recommender
URL https://audiobook-recommender.haiven.local/health
Heartbeat Interval 60 seconds
Retries 3
Accepted Status Codes 200-299
Tags ai, recommendations

Alternative Backend Monitor (internal):

Setting Value
Monitor Type HTTP(s)
Friendly Name Audiobook Backend
URL http://audiobook-backend:8000/health
Heartbeat Interval 30 seconds

Grafana Dashboard

Create a Grafana dashboard for monitoring at https://grafana.haiven.local:

Recommended Panels:

  1. Library Overview (Stat)
    - Query: audiobook_books_total
    - Title: "Total Books"

  2. Backlog Size (Stat)
    - Query: audiobook_books_total - audiobook_books_finished
    - Title: "Books to Listen"

  3. Recommendation Requests (Time Series)
    - Query: rate(audiobook_recommendations_total[5m])
    - Title: "Recommendation Requests/min"

  4. Import Performance (Gauge)
    - Query: audiobook_import_duration_seconds
    - Title: "Last Import Duration"

  5. Embedding Generation (Time Series)
    - Query: rate(audiobook_embedding_duration_seconds_sum[5m])
    - Title: "Embedding Time"

  6. Container Resources (from cAdvisor)
    - CPU: rate(container_cpu_usage_seconds_total{name="audiobook-backend"}[5m])
    - Memory: container_memory_usage_bytes{name="audiobook-backend"}

Dashboard JSON Template:

The dashboard can be imported from:
/mnt/apps/docker/infrastructure/grafana/dashboards/audiobook-recommender.json

(Create this file with the Grafana dashboard export after initial setup)

Prometheus Scrape Configuration

Prometheus automatically discovers the service via Docker labels. Verify scraping:

# Check Prometheus targets
curl -s http://prometheus:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job == "audiobook-backend")'

The docker-compose.yml already includes the required labels:

labels:
  - "prometheus.scrape=true"
  - "prometheus.port=8000"
  - "prometheus.path=/metrics"

License

Internal Haiven infrastructure service. Not intended for external distribution.