Audiobook Recommendation Engine

Personal audiobook recommendation system with semantic search, weighted scoring, and external discovery

FastAPI
React
LanceDB

Access: https://audiobook-recommender.haiven.local/

Overview

The Audiobook Recommendation Engine is a self-hosted application that provides intelligent audiobook recommendations from your personal Libation library exports. It combines semantic embeddings (BAAI/bge-small-en-v1.5) with a multi-factor weighted scoring algorithm to help you decide what to listen to next.

Key Features

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   React/Vite    │────▶│   FastAPI       │────▶│   LanceDB       │
│   Frontend      │     │   Backend       │     │   (Embedded)    │
│   (Port 80)     │     │   (Port 8000)   │     │                 │
└─────────────────┘     └────────┬────────┘     └─────────────────┘
                                 │
                    ┌────────────┼────────────┐
                    ▼            ▼            ▼
              ┌──────────┐ ┌──────────┐ ┌──────────┐
              │ Hardcover│ │  Open    │ │  Ollama  │
              │   API    │ │ Library  │ │  (Opt.)  │
              └──────────┘ └──────────┘ └──────────┘

Quick Start

Prerequisites

Deploy

cd /mnt/apps/docker/ai/audiobook-recommender

# Configure environment (optional - for Hardcover API)
cp .env.example .env
# Edit .env to add HARDCOVER_API_KEY if desired

# Start services
docker compose up -d

Verify

# Check container status
docker compose ps

# Follow backend startup logs
docker logs -f audiobook-backend

# Test health endpoint
curl https://audiobook-recommender.haiven.local/api/health

# Check library stats
curl https://audiobook-recommender.haiven.local/api/stats

Access the application at: https://audiobook-recommender.haiven.local/

API Endpoints

Books & Library

Endpoint Method Description
/api/books GET List books with pagination and filters
/api/books/{id} GET Get single book details
/api/import POST Import Libation Excel export
/api/stats GET Library statistics

Recommendations

Endpoint Method Description
/api/recommend/backlog GET Weighted recommendations from unfinished books
/api/recommend/similar/{id} GET Find similar books via vector search
/api/recommend/chat POST RAG-based conversational recommendations (optional)
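The list endpoints accept query parameters for pagination and filtering. A minimal Python sketch of a URL-building helper for this API (the parameter names `page` and `page_size` are illustrative assumptions, not confirmed endpoint parameters):

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://audiobook-recommender.haiven.local/"

def build_url(path: str, **params) -> str:
    """Build a request URL for the recommender API; query params are optional."""
    url = urljoin(BASE_URL, path)
    if params:
        url += "?" + urlencode(params)
    return url

# Hypothetical pagination parameters -- check the backend for the real names.
print(build_url("/api/books", page=1, page_size=20))
print(build_url("/api/recommend/similar/42"))
```

Pair this with any HTTP client (curl, urllib.request, httpx) to fetch results.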

Discovery

Endpoint Method Description
/api/discover/{id} GET Discover related books from external APIs
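Discovery talks to Hardcover's GraphQL API. A rough sketch of how such a request payload might be assembled (the endpoint URL and query shape are assumptions for illustration; the actual implementation lives in hardcover_client.py):

```python
import json

# Assumed endpoint; the real value is configured in hardcover_client.py.
HARDCOVER_GRAPHQL_URL = "https://api.hardcover.app/v1/graphql"

def build_discovery_payload(title: str) -> bytes:
    """Assemble a JSON-encoded GraphQL payload searching books by title.

    The query below is an illustrative sketch, not Hardcover's actual schema.
    """
    query = """
    query BooksByTitle($title: String!) {
      books(where: {title: {_eq: $title}}) { title }
    }
    """
    return json.dumps({"query": query, "variables": {"title": title}}).encode()

payload = build_discovery_payload("Project Hail Mary")
print(json.loads(payload)["variables"]["title"])  # Project Hail Mary
```

The payload would be POSTed to the GraphQL endpoint with the HARDCOVER_API_KEY in an authorization header.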

Health & Metrics

Endpoint Method Description
/health GET Health check
/metrics GET Prometheus metrics

Recommendation Algorithm

The recommendation engine uses a weighted scoring system combining multiple factors:

Factor Weight Description
Content Match 30% Category and description similarity
Series Priority 25% Unfinished series completion
Rating Score 20% Community and personal ratings
Embedding Similarity 25% Semantic similarity via BGE vectors
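The four weights sum to 1.00, so the combined score stays on the same scale as the individual factors. A minimal sketch of how the factors might combine (the field names and the assumption that each factor is normalized to 0.0-1.0 are illustrative; the real logic lives in recommender.py):

```python
from dataclasses import dataclass

# Weights mirror the defaults documented above (they sum to 1.00).
WEIGHTS = {"content": 0.30, "series": 0.25, "rating": 0.20, "embedding": 0.25}

@dataclass
class FactorScores:
    """Per-book factor scores, each assumed normalized to the 0.0-1.0 range."""
    content: float
    series: float
    rating: float
    embedding: float

def weighted_score(f: FactorScores) -> float:
    """Combine the four factor scores into one recommendation score."""
    return (WEIGHTS["content"] * f.content
            + WEIGHTS["series"] * f.series
            + WEIGHTS["rating"] * f.rating
            + WEIGHTS["embedding"] * f.embedding)

# A book that strongly continues an unfinished series scores high overall:
score = weighted_score(FactorScores(content=0.6, series=1.0, rating=0.8, embedding=0.7))
print(round(score, 3))  # 0.765
```

Because the weights are environment-configurable, tuning WEIGHT_SERIES up biases results toward finishing started series.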

Series Completion Logic
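One plausible approach to series completion is to surface, for every series the listener has started, the lowest-numbered entry not yet finished. The sketch below illustrates that idea; the field names ('series', 'series_index', 'finished') and the exact rules are assumptions, with the actual behavior implemented in recommender.py:

```python
def next_in_series(books: list[dict]) -> list[dict]:
    """Return the next unfinished entry of each series the listener has started.

    Each book dict is assumed to carry 'series', 'series_index', and 'finished'
    keys -- illustrative field names, not the real schema.
    """
    candidates = []
    started = {b["series"] for b in books if b["series"] and b["finished"]}
    for name in started:
        unfinished = [b for b in books
                      if b["series"] == name and not b["finished"]]
        if unfinished:
            candidates.append(min(unfinished, key=lambda b: b["series_index"]))
    return candidates

library = [
    {"title": "Book 1", "series": "Saga", "series_index": 1, "finished": True},
    {"title": "Book 2", "series": "Saga", "series_index": 2, "finished": False},
    {"title": "Book 3", "series": "Saga", "series_index": 3, "finished": False},
    {"title": "Standalone", "series": None, "series_index": 0, "finished": False},
]
print([b["title"] for b in next_in_series(library)])  # ['Book 2']
```

Candidates produced this way would then feed the Series Priority factor (25%) in the weighted score.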

Configuration

Environment Variables

Variable Default Description
LANCEDB_PATH /data/lancedb Vector database storage path
EMBEDDING_MODEL BAAI/bge-small-en-v1.5 Sentence transformer model
EMBEDDING_DIMENSION 384 Vector dimensions
WEIGHT_CONTENT 0.30 Content matching weight
WEIGHT_SERIES 0.25 Series priority weight
WEIGHT_RATING 0.20 Rating score weight
WEIGHT_EMBEDDING 0.25 Embedding similarity weight
HARDCOVER_API_KEY (optional) Hardcover API token for discovery
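The scoring weights should sum to 1.00. A sketch of reading them with plain os.environ lookups and validating the sum (the backend's config.py may well use Pydantic settings instead, so treat this as illustrative):

```python
import os

def load_weights() -> dict[str, float]:
    """Read scoring weights from the environment, falling back to the documented defaults."""
    defaults = {
        "WEIGHT_CONTENT": 0.30,
        "WEIGHT_SERIES": 0.25,
        "WEIGHT_RATING": 0.20,
        "WEIGHT_EMBEDDING": 0.25,
    }
    weights = {k: float(os.environ.get(k, v)) for k, v in defaults.items()}
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"Scoring weights must sum to 1.0, got {total}")
    return weights

print(load_weights()["WEIGHT_CONTENT"])  # 0.3
```

Overriding one weight in .env without adjusting the others would trip the sum check in this sketch.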

Storage Locations

Path Purpose
/mnt/storage/audiobook-recommender/data LanceDB database, imports
/mnt/storage/audiobook-recommender/cache Model cache, API response cache
/mnt/storage/audiobook-recommender/logs Application logs

Project Structure

/mnt/apps/docker/ai/audiobook-recommender/
├── docker-compose.yml          # Container orchestration
├── .env.example                 # Environment template
├── .env                         # Local configuration
├── README.md                    # This file
└── USER_GUIDE.md               # End-user documentation

/mnt/apps/src/audiobook-recommender/
├── backend/
│   ├── main.py                  # FastAPI application
│   ├── config.py                # Configuration management
│   ├── models.py                # Pydantic models
│   ├── database.py              # LanceDB operations
│   ├── embeddings.py            # BGE model wrapper
│   ├── importer.py              # Libation Excel parser
│   ├── recommender.py           # Weighted scoring algorithm
│   ├── hardcover_client.py      # Hardcover GraphQL client
│   ├── openlibrary_client.py    # Open Library REST client
│   ├── Dockerfile               # Backend container
│   └── requirements.txt         # Python dependencies
└── frontend/
    ├── src/
    │   ├── App.tsx              # Main application
    │   ├── components/          # UI components
    │   ├── pages/               # Route pages
    │   └── lib/api.ts           # API client
    ├── Dockerfile               # Frontend container (nginx)
    ├── package.json             # Node dependencies
    └── vite.config.ts           # Build configuration

Docker Compose Services

audiobook-backend

FastAPI backend with LanceDB and BGE embeddings.

audiobook-frontend

React frontend served via nginx with API proxy.

Monitoring

Prometheus Metrics

The backend exposes Prometheus metrics at /metrics, including:

audiobook_books_total: total books in the library
audiobook_books_finished: books marked as finished
audiobook_recommendations_total: recommendation requests served
audiobook_import_duration_seconds: duration of the last import
audiobook_embedding_duration_seconds: time spent generating embeddings

Health Checks

Both containers have health checks configured:

# Backend health
curl http://localhost:8000/health

# Frontend health
curl http://localhost:80/health

Troubleshooting

Import Fails

# Check backend logs
docker logs audiobook-backend

# Verify Excel file format
# Must be Libation export with expected columns

Slow Startup

The backend requires ~120 seconds to load the BGE embedding model on first start. This is normal.

# Watch startup progress
docker logs -f audiobook-backend

Recommendations Not Working

# Check if books are imported
curl https://audiobook-recommender.haiven.local/api/stats

# Verify embeddings are generated
# Stats should show embedding count > 0

Discovery API Errors

# Check Hardcover API key (if configured)
docker exec audiobook-backend env | grep HARDCOVER

# Verify external API connectivity
docker exec audiobook-backend curl -s https://hardcover.app/api/

Memory Issues

If the backend runs out of memory during import:

# Increase memory limit in docker-compose.yml
# Current: 4G, may need 6G for large libraries

# Or import in smaller batches

Maintenance

Clear Database

# Stop containers
docker compose down

# Remove LanceDB data
rm -rf /mnt/storage/audiobook-recommender/data/lancedb

# Restart and re-import
docker compose up -d

Update Images

cd /mnt/apps/docker/ai/audiobook-recommender

# Rebuild from source
docker compose build --no-cache

# Restart
docker compose up -d

View Logs

# All containers
docker compose logs -f

# Backend only
docker logs -f audiobook-backend

# Frontend only
docker logs -f audiobook-frontend

Technology Stack

Infrastructure Integration

Uptime Kuma Monitor Configuration

Add the following monitor at https://status.haiven.local:

Setting Value
Monitor Type HTTP(s)
Friendly Name Audiobook Recommender
URL https://audiobook-recommender.haiven.local/health
Heartbeat Interval 60 seconds
Retries 3
Accepted Status Codes 200-299
Tags ai, recommendations

Alternative Backend Monitor (internal):

Setting Value
Monitor Type HTTP(s)
Friendly Name Audiobook Backend
URL http://audiobook-backend:8000/health
Heartbeat Interval 30 seconds

Grafana Dashboard

Create a Grafana dashboard for monitoring at https://grafana.haiven.local:

Recommended Panels:

  1. Library Overview (Stat)
    - Query: audiobook_books_total
    - Title: "Total Books"

  2. Backlog Size (Stat)
    - Query: audiobook_books_total - audiobook_books_finished
    - Title: "Books to Listen"

  3. Recommendation Requests (Time Series)
    - Query: rate(audiobook_recommendations_total[5m])
    - Title: "Recommendation Requests/min"

  4. Import Performance (Gauge)
    - Query: audiobook_import_duration_seconds
    - Title: "Last Import Duration"

  5. Embedding Generation (Time Series)
    - Query: rate(audiobook_embedding_duration_seconds_sum[5m])
    - Title: "Embedding Time"

  6. Container Resources (from cAdvisor)
    - CPU: rate(container_cpu_usage_seconds_total{name="audiobook-backend"}[5m])
    - Memory: container_memory_usage_bytes{name="audiobook-backend"}

Dashboard JSON Template:

The dashboard can be imported from:
/mnt/apps/docker/infrastructure/grafana/dashboards/audiobook-recommender.json

(Create this file with the Grafana dashboard export after initial setup)

Prometheus Scrape Configuration

Prometheus automatically discovers the service via Docker labels. Verify scraping:

# Check Prometheus targets
curl -s http://prometheus:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job == "audiobook-backend")'

The docker-compose.yml already includes the required labels:

labels:
  - "prometheus.scrape=true"
  - "prometheus.port=8000"
  - "prometheus.path=/metrics"

License

Internal Haiven infrastructure service. Not intended for external distribution.