Video Downloader (yt-dlp-web-ui)

Production-grade video downloading service using yt-dlp-web-ui with VPN geo-bypass, category page scraping, JavaScript runtime support, and comprehensive fallback options.

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         Gluetun VPN Container                               │
│                       (ExpressVPN → Canada)                                 │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                   Shared Network Namespace                           │   │
│  │  ┌─────────────────────────┐    ┌─────────────────────────┐          │   │
│  │  │   video-downloader      │    │   category-scraper      │          │   │
│  │  │   (yt-dlp-webui)        │◄───│   (Playwright)          │          │   │
│  │  │   Port 3033             │    │   Port 3034             │          │   │
│  │  └─────────────────────────┘    └─────────────────────────┘          │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│  Exposed Ports: 3033 (downloader), 3034 (scraper)                           │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                  ┌───────────────────┴───────────────────┐
                  ▼                                       ▼
┌─────────────────────────────────┐    ┌─────────────────────────────────┐
│  Prometheus Exporter (9105)     │    │  Downloads → NAS Storage        │
│  video-downloader-exporter      │    │  /mnt/nas1/media/videos/ytdlp   │
└─────────────────────────────────┘    └─────────────────────────────────┘

Access

| Service | URL | Purpose |
|---|---|---|
| Downloader UI | https://downloader.haiven.local | Download videos |
| Category Scraper | https://scraper.haiven.local | Extract category/model pages |
| External Access | https://downloader.haiven.site | Remote access (auth required) |
| OpenAPI Docs | https://downloader.haiven.local/openapi | API documentation |
| Metrics | http://localhost:9105/metrics | Prometheus scraping |

Quick Start

cd /mnt/apps/docker/utils/video-downloader
cp .env.example .env
# Edit .env with your settings (optional)
docker compose up -d

Storage

| Path | Purpose |
|---|---|
| /mnt/nas1/media/videos/ytdlp | Downloaded videos |
| /mnt/nas1/media/audio/ytdlp | Extracted audio files |
| ./data | Queue database |
| ./cookies | Browser cookies for authenticated sites |
| ./config | Custom configuration |
| ./logs | Application logs |

Features

Default Arguments

All downloads automatically include these arguments (configured via the YTDLP_ARGS environment variable):

--no-check-certificate       # Skip TLS certificate verification (compatibility over security)
--retries 10                 # Retry failed downloads 10 times
--fragment-retries 10        # Retry failed fragments 10 times
--retry-sleep 5              # Wait 5 seconds between retries
--file-access-retries 10     # Retry file access 10 times
--extractor-retries 5        # Retry known extractor errors 5 times
--embed-metadata             # Embed video metadata (title, artist, etc.)
--embed-thumbnail            # Embed thumbnail as cover art
--restrict-filenames         # Use ASCII-only safe filenames
--geo-bypass                 # Bypass geographic restrictions
--force-generic-extractor    # Prefer the generic extractor (alias for --use-extractors generic,default)
--js-runtimes node           # Enable JavaScript decryption (Node.js)
--yes-playlist               # Download full playlists/categories

These defaults favor compatibility and resilience to transient failures over speed and strict TLS checking.
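As a rough illustration of how the flags above could be collapsed into a single value for YTDLP_ARGS, the sketch below joins them with spaces. The helper name default_ytdlp_args is hypothetical, and how the compose file actually consumes the variable is an assumption.

```shell
# Hypothetical helper: print the default flags as one space-separated string.
default_ytdlp_args() {
  # "set --" replaces this function's positional parameters only.
  set -- \
    --no-check-certificate \
    --retries 10 \
    --fragment-retries 10 \
    --retry-sleep 5 \
    --file-access-retries 10 \
    --extractor-retries 5 \
    --embed-metadata \
    --embed-thumbnail \
    --restrict-filenames \
    --geo-bypass \
    --force-generic-extractor \
    --js-runtimes node \
    --yes-playlist
  # "$*" joins the parameters with a single space (default IFS).
  printf '%s\n' "$*"
}
```

For example, `echo "YTDLP_ARGS=$(default_ytdlp_args)"` prints a line that could be pasted into .env, if the deployment reads the variable that way.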

VPN Geo-Bypass (Gluetun)

All traffic routes through ExpressVPN (Canada) to bypass geographic restrictions:

Bypassed restrictions:
- PornHub (blocked in Utah, Virginia, Texas, Montana, North Carolina, etc.)
- Region-locked content
- Rate-limited sites

VPN Health Check:

# Check VPN IP
docker exec downloader-vpn curl -s https://ipinfo.io/ip

# Check VPN status
docker logs downloader-vpn | grep -i "vpn is ready"
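
The IP check above is only meaningful when compared against the host's own public IP. A minimal sketch of that comparison follows; the function name vpn_is_masking is hypothetical, and the wiring in the comments assumes the host itself can reach ipinfo.io.

```shell
# Return 0 when both IPs are non-empty and the VPN exit IP differs
# from the host's direct public IP.
vpn_is_masking() {
  host_ip=$1
  vpn_ip=$2
  [ -n "$host_ip" ] && [ -n "$vpn_ip" ] && [ "$host_ip" != "$vpn_ip" ]
}

# Example wiring (needs network access and the running VPN container):
#   host_ip=$(curl -s https://ipinfo.io/ip)
#   vpn_ip=$(docker exec downloader-vpn curl -s https://ipinfo.io/ip)
#   vpn_is_masking "$host_ip" "$vpn_ip" && echo "VPN active" || echo "VPN down or leaking"
```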

Category Scraper

For pages that yt-dlp doesn't recognize as playlists (category pages, model pages, search results), use the Category Scraper:

Access: https://scraper.haiven.local

How it works:
1. Playwright renders JavaScript-heavy pages in headless Chromium
2. Auto-scrolls to load infinite scroll content
3. Extracts all video URLs using regex patterns
4. Queues each video to the downloader via API
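
Step 3 can be approximated in shell once the rendered HTML is available. This is only a sketch: the viewkey pattern is borrowed from the "Single video" URL shape in this document, the base URL is a placeholder, and neither is confirmed to match the scraper's real patterns.

```shell
# Read rendered HTML on stdin and print unique absolute video URLs.
# $1 is the site base URL (placeholder default; an assumption).
extract_video_urls() {
  base=${1:-https://example.com}
  grep -oE '/view_video\.php\?viewkey=[A-Za-z0-9_]+' \
    | sort -u \
    | sed "s|^|$base|"
}
```

Deduplicating with sort -u matters here because listing pages often link the same video from both a thumbnail and a title.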

Supported page types:

| Type | Example | Status |
|---|---|---|
| Category page | /categories/ebony | ✅ Scraper |
| Model page | /model/perfectdime | ✅ Scraper |
| Search results | /video/search?q=xxx | ✅ Scraper |
| User playlist | /playlist/123 | ✅ Downloader direct |
| Single video | /view_video.php?viewkey=xxx | ✅ Downloader direct |

API Usage:

# Preview URLs (don't download)
curl -X POST http://localhost:3034/preview \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.pornhub.com/categories/ebony"}'

# Scrape and queue all
curl -X POST http://localhost:3034/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.pornhub.com/model/perfectdime"}'

Fallback Chain

When downloading a video, yt-dlp follows this fallback chain:

  1. Site-specific extractor (YouTube, Vimeo, PornHub, etc.)
  2. Extractor retry (up to 5 attempts via --extractor-retries)
  3. Generic extractor (via --force-generic-extractor)
    - HTML5 <video> tags
    - HLS/m3u8 streams
    - DASH manifests
    - Direct .mp4/.webm links
  4. JavaScript decryption (via --js-runtimes node)

API Usage

Download a Video

curl -X POST "https://downloader.haiven.local/api/v1/exec" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "path": "/downloads",
    "rename": "",
    "args": ""
  }'

Extract Audio Only

curl -X POST "https://downloader.haiven.local/api/v1/exec" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "path": "/audio",
    "rename": "",
    "args": "-x --audio-format mp3 --audio-quality 0"
  }'
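
Hand-writing the JSON body gets error-prone once args contains quotes (for example a format selector). The sketch below builds the same body as the requests above with basic escaping; json_escape and build_exec_payload are hypothetical helper names, and only backslashes and double quotes are escaped.

```shell
# Escape backslashes and double quotes for use inside a JSON string.
# Newlines and other control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for POST /api/v1/exec.
# Arguments: url, container path, extra yt-dlp args (optional).
build_exec_payload() {
  printf '{"url":"%s","path":"%s","rename":"","args":"%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "${3:-}")"
}

# Example wiring:
#   build_exec_payload "https://www.youtube.com/watch?v=VIDEO_ID" /downloads '-f "bv*+ba/best"' \
#     | curl -X POST "https://downloader.haiven.local/api/v1/exec" \
#         -H "Content-Type: application/json" -d @-
```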

Check Running Downloads

curl https://downloader.haiven.local/api/v1/running | jq

Cookie Authentication

For authenticated sites (age-restricted or private content):

  1. Export cookies from your browser using the "Get cookies.txt LOCALLY" extension
  2. Save them to ./cookies/youtube.txt, ./cookies/vimeo.txt, etc.
  3. Reference the file in the download arguments: --cookies /cookies/youtube.txt

See USER_GUIDE.md - Cookie Authentication for detailed instructions.
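
A common failure with exported cookies is a file that is not in Netscape format (for example a JSON export). The check below is a small sketch that only inspects the header line; the function name is hypothetical, and it does not validate the cookie rows themselves.

```shell
# Return 0 if the file is readable and starts with a Netscape-style
# cookie header, the format yt-dlp expects for --cookies.
is_netscape_cookie_file() {
  [ -r "$1" ] || return 1
  # Strip a trailing carriage return in case of CRLF line endings.
  first_line=$(head -n 1 "$1" | tr -d '\r')
  case $first_line in
    '# Netscape HTTP Cookie File'*|'# HTTP Cookie File'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: is_netscape_cookie_file ./cookies/youtube.txt || echo "re-export cookies"
```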

Preset Scripts

Status: Planned (see ROADMAP.md Priority 2). Scripts directory not yet implemented.

Once created, the /scripts/ directory will contain convenient wrappers for common download scenarios:

| Script | Purpose | Usage |
|---|---|---|
| download-best.sh | Maximum quality download | ./scripts/download-best.sh "URL" |
| download-audio-mp3.sh | Extract audio to MP3 | ./scripts/download-audio-mp3.sh "URL" |
| download-audio-flac.sh | Extract audio to FLAC (lossless) | ./scripts/download-audio-flac.sh "URL" |
| download-with-subs.sh | Video with embedded subtitles | ./scripts/download-with-subs.sh "URL" |
| archive-channel.sh | Download entire channel with tracking | ./scripts/archive-channel.sh "CHANNEL_URL" |
| download-playlist.sh | Download entire playlist | ./scripts/download-playlist.sh "PLAYLIST_URL" |

Current workaround: Use the API directly with custom arguments. See USER_GUIDE.md - Using Preset Scripts for API equivalents.

Scheduled Downloads / Subscriptions

Status: Planned (see ROADMAP.md Priority 5). Subscription system not yet implemented.

The planned subscription system will enable:

Current workaround: Manually trigger channel downloads with archive tracking:

curl -X POST "https://downloader.haiven.local/api/v1/exec" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/@channelname",
    "path": "/downloads",
    "args": "--download-archive /downloads/archive.txt --playlist-end 5"
  }'

Re-run this command periodically to fetch new videos. The archive file prevents re-downloads.
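
The archive file written by --download-archive holds one "extractor id" pair per line (for example "youtube VIDEO_ID"), which is how repeated runs skip known videos. The sketch below checks for an entry by hand; the function name in_archive is hypothetical, and the path in the example is the host-side location, which is an assumption based on the storage mapping in this document.

```shell
# Return 0 if "<extractor> <id>" is already recorded in the archive file.
in_archive() {
  archive=$1
  extractor=$2
  id=$3
  # -x matches the whole line, -F treats the entry as a fixed string.
  [ -f "$archive" ] && grep -qxF "$extractor $id" "$archive"
}

# Example:
#   in_archive /mnt/nas1/media/videos/ytdlp/archive.txt youtube VIDEO_ID \
#     && echo "already downloaded"
```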

Monitoring

Prometheus Metrics

The service exposes metrics at https://downloader.haiven.local/metrics (port 3033, path /metrics).

Metrics available:
- HTTP request rates and latencies
- Download queue depth
- Active downloads count
- Success/failure rates
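
For a quick spot check without Prometheus, a single value can be pulled out of the text exposition output. This is a sketch: metric_value is a hypothetical helper, it only handles metrics without labels, and the metric name in the example is a placeholder since the exact names are not listed here.

```shell
# Print the value of an unlabelled metric from Prometheus text format on stdin.
# Comment lines (# HELP, # TYPE) never match because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example (metric name is a placeholder):
#   curl -s https://downloader.haiven.local/metrics | metric_value some_metric_name
```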

Prometheus scraping configuration:

# Configured via docker-compose.yml labels
labels:
  - "prometheus.scrape=true"
  - "prometheus.port=3033"
  - "prometheus.path=/metrics"

Grafana Dashboard

Status: Planned (see ROADMAP.md Priority 6).

A dedicated Grafana dashboard will display:
- Download count over time
- Average download duration
- Storage usage trends
- Error rate tracking
- Queue depth visualization

Current monitoring:
- Manual API checks: curl https://downloader.haiven.local/api/v1/active
- Container logs: docker logs -f video-downloader
- Uptime Kuma: Service availability at https://status.haiven.local

Health Check

# Docker health check (automated)
docker inspect video-downloader | jq '.[0].State.Health'

# Manual health verification
curl -I https://downloader.haiven.local
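
After a restart or update it can help to block until the container reports healthy. The loop below is a minimal sketch around docker inspect; wait_healthy is a hypothetical name, the defaults are arbitrary, and it assumes the container defines a health check.

```shell
# Poll the container's Docker health status until it is "healthy".
# Arguments: container name, max attempts (default 30), delay in seconds (default 2).
wait_healthy() {
  name=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    [ "$status" = "healthy" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: docker compose up -d && wait_healthy video-downloader || docker logs video-downloader
```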

Maintenance

View Logs

docker logs -f video-downloader

Restart Service

docker compose restart

Update Service

docker compose pull
docker compose up -d

Update yt-dlp (if needed between image updates)

docker exec video-downloader pip install -U yt-dlp

Supported Sites

yt-dlp supports 1000+ sites including:
- YouTube, YouTube Music
- Vimeo, Dailymotion
- Twitter/X, TikTok, Instagram
- Twitch, Rumble
- Reddit, Facebook
- And many more: https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md

Quick Reference

Common Arguments

| Use Case | Arguments |
|---|---|
| Best quality video | -f "bv*+ba/best" --merge-output-format mp4 |
| 1080p maximum | -f "bv*[height<=1080]+ba" |
| Audio MP3 | -x --audio-format mp3 --audio-quality 0 |
| Audio FLAC | -x --audio-format flac |
| With subtitles | --write-subs --embed-subs --sub-lang en |
| Use cookies | --cookies /cookies/youtube.txt |
| Archive tracking | --download-archive /downloads/archive.txt |
| Playlist limit | --playlist-end 10 |

API Endpoints

| Endpoint | Method | Purpose |
|---|---|---|
| /api/v1/exec | POST | Start download |
| /api/v1/running | GET | List active downloads |
| /api/v1/completed | GET | List completed downloads |
| /api/v1/active | GET | Get queue status |

Storage Paths

| Type | Container Path | Host Path |
|---|---|---|
| Videos | /downloads | /mnt/nas1/media/videos/ytdlp |
| Audio | /audio | /mnt/nas1/media/audio/ytdlp |
| Cookies | /cookies | ./cookies (read-only) |
| Config | /config | ./config |
| Logs | /app/logs | ./logs |


Generated by haiven-service-onboarding plugin | Last updated: 2025-12-06