← Back to API Documentation Home

vllm-qwen35-27b

vLLM always-on Qwen3.5-27B FP8 service on Delta GPU with 128K context, tool calling, and structured output. Thinking disabled — optimized for JSON generation and classification.