vLLM always-on Qwen3.5-27B FP8 service on Delta GPU with 128K context, tool calling, and structured output. Thinking disabled — optimized for JSON generation and classification.
Are you sure you want to perform this action?
Status
Message here