← Back to API Documentation Home

vllm-glm-flash

vLLM always-on GLM-4.7-Flash AWQ service with 200K context for fast mechanical and code-adjacent tasks.