Files
enginex-mthreads-vllm/docs/deployment/integrations/kaito.md
2026-01-19 10:38:50 +08:00

395 B

KAITO

KAITO is a Kubernetes operator that supports deploying and serving LLMs with vLLM. It offers managing large models via container images with built-in OpenAI-compatible inference, auto-provisioning GPU nodes and curated model presets.

Please refer to quick start for more details.