[Doc] Support kimi-k2-w8a8 (#2162)

### What this PR does / why we need it?
The kimi-k2 model is largely similar to the deepseek model, so only a few changes are needed to support it. This PR:
1. Adds a kimi-k2-w8a8 deployment doc (a hedged serving sketch follows this list)
2. Updates the quantization doc
3. Updates the torchair support list
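For readers who want a concrete starting point, here is a minimal offline-inference sketch for a w8a8-quantized Kimi-K2 checkpoint. The model path, the `quantization="ascend"` value, and the parallelism setting are illustrative assumptions, not values taken from this PR or the new doc.

```python
# Hypothetical offline-inference sketch for a w8a8-quantized Kimi-K2 checkpoint.
# The model path, quantization value, and parallelism settings below are
# illustrative assumptions, not values taken from this PR.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/path/to/Kimi-K2-Instruct-W8A8",  # hypothetical local checkpoint path
    quantization="ascend",                   # assumed Ascend w8a8 quantization backend
    tensor_parallel_size=16,                 # assumed; size this to your NPU topology
    trust_remote_code=True,
)

outputs = llm.generate(
    ["Introduce the Kimi-K2 model in one sentence."],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```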
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.10.0
- vLLM main: 9edd1db02b

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Author: Li Wang (committed by GitHub)
Date: 2025-08-06 19:28:47 +08:00
Commit: bf84f2dbfa (parent: 875a86cbe9)
8 changed files with 194 additions and 40 deletions


@@ -17,7 +17,7 @@ from typing import Optional
 from vllm.logger import logger
-TORCHAIR_MODEL_LIST = ["deepseek", "pangu"]
+TORCHAIR_MODEL_LIST = ["deepseek", "pangu", "kimi_k2"]
 def _check_torchair_supported(model_type: str):
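For context, here is a minimal sketch of how `_check_torchair_supported` might consume the updated list. It assumes a simple substring match against the model type and is not the verbatim repository code.

```python
# Minimal sketch (assumption: substring matching; not the verbatim repository code)
# of how the torchair support check could consume the updated list.
TORCHAIR_MODEL_LIST = ["deepseek", "pangu", "kimi_k2"]

def _check_torchair_supported(model_type: str) -> bool:
    # A model is eligible for the torchair graph mode when its type name
    # matches one of the supported model families.
    return any(family in model_type for family in TORCHAIR_MODEL_LIST)

assert _check_torchair_supported("kimi_k2")    # newly supported by this PR
assert not _check_torchair_supported("llama")  # still unsupported
```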