Files
xc-llm-ascend/docs/source/user_guide/feature_guide
Wang Kunpeng e3a2443c3a [main][Doc] add mla pertoken quantization FAQ (#2018)
### What this PR does / why we need it?
When using deepseek series models generated by the --dynamic parameter,
if torchair graph mode is enabled, we should modify the configuration
file in the CANN package to prevent incorrect inference results.

- vLLM version: v0.10.0
- vLLM main:
7728dd77bb

---------

Signed-off-by: Wang Kunpeng <1289706727@qq.com>
2025-07-27 08:47:51 +08:00
..
2025-07-10 14:26:59 +08:00
2025-07-15 11:52:38 +08:00