xc-llm-ascend

Files

Yaphets24 8977be1df3 [Bugfix]Fix deepseek 3.2 C8 precision by rotary tensor (#7537)

### What this PR does / why we need it?
During attention quantization for DeepSeek V3.2, the Hadamard matrix
must be retrieved from the weights so that it can be used in the
computation.
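
As a rough illustration of why a Hadamard rotation is useful before quantization, the sketch below rotates a query and a key by the same orthonormal Hadamard matrix. It is not the code from this PR: the key name `rotary_matrix`, the helper functions, and the toy vectors are all assumptions made for the example, and the matrix is built locally only to stand in for a tensor loaded from a checkpoint.

```python
import math


def hadamard(n):
    """Orthonormal n x n Hadamard matrix via the Sylvester construction."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Each step doubles the size: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[x * scale for x in row] for row in h]


def rotate(vec, h):
    """Row vector times matrix: vec @ h."""
    return [sum(v * h[i][j] for i, v in enumerate(vec)) for j in range(len(h))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical checkpoint layout: the rotation is stored next to the weights
# and looked up at load time instead of being rebuilt at run time.
weights = {"rotary_matrix": hadamard(4)}
h = weights["rotary_matrix"]

q = [8.0, 0.1, -0.2, 0.05]  # one large outlier channel
k = [0.5, -1.0, 0.25, 2.0]

q_rot, k_rot = rotate(q, h), rotate(k, h)

# Orthogonality keeps the attention score q . k unchanged ...
assert math.isclose(dot(q_rot, k_rot), dot(q, k), abs_tol=1e-9)
# ... while the outlier is spread across channels, so a quantizer
# has a smaller value range to cover.
assert max(abs(x) for x in q_rot) < max(abs(x) for x in q)
```

Both sides must use the same matrix for the scores to match, which is one plausible reason to ship the matrix with the quantized weights.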

### Does this PR introduce _any_ user-facing change?
No, but the quantized weights will contain two new tensors.

### How was this patch tested?

- vLLM version: v0.18.0
- vLLM main: 8b6325758c

---------

Signed-off-by: mayumeng <m30059191@china.huawei.com>
Co-authored-by: mayumeng <m30059191@china.huawei.com>

2026-03-25 09:18:00 +08:00

__init__.py

[Feature]Supports DSv3.1 PD separation and C8 quantization (#7222)

2026-03-16 22:49:05 +08:00

base.py

[refactor] replace scattered business kwargs with typed request objects and explicit stage boundaries (#7024)

2026-03-20 23:23:57 +08:00

kv_c8.py

[Bugfix]Fix deepseek 3.2 C8 precision by rotary tensor (#7537)

2026-03-25 09:18:00 +08:00

registry.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7) (#6023)