xc-llm-ascend

Files

Yaphets24 13c7392416 [BugFix] fix dsv3.1 service failed to start (#8207)

### What this PR does / why we need it?

This PR fixes a service startup failure for DeepSeek-V3.1 models by
removing a strict type assertion for `MLAAttentionSpec` in
`NPUModelRunner.get_kv_cache_spec`. The assertion was failing due to
class identity mismatches caused by the runtime patching of
`MLAAttentionSpec` with `AscendMLAAttentionSpec`.
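
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual vLLM or vllm-ascend code: the `kv_cache_interface` namespace, the `block_size` field, and the subclass relationship between the two classes are all assumptions made for illustration. It shows one way a strict `isinstance` assertion can break once a class name is rebound at runtime while other code still holds the original class object.

```python
import types

# Hypothetical stand-in for the module that owns the spec class.
kv_cache_interface = types.SimpleNamespace()


class MLAAttentionSpec:
    """Hypothetical original spec class (fields are illustrative only)."""

    def __init__(self, block_size: int) -> None:
        self.block_size = block_size


kv_cache_interface.MLAAttentionSpec = MLAAttentionSpec

# Code that imported the class before the patch keeps the original object.
early_bound_cls = kv_cache_interface.MLAAttentionSpec


class AscendMLAAttentionSpec(MLAAttentionSpec):
    """Hypothetical platform-specific replacement."""


# Runtime patch: the module-level name now points at a different class.
kv_cache_interface.MLAAttentionSpec = AscendMLAAttentionSpec

# A spec built through the early-bound reference is an instance of the
# original class, not of the class the patched name resolves to.
spec = early_bound_cls(block_size=128)

# This is the kind of strict check that stops holding after the patch.
strict_check_passes = isinstance(spec, kv_cache_interface.MLAAttentionSpec)

# The object is still a valid instance of the class it was built from.
original_check_passes = isinstance(spec, early_bound_cls)

print(strict_check_passes, original_check_passes)
```

In this sketch the object itself is perfectly usable; only the identity of the class behind the patched name has changed, which is why dropping the strict assertion is enough to avoid the startup failure.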

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Verified that the service starts correctly for DSV3.1 models.

Signed-off-by: mayumeng <m30059191@china.huawei.com>
Co-authored-by: mayumeng <m30059191@china.huawei.com>

2026-04-14 17:52:55 +08:00

[v0.18.0][CI] Fix releases/v0.18.0 ci test only support vllm v0.18.0 (#7686)

2026-03-26 18:36:04 +08:00

__init__.py

[Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878)

2025-07-19 09:42:32 +08:00

block_table.py

[Hybrid] support prefix cache for Qwen3.5/Next with --mamba-cache-mode align (#7103)

2026-03-15 09:44:09 +08:00

model_runner_v1.py

[BugFix] fix dsv3.1 service failed to start (#8207)

2026-04-14 17:52:55 +08:00