xc-llm-ascend

zzzzwwjj 4df8df5b94 [bugfix] fix deepseek rope sincoscache re-generation (#2744)

### What this PR does / why we need it?
The current implementation regenerates `sin_cos_cache` in rope whenever
`kv_seqlen` > 4k, because `sin_cos_cache` is initialized with a length of
only 4k.
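
A minimal sketch of the failure mode described above, in plain Python rather than the repository's actual code. The names `build_sin_cos_cache`, `RopeCache`, `init_len`, and the `builds` counter are hypothetical and exist only for illustration, and the rebuild-on-overflow logic is an assumption about how the regeneration could arise. It compares a cache initialized at 4k with one initialized at a larger, assumed maximum length.

```python
import math


def build_sin_cos_cache(seq_len, rotary_dim, base=10000.0):
    """Return one (cos_row, sin_row) pair per position in [0, seq_len)."""
    inv_freq = [base ** (-2.0 * i / rotary_dim) for i in range(rotary_dim // 2)]
    cache = []
    for pos in range(seq_len):
        angles = [pos * f for f in inv_freq]
        cache.append(([math.cos(a) for a in angles],
                      [math.sin(a) for a in angles]))
    return cache


class RopeCache:
    """Hypothetical cache holder; counts how often the table is generated."""

    def __init__(self, rotary_dim, init_len):
        self.rotary_dim = rotary_dim
        self.builds = 0
        self._rebuild(init_len)

    def _rebuild(self, seq_len):
        self.cache = build_sin_cos_cache(seq_len, self.rotary_dim)
        self.builds += 1

    def forward(self, kv_seqlen):
        # Assumed pre-fix behaviour: regenerate when the request outgrows the cache.
        if kv_seqlen > len(self.cache):
            self._rebuild(kv_seqlen)
        return self.cache[:kv_seqlen]


# Initialized at 4k: every longer request triggers another generation.
short = RopeCache(rotary_dim=8, init_len=4096)
short.forward(5000)
short.forward(6000)

# Initialized at an assumed maximum length: forward never regenerates.
full = RopeCache(rotary_dim=8, init_len=8192)
full.forward(5000)
full.forward(6000)
```

In this sketch `short.builds` ends at 3 while `full.builds` stays at 1, which is the effect the fix aims for: the cache is sized once up front and the forward pass only slices it.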

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
After this PR is merged, `sin_cos_cache` no longer grows in the forward
function, so `test_native_rope_deepseek_forward_cache_handling` is no longer
necessary.

- vLLM version: v0.10.1.1
- vLLM main: 60f0843ef8

Signed-off-by: zzzzwwjj <1183291235@qq.com>

2025-09-08 22:03:34 +08:00

e2e

[main] [refactor] refactor common_fused_moe.py (#2706)

2025-09-08 20:09:50 +08:00

[bugfix] fix deepseek rope sincoscache re-generation (#2744)

2025-09-08 22:03:34 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500)

2025-04-17 20:16:32 +08:00