xc-llm-ascend/vllm_ascend
wangxiaoteng888 4adc6a68f5 [BugFix][P/D][0.18.0] bugfix: short sequence has no response (#8142)
### What this PR does / why we need it?
Fixes a bug where a short sequence received no response. This pull request
refactors the event handling for KV cache reshaping in mla_v1.py: the
creation and recording of reshape_cache_event are centralized in the
_mla_preprocess function, so that the event covers both decode and
prefill operations.

Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
2026-04-12 23:25:01 +08:00
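
The refactor described in the commit message can be illustrated with a minimal, hedged sketch. It is not the code from mla_v1.py: `FakeEvent`, `FakeMetadata`, `write_kv_cache`, and `mla_preprocess_sketch` are hypothetical stand-ins invented for illustration, and the real device event and metadata types are not shown in the source. The sketch only demonstrates the general pattern of creating and recording one event in a single place so that decode-only, prefill-only, and mixed batches are all covered.

```python
from dataclasses import dataclass
from typing import List, Optional


class FakeEvent:
    """Hypothetical stand-in for a device event; only tracks whether record() ran."""

    def __init__(self) -> None:
        self.recorded = False

    def record(self) -> None:
        self.recorded = True


@dataclass
class FakeMetadata:
    """Hypothetical batch metadata (assumed fields, not the real structure)."""

    num_decode_tokens: int
    num_prefill_tokens: int
    reshape_cache_event: Optional[FakeEvent] = None


def write_kv_cache(kind: str, log: List[str]) -> None:
    """Placeholder for the KV cache write of one phase."""
    log.append(kind)


def mla_preprocess_sketch(meta: FakeMetadata, log: List[str]) -> FakeEvent:
    # Create the event once, before branching on the batch contents.
    meta.reshape_cache_event = FakeEvent()

    if meta.num_decode_tokens > 0:
        write_kv_cache("decode", log)
    if meta.num_prefill_tokens > 0:
        write_kv_cache("prefill", log)

    # Record once, after every cache write that this batch performed,
    # so a waiter sees the event no matter which branches ran.
    meta.reshape_cache_event.record()
    return meta.reshape_cache_event
```

With this layout, a batch that contains only decode tokens or only prefill tokens still ends with a recorded event, because recording no longer lives inside a single branch.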