xc-llm-ascend/vllm_ascend
Mengqing Cao 5fed166a99 [ModelRunner][Refactor] Refactor kv cache tensor initialization logic (#3106)
### What this PR does / why we need it?
Refactor kv cache tensor initialization logic. 
1. Unify the KV cache tensor initialization logic of DeepSeek and normal
models.
2. Split `initialize_kv_cache_tensors` into `_allocate_kv_cache_tensors`
and `_reshape_kv_cache_tensors`, following the GPU model runner in vLLM.
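
The allocate/reshape split described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from this PR: `LayerSpec`, the `(2, num_blocks, block_size, num_kv_heads, head_size)` layout, and the 2-byte element format are assumptions made for the example, and the real methods operate on device tensors inside the model runner.

```python
import struct
from dataclasses import dataclass

ELEM_FORMAT = "H"  # assumption: 2-byte elements stand in for fp16/bf16
ELEM_SIZE = struct.calcsize(ELEM_FORMAT)


@dataclass
class LayerSpec:
    """Hypothetical per-layer KV cache description (not a vllm-ascend type)."""
    name: str
    num_blocks: int
    block_size: int
    num_kv_heads: int
    head_size: int

    @property
    def shape(self):
        # Assumed layout: the leading 2 holds the key plane and the value plane.
        return (2, self.num_blocks, self.block_size,
                self.num_kv_heads, self.head_size)

    @property
    def nbytes(self):
        total = ELEM_SIZE
        for dim in self.shape:
            total *= dim
        return total


def allocate_kv_cache_tensors(specs):
    """Step 1: allocate one flat, zero-filled byte buffer per layer."""
    return {spec.name: bytearray(spec.nbytes) for spec in specs}


def reshape_kv_cache_tensors(specs, raw_buffers):
    """Step 2: view each flat buffer with the layer's KV cache shape (no copy)."""
    return {
        spec.name: memoryview(raw_buffers[spec.name]).cast(
            ELEM_FORMAT, shape=list(spec.shape))
        for spec in specs
    }


def initialize_kv_cache_tensors(specs):
    """Allocation and reshaping are kept as separate steps, then composed."""
    raw_buffers = allocate_kv_cache_tensors(specs)
    return reshape_kv_cache_tensors(specs, raw_buffers)
```

Keeping allocation separate from reshaping means the same flat buffers can be given a different per-layer view without changing how the memory is obtained, which is the kind of unification the first item refers to.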

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing tests.
1. prefill disaggregation scenario
2. deepseek + aclgraph/eager mode
3. qwen3 next


- vLLM version: v0.11.0
- vLLM main: 83f478bb19

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-11-04 17:26:54 +08:00