xc-llm-ascend/vllm_ascend
Mengqing Cao 449f8f65a7 [KV-Sharing] Support KV-Sharing feature in CLA models (#4138)
### What this PR does / why we need it?
Support the KV-Sharing feature in CLA (cross-layer attention) models, which
share the KV cache across some layers.
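
As a rough illustration of what cross-layer KV sharing means (this is not the code from this PR), the sketch below assumes consecutive layers are grouped and every layer in a group reuses the KV cache of the group's first layer. The names `build_kv_sharing_map`, `allocate_kv_caches`, and `share_factor` are hypothetical and exist only for this example; real CLA models may choose which layers share differently.

```python
def build_kv_sharing_map(num_layers: int, share_factor: int) -> dict[int, int]:
    """Map each layer index to the layer whose KV cache it reads.

    Assumption for this sketch: layers are grouped in runs of
    `share_factor`, and the first layer of each run owns the cache.
    """
    if num_layers < 0 or share_factor < 1:
        raise ValueError("num_layers must be >= 0 and share_factor >= 1")
    return {layer: layer - (layer % share_factor) for layer in range(num_layers)}


def allocate_kv_caches(sharing_map: dict[int, int]) -> dict[int, dict[str, list]]:
    """Allocate one cache per owning layer; sharing layers alias the owner's cache."""
    caches: dict[int, dict[str, list]] = {}
    # Owners always have an index <= the layers that share with them,
    # so iterating in ascending order guarantees the owner exists first.
    for layer, owner in sorted(sharing_map.items()):
        if owner == layer:
            caches[layer] = {"k": [], "v": []}  # placeholder for real tensors
        else:
            caches[layer] = caches[owner]  # same object, no extra memory
    return caches


sharing_map = build_kv_sharing_map(num_layers=4, share_factor=2)
caches = allocate_kv_caches(sharing_map)
```

With four layers and a sharing factor of two, layers 0 and 1 point at one cache object and layers 2 and 3 at another, so only two caches are allocated instead of four.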

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
2025-12-23 10:48:31 +08:00