Revert [KV-Sharing] Support KV-Sharing feature in CLA models (#4138) (#5317)

### What this PR does / why we need it?
Revert "[KV-Sharing] Support KV-Sharing feature in CLA models" (#4138)
because it causes DeepSeek V3.2 to hang.


- vLLM version: release/v0.13.0
- vLLM main: 5fbfa8d9ef

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Mengqing Cao
2025-12-24 22:24:17 +08:00
committed by GitHub
parent fb3d6ca08c
commit e54630e01c
5 changed files with 18 additions and 96 deletions

@@ -116,7 +116,6 @@ jobs:
pytest -sv --durations=0 tests/e2e/singlecard/test_xlite.py
pytest -sv --durations=0 tests/e2e/singlecard/pooling/
pytest -sv --durations=0 tests/e2e/singlecard/compile/test_norm_quant_fusion.py
-pytest -sv --durations=0 tests/e2e/singlecard/test_cross_layer_attn_model.py
pytest -sv --durations=0 tests/e2e/singlecard/test_multistream_overlap_shared_expert.py
# ------------------------------------ v1 spec decode test ------------------------------------ #