### What this PR does / why we need it?
Revert "[KV-Sharing] Support KV-Sharing feature in CLA models" (#4138)
because it causes DeepSeek V3.2 to hang.
- vLLM version: release/v0.13.0
- vLLM main: 5fbfa8d9ef
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
1
.github/workflows/_e2e_test.yaml
vendored
@@ -116,7 +116,6 @@ jobs:
 pytest -sv --durations=0 tests/e2e/singlecard/test_xlite.py
 pytest -sv --durations=0 tests/e2e/singlecard/pooling/
 pytest -sv --durations=0 tests/e2e/singlecard/compile/test_norm_quant_fusion.py
-pytest -sv --durations=0 tests/e2e/singlecard/test_cross_layer_attn_model.py
 pytest -sv --durations=0 tests/e2e/singlecard/test_multistream_overlap_shared_expert.py

 # ------------------------------------ v1 spec decode test ------------------------------------ #