[Bugfix][APC] Fix accuracy issue on prefix caching with AscendScheduler (#2714)

### What this PR does / why we need it?
Fix accuracy issue on prefix caching with AscendScheduler

### How was this patch tested?
CI passed with `test_prefix_cache_with_ascend_scheduler`

- vLLM version: v0.10.1.1
- vLLM main: 6997a25ac6

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-09-04 08:22:46 +08:00
parent df88a2ecc8
commit 984bd7c13a
2 changed files with 7 additions and 4 deletions
--- a/.github/workflows/vllm_ascend_test.yaml
+++ b/.github/workflows/vllm_ascend_test.yaml
@@ -291,6 +291,6 @@ jobs:
          pytest -sv tests/e2e/multicard/test_offline_inference_distributed.py::test_sp_for_qwen3_moe

          #pytest -sv tests/e2e/multicard/test_pipeline_parallel.py
-          #pytest -sv tests/e2e/multicard/test_prefix_caching.py
+          pytest -sv tests/e2e/multicard/test_prefix_caching.py
          pytest -sv tests/e2e/multicard/test_qwen3_moe.py
          pytest -sv tests/e2e/multicard/test_torchair_graph_mode.py