[1/N][CI] Refactor accuracy test (#5400)

### What this PR does / why we need it?

1. Accuracy testing no longer compares eager and graph modes. Instead, it directly extracts the golden result under the graph mode configuration (the implicit purpose of this case is to verify whether modifications affect existing results).
2. Next step: finer-grained supervision of logits/sampler results.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: release/v0.13.0
- vLLM main: 254f6b9867

Signed-off-by: wangli <wangli858794774@gmail.com>
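
For illustration only, a minimal sketch of the golden-result idea from item 1, assuming the golden outputs were recorded once from a trusted run under the graph mode configuration. The names `GOLDEN_OUTPUTS`, `check_against_golden`, and `fake_generate` are hypothetical and are not taken from this PR or from vLLM; the real tests may store and compare results differently.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical golden outputs: prompt -> expected completion.
# Assumed to have been captured once under the graph mode configuration.
GOLDEN_OUTPUTS: Dict[str, str] = {
    "Hello, my name is": " Alice and I am",
    "The capital of France is": " Paris.",
}


def check_against_golden(
    generate: Callable[[str], str],
    golden: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Return (prompt, expected, actual) for every prompt whose output drifted."""
    mismatches = []
    for prompt, expected in golden.items():
        actual = generate(prompt)
        if actual != expected:
            mismatches.append((prompt, expected, actual))
    return mismatches


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; replays the recorded golden text."""
    return GOLDEN_OUTPUTS[prompt]


if __name__ == "__main__":
    drift = check_against_golden(fake_generate, GOLDEN_OUTPUTS)
    print("mismatches:", drift)
```

Comparing against a stored golden result, rather than against a second live run in eager mode, means a single run is enough to detect whether a change altered existing outputs.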
2026-01-07 20:58:15 +08:00
parent b94fc13d3f
commit 1165b2c863
4 changed files with 231 additions and 353 deletions
--- a/tests/e2e/multicard/2-cards/test_shared_expert_dp.py
+++ b/tests/e2e/multicard/2-cards/test_shared_expert_dp.py
@@ -9,7 +9,6 @@ from tests.e2e.model_utils import check_outputs_equal
 MODELS = [
    "deepseek-ai/DeepSeek-V2-Lite",
 ]
-os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"


@pytest.mark.parametrize("model", MODELS)