[1/N][CI] Refactor accuracy test (#5400)

### What this PR does / why we need it?
1. The accuracy test no longer compares eager mode against graph mode.
Instead, it checks the output against golden results extracted directly
under the graph mode configuration (the implicit purpose of this test is
to verify that code changes do not alter existing results).
2. Next step: finer-grained checks on logits/sampler results.
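
A minimal sketch of the golden-result check described in item 1, for illustration only. The names `GOLDEN` and `find_mismatches` and the placeholder token ids are assumptions, not the helpers or data used in this PR; the real test obtains its outputs from a graph-mode run.

```python
# Hypothetical golden data: (token_ids, text) pairs captured once from a
# known-good graph-mode run. These values are placeholders, not real outputs.
GOLDEN = [
    ([101, 7592, 2088], "hello world"),
]


def find_mismatches(outputs, golden):
    """Return a description of every deviation of `outputs` from `golden`.

    Both arguments are lists of (token_ids, text) pairs, one per prompt.
    An empty result means the outputs reproduce the golden results exactly.
    """
    if len(outputs) != len(golden):
        return [f"expected {len(golden)} outputs, got {len(outputs)}"]

    mismatches = []
    for i, ((ids, text), (g_ids, g_text)) in enumerate(zip(outputs, golden)):
        if list(ids) != list(g_ids):
            mismatches.append(f"prompt {i}: token ids {list(ids)} != {list(g_ids)}")
        elif text != g_text:
            mismatches.append(f"prompt {i}: text {text!r} != {g_text!r}")
    return mismatches


# In a test, the outputs would come from the model under graph mode:
#     assert find_mismatches(graph_mode_outputs, GOLDEN) == []
```

Comparing against stored values rather than against a second (eager) run means a regression that affects both modes equally is still caught, at the cost of refreshing the golden data whenever an intentional change alters the outputs.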
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: release/v0.13.0
- vLLM main: 254f6b9867

Signed-off-by: wangli <wangli858794774@gmail.com>
Li Wang
2026-01-07 20:58:15 +08:00
committed by GitHub
parent b94fc13d3f
commit 1165b2c863
4 changed files with 231 additions and 353 deletions


@@ -9,7 +9,6 @@ from tests.e2e.model_utils import check_outputs_equal
MODELS = [
"deepseek-ai/DeepSeek-V2-Lite",
]
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"
@pytest.mark.parametrize("model", MODELS)