[test] add w4a8 accuracy case (#5110)

### What this PR does / why we need it?

This PR add w4a8  accuracy testcase for e2e test

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

By running the test

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: cuikai (C) <c00827167@china.huawei.com>
Co-authored-by: cuikai (C) <c00827167@china.huawei.com>
This commit is contained in:
ck-hw-1018
2025-12-18 14:10:14 +08:00
committed by GitHub
parent 39fb9e7c83
commit 71e544e259
2 changed files with 34 additions and 18 deletions

View File

@@ -192,11 +192,11 @@ jobs:
# To avoid oom, we need to run the test in a single process.
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_DeepSeek_multistream_moe
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_Qwen3_W4A8DYNAMIC
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_DeepSeek_W4A8DYNAMIC
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_sp_for_qwen3_moe
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_fc2_for_qwen3_moe
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_Qwen_Dense_with_flashcomm_v1
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_Qwen_Dense_with_prefetch_mlp_weight
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_deepseek_w4a8_accuracy
pytest -sv --durations=0 tests/e2e/multicard/test_prefix_caching.py
pytest -sv --durations=0 tests/e2e/multicard/test_pipeline_parallel.py
@@ -265,7 +265,6 @@ jobs:
VLLM_USE_MODELSCOPE: True
run: |
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_DeepSeek_multistream_moe
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_DeepSeek_W4A8DYNAMIC
pytest -sv --durations=0 tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_Kimi_K2_Thinking_W4A16
pytest -sv --durations=0 tests/e2e/multicard/test_data_parallel_tp2.py