10 Commits

Author SHA1 Message Date
SILONG ZENG
70606e0bb9 [Test]update accuracy test of models (#4911)
### What this PR does / why we need it?
Delete accuracy tests for models that are no longer retained:
- Meta-Llama-3.1-8B-Instruct
- llava-1.5-7b-hf
- InternVL2-8B.yaml
- InternVL2_5-8B.yaml
- InternVL3-8B.yaml

Add accuracy tests for the new models:
- Llama-3.2-3B-Instruct
- llava-onevision-qwen2-0.5b-ov-hf
- Qwen3-VL-30B-A3B-Instruct

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
2025-12-15 15:04:20 +08:00
zhangyiming
c95c271538 [E2E] Optimize nightly testcase. (#4886)
### What this PR does / why we need it?
Optimize nightly testcase.
Changes:
- tests/e2e/nightly/multi_node/config/models/Qwen3-235B-A3B.yaml: Add
accuracy and performance benchmark
- tests/e2e/models/configs/Qwen3-8B-Base.yaml: Delete
- tests/e2e/models/configs/internlm-7b.yaml: Change to
internlm3-8b-instruct
- tests/e2e/nightly/models/test_deepseek_r1_w8a8_eplb.py: Change to
DeepSeek-R1-0528-W8A8 model

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: menogrey <1299267905@qq.com>
2025-12-11 10:15:39 +08:00
zhangyiming
66b0781840 [E2E] Refactor the e2e testcases. (#4789)
### What this PR does / why we need it?
Refactor the e2e testcases.
- tests/e2e/multicard/test_weight_loader.py: Remove the unused code.
- tests/e2e/singlecard/multi-modal/test_internvl.py: Move to accuracy
test.
- tests/e2e/singlecard/test_aclgraph.py: Rename the file.
- tests/e2e/singlecard/test_embedding_aclgraph.py : Combine with
tests/e2e/singlecard/test_bge_model.py
- tests/e2e/singlecard/test_completion_with_prompt_embeds.py: Delete
eager mode and modify model to Qwen3-0.6B
- tests/e2e/singlecard/test_quantization.py: Modify model to
Qwen3-0.6B-W8A8
- tests/e2e/singlecard/test_vlm.py: Modify model to Qwen3-VL-8B

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: menogrey <1299267905@qq.com>
2025-12-11 10:15:00 +08:00
SILONG ZENG
7132ae8532 [CI]Cleanup accurary test (#4861)
### What this PR does / why we need it?
Delete accuracy testing of some models:
- Qwen2-VL-7B-Instruct
- Qwen2.5-VL-7B-Instruct
- gemma-2-9b-it
- DeepSeek-V2-Lite

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: MrZ20 <2609716663@qq.com>
2025-12-10 14:13:56 +08:00
zhangxinyuehfad
b6afec73e1 [Test] Add accuracy nightly test for new models (#4262)
### What this PR does / why we need it?
Add accuracy nightly test for new models:

PaddlePaddle/ERNIE-4.5-21B-A3B-PT
LLM-Research/Molmo-7B-D-0924
LLM-Research/gemma-2-9b-it
LLM-Research/gemma-3-4b-it
Shanghai_AI_Laboratory/internlm-7b
llava-hf/llava-1.5-7b-hf

- vLLM version: v0.11.2

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-12-01 22:28:46 +08:00
Canlin Guo
1c677c3b87 [Test][Accuracy] Add accuracy evaluation config for InternVL3_5-8B (#3964)
### What this PR does / why we need it?

To continuously monitor the accuracy of the InternVL3_5-8B model, this
PR adds the corresponding configuration file to the CI. We need to add
the `-hf` suffix to avoid incompatibility with the `lm-eval`
preprocessor.

### How was this patch tested?

`pytest -sv ./tests/e2e/models/test_lm_eval_correctness.py --config
./tests/e2e/models/configs/InternVL3_5-8B.yaml`


- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
2025-11-12 09:05:55 +08:00
ZengSilong
dc1a6cb503 [Test]Add accuracy test for multiple models (#3823)
### What this PR does / why we need it?
Add accuracy test for multiple models:
- Meta_Llama_3.1_8B_Instruct
- Qwen2.5-Omni-7B
- Qwen3-VL-8B-Instruct

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
2025-11-04 14:46:39 +08:00
Yikun Jiang
cd69385dab Add models test and add serval new models yaml (#3394)
### What this PR does / why we need it?
This PR added Add accuracy CI for servals new models
- `ascend test / accuracy` is for PR triggered check popluar models
accuracy
- `ascedn test / models` is for accuracy report, full models test,
nightly model test
- Add Qwen2-Audio-7B-Instruct, Qwen2-VL-7B-Instruct, Qwen3-8B,
Qwen3-VL-30B-A3B-Instruct

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Closes: https://github.com/vllm-project/vllm-ascend/pull/2330
Closes: https://github.com/vllm-project/vllm-ascend/pull/3362


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
2025-10-12 17:27:50 +08:00
zhangxinyuehfad
c90a6d3658 [Test] Update the format of the accuracy report (#3081)
### What this PR does / why we need it?
Update the format of the accuracy report

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main:
c60e6137f0

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-09-22 14:10:03 +08:00
Icey
0bd5ff5299 Fix accuracy test config and add DeepSeek-V2-Lite test (#2261)
### What this PR does / why we need it?
This PR fix accuracy test related to
https://github.com/vllm-project/vllm-ascend/pull/2073, users can now
perform accuracy tests on multiple models simultaneously and generate
different report files by running:

```bash
cd ~/vllm-ascend
pytest -sv ./tests/e2e/models/test_lm_eval_correctness.py \
          --config-list-file ./tests/e2e/models/configs/accuracy.txt
```

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
<img width="1648" height="511" alt="image"
src="https://github.com/user-attachments/assets/1757e3b8-a6b7-44e5-b701-80940dc756cd"
/>


- vLLM version: v0.10.0
- vLLM main:
766bc8162c

---------

Signed-off-by: Icey <1790571317@qq.com>
2025-08-08 11:09:16 +08:00