10 Commits

Author SHA1 Message Date
zhangxinyuehfad
06ccce1ddf [FOLLOWUP] fix name and format in accuracy test (#1288) (#1435)
### What this PR does / why we need it?
Fix the accuracy test:
1. Fix the accuracy report, for example:
https://vllm-ascend--1429.org.readthedocs.build/en/1429/developer_guide/evaluation/accuracy_report/Qwen2.5-7B-Instruct-V0.html
2. Fix PR creation for the report

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-06-26 00:26:54 +08:00
zhangxinyuehfad
0060886a37 [CI]Update accuracy report test (#1288)
### What this PR does / why we need it?
Update the accuracy report test:
1. Record commit hashes and GitHub links for both vllm and
vllm-ascend in accuracy reports
2. Add accuracy result verification checks to ensure output correctness
3. Create the PR via a forked-repository workflow
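
The verification check in item 2 could be sketched roughly as below. This is a minimal illustration, not the code from the PR: the function name `verify_accuracy`, the dictionary layout, and the 5% relative tolerance are all assumptions made for the example.

```python
from __future__ import annotations

import math


def verify_accuracy(
    measured: dict[str, float],
    expected: dict[str, float],
    rel_tol: float = 0.05,  # hypothetical tolerance, not taken from the PR
) -> list[str]:
    """Return one failure message per task; an empty list means all checks passed."""
    failures = []
    for task, want in expected.items():
        got = measured.get(task)
        if got is None:
            # The task was expected but produced no score at all.
            failures.append(f"{task}: missing result")
        elif not math.isclose(got, want, rel_tol=rel_tol):
            # The score drifted further from the baseline than the tolerance allows.
            failures.append(f"{task}: got {got}, expected {want} within {rel_tol:.0%}")
    return failures
```

A CI step built on this idea would fail the job whenever the returned list is non-empty.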

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
dense-accuracy-test:
https://github.com/vllm-project/vllm-ascend/actions/runs/15745619485
create pr via forked repository workflow:
https://github.com/zhangxinyuehfad/vllm-ascend/actions/runs/15747013719/job/44385134080
accuracy report pr:
https://github.com/vllm-project/vllm-ascend/pull/1292

The accuracy report currently in use is outdated: this PR needs to be
merged first, then the test rerun and the report updated, after which
#1292 can be closed.


Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-06-25 14:10:34 +08:00
Yikun Jiang
a95afc011e [CI] Enable merge trigger unit test and accuracy test schedule job (#1345)
### What this PR does / why we need it?
- Enable merge trigger unit test and accuracy test schedule job
- Pin lm-eval==0.4.8 to resolve the Qwen3 8B accuracy issue
### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-06-22 17:21:57 +08:00
Mengqing Cao
96fa7ff63b [DP][V1] Fix rank set in DP scenario & Bump torch-npu version to 2.5.1.post1.dev20250528 (#1235)
### What this PR does / why we need it?
1. Fix rank set in DP scenario. The new poc version of torch-npu supports
setting `ASCEND_RT_VISIBLE_DEVICES` dynamically, so we can use the
rank set in `DPEngineCoreProc` directly instead of calculating the local
rank across dp by hand in the patched `_init_data_parallel`

Closes: https://github.com/vllm-project/vllm-ascend/issues/1170

2. Bump torch-npu version to 2.5.1.post1.dev20250528

Closes: https://github.com/vllm-project/vllm-ascend/pull/1242
Closes: https://github.com/vllm-project/vllm-ascend/issues/1232


### How was this patch tested?
CI passed with the newly added test.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
2025-06-16 23:09:53 +08:00
wangxiyuan
4f5964420e [CI] Upgrade vllm to 0.9.1 (#1165)
1. Upgrade vllm to 0.9.1. 0.9.0 is no longer supported on the main branch.
Keep the docs at 0.9.0 until we publish the first 0.9.1 release.
2. Disable the V0 test for PRs
3. Move the actionlint check to the lint job

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-11 16:33:11 +08:00
zhangxinyuehfad
e68e81f2ce [CI] Make accuracy CI and report work (#1078)
### What this PR does / why we need it?
Make accuracy CI and report work

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually reviewed

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-06-10 14:35:44 +08:00
Yikun Jiang
9e855b70be Adjust concurrency group for each npu workflow (#1068)
### What this PR does / why we need it?
Adjust concurrency group for each npu workflow
- pd and benchmarks share static-08-01, so only one job can run on it
at a time
- for other jobs, each PR/schedule should have only one job running

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-06-05 09:17:04 +08:00
Yikun Jiang
f24375f318 Enable accuracy test for PR labeled with "*accuracy-test" (#1040)
### What this PR does / why we need it?
This PR enables the accuracy test for PRs labeled with "*accuracy-test" and
for workflow_dispatch.

Only one model test runs for each test type to reduce execution time.

- The dense test takes about `25mins` to complete (gsm8k 7mins, ~mmlu
3h24mins,~ cEval 18mins)
- The vl test takes about `40mins` to complete


In the future, we might consider enabling all test jobs as a nightly
scheduled job.

The main changes are:
- the dense/vl accuracy test will be triggered by labeling
`accuracy-test` and `ready-for-test`
- the dense accuracy test will be triggered by labeling
`dense-accuracy-test` and `ready-for-test`
- the vl accuracy test will be triggered by labeling `vl-accuracy-test`
and `ready-for-test`
- the accuracy test will also be triggered by workflow_dispatch
- Support V1 and V0 for qwen and V0 for VL
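
The label rules listed above can be read as a small decision function. The sketch below is illustrative only: the actual logic lives in the GitHub Actions workflow, the function name `select_accuracy_jobs` is hypothetical, and treating a `workflow_dispatch` event as running both job types is an assumption.

```python
from __future__ import annotations


def select_accuracy_jobs(event: str, labels: set[str]) -> set[str]:
    """Return which accuracy jobs ("dense", "vl") would run for an event."""
    if event == "workflow_dispatch":
        # Assumption: a manual dispatch runs both job types.
        return {"dense", "vl"}
    if "ready-for-test" not in labels:
        # Every label-based trigger also requires `ready-for-test`.
        return set()
    jobs: set[str] = set()
    if "accuracy-test" in labels:
        jobs |= {"dense", "vl"}
    if "dense-accuracy-test" in labels:
        jobs.add("dense")
    if "vl-accuracy-test" in labels:
        jobs.add("vl")
    return jobs
```

Exact set membership is used rather than substring matching, so `dense-accuracy-test` does not accidentally count as `accuracy-test`.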

For PR tests, we also generate a summary in the test summary.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- CI passed with accuracy-test label
- Preview:
https://github.com/vllm-project/vllm-ascend/actions/runs/15407628722?pr=1040

Closes: https://github.com/vllm-project/vllm-ascend/pull/953

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
2025-06-03 15:38:13 +08:00
hfadzxy
4a2505f81f [accuracy test]Update cann version and huggingface-hub version for Qwen3 (#823)
### What this PR does / why we need it?
1. Update the CANN version to 8.1.0 for multimodal
2. Fix the huggingface-hub version to adapt to Qwen3
3. Change Qwen3-8B to Qwen-8B-Base

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-05-12 19:12:48 +08:00
hfadzxy
affca6f348 [Test] Add accuracy test report workflow (#542)
### What this PR does / why we need it?
1. Provide an accuracy test report for development branch releases.
2. Models and datasets for accuracy test:

| Model | Datasets |
|---|---|
| Qwen2.5-7B-Instruct | ceval-val, gsm8k, mmlu |
| Qwen3-8B | ceval-val, gsm8k, mmlu |
| Llama-3.1-8B-Instruct | ceval-val, gsm8k, mmlu |
| Qwen2.5-VL-7B-Instruct | mmmu_val |

### Does this PR introduce _any_ user-facing change?
This PR displays the accuracy test reports for the release version in
docs/source/developer_guide/accuracy_report:
Qwen2.5-7B-Instruct.md
Qwen3-8B.md
Llama-3.1-8B-Instruct.md
Qwen2.5-VL-7B-Instruct.md

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-04-30 14:53:58 +08:00