41 Commits

Author SHA1 Message Date
zhangxinyuehfad
886756aea0 [Bugfix][CI] Fix aisbench installation to avoid Gitee authentication (#7536)
### What this PR does / why we need it?
- Pass GITEE_USERNAME (var) and GITEE_TOKEN (secret) as Docker build
  args in nightly image build so Dockerfile can authenticate to Gitee
- In Dockerfile.nightly.a2/a3, embed credentials into clone URL to
  avoid auth failure during `git clone`
- In single-node and multi-node PR test workflows, backup the
  pre-installed benchmark from the nightly image before wiping
  vllm-ascend, then restore it instead of re-cloning from Gitee,
  which is inaccessible from fork PR contexts
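
A minimal sketch of the "embed credentials into clone URL" idea, written in Python for illustration. The helper name `authenticated_clone_url`, the repository path, and the credentials are all hypothetical; the actual Dockerfile presumably builds the URL directly from the `GITEE_USERNAME` and `GITEE_TOKEN` build args.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def authenticated_clone_url(repo_url: str, username: str, token: str) -> str:
    """Return repo_url with percent-encoded credentials embedded before the host."""
    parts = urlsplit(repo_url)
    # Percent-encode so characters such as '@' or '/' in a token cannot break the URL.
    userinfo = f"{quote(username, safe='')}:{quote(token, safe='')}"
    netloc = f"{userinfo}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical repository path and credentials, for illustration only.
url = authenticated_clone_url(
    "https://gitee.com/example/benchmark.git", "ci-bot", "s3cr@t/xyz"
)
print(url)
```

Note that a URL built this way contains the secret, so it should not be echoed into CI logs in real use.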

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.18.0
- vLLM main:
8b6325758c

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-03-23 20:16:51 +08:00
zhangxinyuehfad
67d40f23fd [CI]Upgrade nightly multi-node-tests max-parallel to 2 (#7035)
### What this PR does / why we need it?

1. Increase nightly multi-node test max-parallel from 1 to 2, and fix
resource conflicts that arise when tests run concurrently.
2. Fix parse-trigger job: Add an if condition so it only runs on
schedule, workflow_dispatch, or PRs labeled nightly-test
3. Adjust nightly schedule: Shift trigger time from 24:00 to 23:45
(UTC+8)

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.16.0
- vLLM main:
4034c3d32e

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-03-10 16:25:51 +08:00
zhangxinyuehfad
1e4017e3fa [CI] support nightly ci for per pr by labels (#6483)
### What this PR does / why we need it?

This PR refactors the nightly CI workflows (A2 and A3) to support
running tests against a specific PR's code, in addition to the existing
scheduled/dispatch runs using pre-built images.

#### Motivation:
Previously, nightly tests could only be triggered by schedule or
workflow_dispatch, always using the pre-built nightly image. This change
allows developers to trigger nightly tests against their own PR's source
code, enabling early validation without waiting for a nightly build.

#### Changes
Trigger logic (parse-trigger job)

A new parse-trigger job is introduced in both
schedule_nightly_test_a2.yaml and schedule_nightly_test_a3.yaml to
centralize trigger evaluation:

`schedule / workflow_dispatch`: runs all tests with the pre-built image
(existing behavior preserved)
`pull_request (labeled + synchronize)`: runs only when the PR has the
nightly-test label and a /nightly [test-names] comment exists (the latest
one wins)

1. /nightly or /nightly all — runs all tests
2. /nightly test1 test2 — runs only named tests (comma-wrapped for exact
matching)
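
The selection rules above can be sketched as follows. This is an illustration in Python, not the workflow's actual implementation; the function names and test names are hypothetical, and the comments are assumed to arrive in chronological order.

```python
from typing import Iterable, Optional


def parse_nightly_comments(comments: Iterable[str]) -> Optional[str]:
    """Return 'all', a comma-wrapped list such as ',a,b,', or None if no /nightly comment exists."""
    selection = None
    for body in comments:  # later comments overwrite earlier ones: the latest wins
        words = body.strip().split()
        if not words or words[0] != "/nightly":
            continue
        names = words[1:]
        # A bare "/nightly" or "/nightly all" selects every test.
        if not names or names == ["all"]:
            selection = "all"
        else:
            selection = "," + ",".join(names) + ","
    return selection


def should_run(test_name: str, selection: Optional[str]) -> bool:
    """Decide whether a single test is selected."""
    if selection is None:
        return False
    if selection == "all":
        return True
    # Wrapping names in commas turns a substring check into an exact-name match.
    return f",{test_name}," in selection


# Hypothetical comment history on a PR.
comments = ["/nightly all", "looks good to me", "/nightly test_qwen test_glm"]
selection = parse_nightly_comments(comments)
print(selection, should_run("test_qwen", selection), should_run("test_q", selection))
```

Without the surrounding commas, a name such as `test_q` would wrongly match inside `test_qwen`; the comma wrapping avoids that.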

#### How to trigger
1. Add the nightly-test label to your PR
2. Comment /nightly (all tests) or /nightly test1 test2 (specific tests)
3. Re-triggering: add another /nightly comment and push a new commit
(synchronize event)

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?

- vLLM version: v0.14.1
- vLLM main:
dc917cceb8

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-03-05 16:46:37 +08:00
zhangxinyuehfad
566c367a10 [CI] Add DeepSeek-V3.2 large EP nightly ci (#6378)
### What this PR does / why we need it?

Add DeepSeek-V3.2 nightly ci

Fix PD routing to exclude headless nodes when collecting
prefiller/decoder IPs
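
A rough sketch of the routing fix, assuming a hypothetical `Node` record with `ip`, `role`, and `headless` fields (none of these names come from the PR). The idea is that headless nodes expose no serving endpoint, so they should not appear in the prefiller or decoder IP lists handed to the proxy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    ip: str
    role: str       # "prefiller" or "decoder" (hypothetical labels)
    headless: bool  # True if the node serves no API endpoint


def collect_pd_ips(nodes: List[Node]) -> Tuple[List[str], List[str]]:
    """Return (prefiller_ips, decoder_ips), skipping headless nodes."""
    routable = [n for n in nodes if not n.headless]
    prefillers = [n.ip for n in routable if n.role == "prefiller"]
    decoders = [n.ip for n in routable if n.role == "decoder"]
    return prefillers, decoders


# Hypothetical four-node cluster: one routable and one headless node per role.
nodes = [
    Node("10.0.0.1", "prefiller", False),
    Node("10.0.0.2", "prefiller", True),
    Node("10.0.0.3", "decoder", False),
    Node("10.0.0.4", "decoder", True),
]
prefiller_ips, decoder_ips = collect_pd_ips(nodes)
print(prefiller_ips, decoder_ips)
```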

- vLLM version: v0.14.1
- vLLM main:
dc917cceb8

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-03-04 16:15:56 +08:00
Xiaoshuang Wang
f7a8befc20 [CI] Upgrade CANN to 8.5.1 (#6897)
### What this PR does / why we need it?
[CI] Upgrade CANN to 8.5.1

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.16.0
- vLLM main:
15d76f74e2

Signed-off-by: wxsIcey <1790571317@qq.com>
2026-03-03 09:02:42 +08:00
Li Wang
ac9a7d1301 [Nightly] Increase VLLM_ENGINE_READY_TIMEOUT_S to avoid nightly failure (#6778)
### What this PR does / why we need it?
After some observation, I found that some cases failed due to timeouts, for example
https://github.com/vllm-project/vllm-ascend/actions/runs/22280996034/job/64487867977#step:9:921
and
https://github.com/vllm-project/vllm-ascend/actions/runs/22315540111/job/64574590762#step:9:1809.
This may be caused by excessively long model loading times (we currently
still load weights from network storage), so the timeout needs to be
raised from 600s to 1800s.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main:
9562912cea

Signed-off-by: wangli <wangli858794774@gmail.com>
2026-02-25 10:14:51 +08:00
SILONG ZENG
e2237819a9 [CI]Fixed the spell check function in typos.toml (#6753)
### What this PR does / why we need it?
The incorrect regular expression syntax `.*[UE4M3|ue4m3].*` actually
ignores all words containing any of the following characters: `u, e, 4,
m, 3, |`

```yaml
extend-ignore-identifiers-re = [".*Unc.*", ".*_thw",
    ".*UE8M0.*", ".*[UE4M3|ue4m3].*", ".*eles.*", ".*fo.*", ".*ba.*",
    ".*ot.*", ".*[Tt]h[rR].*"]
```
===fix===>
```yaml
extend-ignore-identifiers-re = [".*Unc.*", ".*_thw",
    ".*UE8M0.*", ".*(UE4M3|ue4m3).*", ".*eles.*", ".*fo.*", ".*ba.*",
    ".*ot.*", ".*[Tt]h[rR].*"]
```
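
To illustrate the difference between a character class and an alternation group, here is a small sketch using Python's `re` module as a stand-in (an assumption: the spell checker uses its own regex engine, but brackets and parentheses have the same meaning in standard regex syntax). The sample identifiers are hypothetical.

```python
import re

# Brackets form a character class: "any ONE of the characters U, E, 4, M, 3, |, u, e, m".
char_class = re.compile(r".*[UE4M3|ue4m3].*")
# Parentheses form a group: "either the string UE4M3 or the string ue4m3".
group = re.compile(r".*(UE4M3|ue4m3).*")

# Hypothetical identifiers, chosen only to show the difference.
words = ["quantize", "scale_ue4m3", "FP8_UE4M3", "block"]
results = {
    w: (bool(char_class.fullmatch(w)), bool(group.fullmatch(w))) for w in words
}
for word, (by_class, by_group) in results.items():
    print(f"{word:12} char_class={by_class!s:5} group={by_group}")
```

With the bracketed pattern, an ordinary word like `quantize` is ignored simply because it contains `u` and `e`, so misspellings in such words would never be reported.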

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main:
9562912cea

Signed-off-by: MrZ20 <2609716663@qq.com>
2026-02-14 11:57:26 +08:00
SILONG ZENG
6bc44bf49b [CI]fix nightly multi node test error for wait for pod ready (#6675)
### What this PR does / why we need it?
Fixes the issue where nightly multi-node tests hang during the "wait for
pod ready" stage due to strict shell mode.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main:
13397841ab

Signed-off-by: MrZ20 <2609716663@qq.com>
2026-02-11 18:11:00 +08:00
Li Wang
d018aeb5fa [Image] Bump mooncake version to v0.3.8.post1 (#6428)
### What this PR does / why we need it?
This patch bumps the Mooncake version to the latest
[release](https://github.com/kvcache-ai/Mooncake/releases/tag/v0.3.8.post1)
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
Tested locally:
>>> from mooncake.engine import TransferEngine
- vLLM version: v0.14.1
- vLLM main:
dc917cceb8

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2026-02-06 10:54:03 +08:00
wangxiyuan
f8e76a49fa [CI] Upgrade transformers version (#6307)
Upgrade transformers to >=4.56.4

- vLLM version: v0.14.1
- vLLM main:
dc917cceb8

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2026-01-28 14:06:39 +08:00
meihanc
e54d294df3 [CI]Install clang in Dockerfile for triton ascend (#4409)
### What this PR does / why we need it?
Install clang in Dockerfile for triton ascend

- vLLM version: v0.13.0
- vLLM main:
d68209402d

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
2026-01-22 19:01:28 +08:00
wangxiyuan
69740039b7 [CI] Upgrade CANN to 8.5.0 (#6070)
### What this PR does / why we need it?
1. Upgrade CANN to 8.5.0
2. move triton-ascend 3.2.0 to requirements

Note: we skipped the two failing e2e tests, see
https://github.com/vllm-project/vllm-ascend/issues/6076 for more detail.
We'll fix it soon.


### How was this patch tested?
Closes: https://github.com/vllm-project/vllm-ascend/issues/5494

- vLLM version: v0.13.0
- vLLM main:
d68209402d

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2026-01-22 09:29:50 +08:00
meihanc
53bfb38192 [CI]Update triton ascend version to 3.2.0 (#6067)
### What this PR does / why we need it?
Update triton ascend version to 3.2.0

- vLLM version: v0.13.0
- vLLM main:
d68209402d

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
2026-01-21 16:02:23 +08:00
zhangxinyuehfad
750c06c78a [CI] Add DeepSeek-V3.2-W8A8 nightly ci test (#4633)
### What this PR does / why we need it?
Add DeepSeek-V3.2-W8A8 nightly ci test:

DeepSeek-V3.2-W8A8 1node DP2+TP8
:tests/e2e/nightly/models/test_deepseek_v3_2_w8a8.py

### Does this PR introduce _any_ user-facing change?

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-01-20 21:05:15 +08:00
meihanc
80fbb1b6b1 [CI]Fix nightly clang installation following previous attempt (#5907)
### What this PR does / why we need it?
This PR fixes the issue where the previous PR
https://github.com/vllm-project/vllm-ascend/pull/5733 failed to install
Clang in nightly environment.

- vLLM version: v0.13.0
- vLLM main:
bde38c11df

---------

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
2026-01-15 14:18:11 +08:00
Li Wang
f34b3b8ee9 [nightly] Remove node tolerations for hk cluster (#5896)
### What this PR does / why we need it?
Since we have upgraded the `cann` HDK version on all nodes to `25.3rc1`, we
no longer need to restrict nightly scheduling to specific nodes
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
bde38c11df

Signed-off-by: wangli <wangli858794774@gmail.com>
2026-01-15 08:55:06 +08:00
meihanc
a9f730b853 [bugfix]Intermittent CI failure in the triton runtime jit (#5733)
### What this PR does / why we need it?
Fix bug: https://github.com/vllm-project/vllm-ascend/issues/5634
Intermittent CI failure due to a compilation error in the triton
operator
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
2f4e6548ef

---------

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
2026-01-14 22:58:08 +08:00
Li Wang
75c92a3640 [CI] Move nightly-a2 test to hk (#5807)
### What this PR does / why we need it?
This patch is an initial test that connects two nodes from the HK
region to nightly A2.

- vLLM version: v0.13.0
- vLLM main:
2f4e6548ef

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2026-01-12 22:58:35 +08:00
meihanc
6315a31399 [CI] Add triton ascend in nightly CI (#5716)
### What this PR does / why we need it?
Add triton ascend in nightly
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main:
2f4e6548ef

---------

Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
2026-01-08 21:17:32 +08:00
Li Wang
2ee17e50a1 [2/N] Upgrade nightly doc (#5534)
### What this PR does / why we need it?
Follow up https://github.com/vllm-project/vllm-ascend/pull/5479, upgrade
the corresponding doc for developers

- vLLM version: v0.13.0
- vLLM main:
45c1ca1ca1

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-31 09:11:42 +08:00
Li Wang
e760aae1df [1/N] Refactor nightly test structure (#5479)
### What this PR does / why we need it?
This patch is a series of refactoring actions, including clarifying the
directory structure of nightly tests, refactoring the config retrieval
logic, and optimizing the workflow, etc. This is the first step:
refactoring the directory structure of nightly to make it more readable
and logical.

- vLLM version: v0.13.0
- vLLM main:
5326c89803

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-30 19:03:02 +08:00
Li Wang
1d81bfaed1 Fix nightly (#5413)
### What this PR does / why we need it?
This patch mainly does the following:
1. Bugfix for multi_node_tests logs: log names must be unique when
uploading logs.
2. Optimize the `get_cluster_ips` logic and increase the max retry count for
robustness.
3. Temporarily abandon the existing gh-proxy until it is stable
enough.
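
A bounded-retry loop of the kind item 2 describes might look like the sketch below. It is only an illustration: the wrapper name, its parameters, and the fake fetcher are hypothetical, and the real `get_cluster_ips` may resolve IPs quite differently.

```python
import time
from typing import Callable, List


def get_cluster_ips_with_retry(
    fetch: Callable[[], List[str]],
    expected: int,
    max_retries: int = 5,
    delay_s: float = 0.0,
) -> List[str]:
    """Call fetch() until it returns at least `expected` IPs or retries run out."""
    last: List[str] = []
    for attempt in range(1, max_retries + 1):
        try:
            last = fetch()
        except OSError:
            last = []  # treat a transient lookup failure as "not ready yet"
        if len(last) >= expected:
            return last
        if attempt < max_retries:
            time.sleep(delay_s)
    raise RuntimeError(
        f"only {len(last)} of {expected} IPs resolved after {max_retries} attempts"
    )


# Fake fetcher that becomes complete on the third call (illustration only).
responses = iter([[], ["10.0.0.1"], ["10.0.0.1", "10.0.0.2"]])
ips = get_cluster_ips_with_retry(lambda: next(responses), expected=2)
print(ips)
```
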
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: release/v0.13.0
- vLLM main:
81786c8774

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-27 18:16:46 +08:00
Li Wang
c2f776b846 [Nightly] Initial logging for nightly multi-node testing (#5362)
### What this PR does / why we need it?
Currently, our multi-node logs only show the master node's logs (via the
Kubernetes API), which is insufficient for effective problem
localization if other nodes experience issues. Therefore, this pull
request adds the ability to upload logs for other nodes.

Next plan: Output structured directory logs, including logs from each
node and the polog.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: release/v0.13.0
- vLLM main:
bc0a5a0c08

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-26 11:39:07 +08:00
Li Wang
0f92d34a70 [CI] Pull latest vllm-ascend src before tests (#4988)
### What this PR does / why we need it?
Currently, our image build suffers from errors during cross-compilation,
which sometimes causes the image build to fail (see
https://github.com/vllm-project/vllm-ascend/actions/runs/20152861650/job/57849208186).
As a result, the nightly test code is not always the latest version.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-13 19:04:14 +08:00
Li Wang
283bc5c7ba [Nightly] Optimize nightly CI (#4509)
### What this PR does / why we need it?
1. Optimize multi-node waiting logic
2. Remove the `tee` pipeline for logs, which can cause hangs

### How was this patch tested?


- vLLM version: v0.12.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-04 22:31:07 +08:00
zhangxinyuehfad
8813832387 [Test] Add GLM-4.5 nightly test (#4225)
### What this PR does / why we need it?
Add GLM-4.5 nightly test

- vLLM version: v0.11.2

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-12-01 22:31:56 +08:00
zhangxinyuehfad
67f2b3a031 [Test] Add deepseek v3.2 exp nightly test (#4191)
### What this PR does / why we need it?

- skip the nightly image build when the GitHub event is pull_request
- set imagePullPolicy to Always for multi_node tests
- move multi_node tests earlier so that some resources are cleaned up first
- do not make the nightly tests depend on the nightly image build, for fault tolerance

- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
2025-11-14 15:46:10 +08:00
Li Wang
7294f89e43 [CI] Add daily images build for nightly ci (#3989)
### What this PR does / why we need it?
Given the excessively long build time of our nightly CI, I
recommend pre-installing the necessary packages at confirmed versions in the
Docker image, including Mooncake and vLLM at fixed tags, to reduce the time
required for integration testing. This is expected to reduce the
nightly CI duration by 2 hours.

- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-11-13 20:10:12 +08:00
zhangxinyuehfad
b77b4f1abf [Test] Add nightly test for DeepSeek-V3.2-Exp (#3908)
### What this PR does / why we need it?
Add nightly test for DeepSeek-V3.2-Exp


### How was this patch tested?
test action:

https://github.com/vllm-project/vllm-ascend/actions/runs/19156153634/job/54757008557?pr=3908


- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-11-11 10:29:57 +08:00
Li Wang
259eb25f88 [CI] Quick fix mooncake for nightly-ci (#4028)
### What this PR does / why we need it?
Since we have upgraded to CANN 8.3rc1, we will no longer use the
privately maintained Mooncake repository and will instead use the official
Mooncake release:
https://github.com/kvcache-ai/Mooncake/releases/tag/v0.3.7.post2 .

Next step: this is only a temporary solution. We will integrate Mooncake
into the vllm-ascend base image later for easier use; see
https://github.com/vllm-project/vllm-ascend/pull/3989
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-11-06 18:46:00 +08:00
wangxiyuan
cc2cd42ad3 Upgrade CANN to 8.3.rc1 (#3945)
### What this PR does / why we need it?
This PR upgrades CANN from 8.2rc1 to 8.3rc1 and removes the CANN version
check logic.

TODO: we noticed that UT runs fail with the CANN 8.3 image, so the base
image for UT is still 8.2. We'll fix it later.


- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-03 20:21:07 +08:00
Li Wang
8f222f21f1 [CI][Nightly] Fix mooncake build (#3958)
### What this PR does / why we need it?
Fix https://github.com/vllm-project/vllm-ascend/pull/3943

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-11-03 20:07:47 +08:00
Li Wang
d0cc9c1203 [CI][Nightly] Correct the commit hash available for mooncake (#3943)
### What this PR does / why we need it?
The previous commit hash was accidentally deleted or
overwritten. This patch corrects the commit hash used for
https://github.com/AscendTransport/Mooncake to make nightly ci happy
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-11-01 21:52:16 +08:00
Li Wang
eb0a2ee2d0 [CI] Optimize nightly CI (#3898)
### What this PR does / why we need it?
This patch mainly fixes the problem of not being able to determine the
exit status of the pod's entrypoint script, plus some other minor
optimizations:
1. Shorten wait for server timeout
2. fix typo
3. fix the issue of ais_bench failing to correctly access the proxy URL
in a PD separation scenario.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-30 23:42:20 +08:00
Li Wang
4a2ab13743 [CI] Optimize nightly CI (#3858)
### What this PR does / why we need it?
This patch optimizes nightly CI:
1. Fix the ais_bench `None` repo_type error
2. Fix the kubectl installation error on A2 with the Arm architecture
3. Fix the multi_node CI being unable to determine whether the job
succeeded
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main:
83f478bb19

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-29 22:30:19 +08:00
Li Wang
90ae114569 [CI] Fix nightly CI (#3821)
### What this PR does / why we need it?
This patch fixes the nightly CI run
[failure](https://github.com/vllm-project/vllm-ascend/actions/runs/18848144365)

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.1

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-28 20:40:03 +08:00
Li Wang
f846bd20e4 [CI] Add multi-node test case for a2 (#3805)
### What this PR does / why we need it?
This patch adds a multi-node test case for a2
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main:
c9461e05a4

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-27 23:10:17 +08:00
jiangyunfan1
9030106a14 [TEST]Add 2P1D multi node cases for nightly test (#3764)
### What this PR does / why we need it?
This PR adds the 2P1D multi-node func/acc/perf test cases, which need to be
tested daily
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
by running the test

- vLLM version: v0.11.0rc3
- vLLM main:
c9461e05a4

---------

Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
2025-10-27 23:09:15 +08:00
Li Wang
7f73c28a24 [CI][Doc] Optimize multi-node CI (#3565)
### What this PR does / why we need it?
This pull request mainly does the following:
1. Add a doc for multi-node CI, covering how the mechanism works
and how to contribute
2. Simplify the config YAML to make it more developer-friendly
3. Optimize the mooncake installation script to prevent accidental
failures during installation
4. Fix the workflow to ensure the Kubernetes resources can be applied correctly
5. Add Qwen3-235B-W8A8 disaggregated_prefill test
6. Add GLM-4.5 multi dp test
7. Add 2p1d 4nodes disaggregated_prefill test
8. Refactor nightly tests
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main:
17c540a993

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-25 09:23:47 +08:00
Li Wang
286ae9003d [CI] Multi-Node CI scalable (#3611)
### What this PR does / why we need it?
This PR adds a Jinja template for the k8s configuration file, in preparation
for the upcoming 4-node CI
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-22 14:18:43 +08:00
Li Wang
4c4a8458a5 [CI] Refactor multi-node CI (#3487)
### What this PR does / why we need it?
Refactor the multi-machine CI use case. The purpose of this PR is to
increase the ease of adding multi-machine CI use cases, allowing
developers to add multi-machine cluster model testing use cases
(including PD separation) by simply adding a new YAML configuration
file.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-17 09:04:31 +08:00