LeeWenquan
0d773efd70
[CI] Fix qwen3Next Nightly CI config (#7903)
...
### What this PR does / why we need it?
Fix the qwen3Next nightly CI config in 0.18.0.
backport: #7679
Signed-off-by: Your Name <you@example.com>
Co-authored-by: Your Name <you@example.com>
2026-04-03 16:46:25 +08:00
LeeWenquan
9615bc33fd
Fix Qwen3Next CI Config (#7561)
...
### What this PR does / why we need it?
This PR modifies the qwen3Next nightly CI config:
(1) Add a nightly CI job.
(2) Set a more precise accuracy standard.
- vLLM version: v0.18.0
- vLLM main:
6a9cceb219
Signed-off-by: Your Name <you@example.com>
Co-authored-by: Your Name <you@example.com>
2026-03-24 17:08:17 +08:00
LeeWenquan
65eae6de7b
Add Ascend Ops recurrent_gated_delta_rule (#6725)
...
### What this PR does / why we need it?
Change the recurrent_gated_delta_rule op from the Triton implementation to the
Ascend C version for better performance.
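For context, the recurrent gated delta rule referenced here is, in the Gated DeltaNet literature, a linear-attention state update with a per-step decay gate and a delta-rule write. The sketch below is a naive per-timestep NumPy reference of that recurrence, purely for illustration; it is not the Triton or Ascend C kernel from this PR, and all names and shapes are my assumptions:

```python
import numpy as np

def recurrent_gated_delta_rule(q, k, v, alpha, beta):
    """Naive reference of the gated delta rule recurrence (illustrative
    only; not the fused Triton/Ascend C kernel from this PR).

    q, k:  (T, d_k) query/key vectors, one per timestep
    v:     (T, d_v) value vectors
    alpha: (T,) per-step decay gate in (0, 1]
    beta:  (T,) per-step write strength in (0, 1]
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v))      # recurrent state: a key -> value map
    out = np.zeros((T, d_v))
    for t in range(T):
        S = alpha[t] * S          # gated decay of the old state
        pred = k[t] @ S           # what the state currently reads out for k_t
        # delta rule: write the prediction error under key k_t
        S = S + beta[t] * np.outer(k[t], v[t] - pred)
        out[t] = q[t] @ S         # readout for the query
    return out
```

With orthonormal keys and alpha = beta = 1, querying with q_t = k_t recovers v_t exactly, which is a quick sanity check for any fused kernel port.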
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.15.0
- vLLM main:
9562912cea
---------
Signed-off-by: SunnyLee219 <3294305115@qq.com>
2026-03-09 14:14:14 +08:00
SILONG ZENG
859f2c25b9
[Nightly][Refactor] Migrate nightly single-node model tests from .py to .yaml (#6503)
...
### What this PR does / why we need it?
This PR refactors the nightly single-node model tests by migrating test
configurations from Python scripts to a more maintainable YAML-based
format.

| Original PR | Python (`.py`) | YAML (`.yaml`) |
| :--- | :--- | :--- |
| [#3568](https://github.com/vllm-project/vllm-ascend/pull/3568) | `test_deepseek_r1_0528_w8a8_eplb.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| [#3631](https://github.com/vllm-project/vllm-ascend/pull/3631) | `test_deepseek_r1_0528_w8a8.py` | `DeepSeek-R1-0528-W8A8.yaml` |
| [#5874](https://github.com/vllm-project/vllm-ascend/pull/5874) | `test_deepseek_r1_w8a8_hbm.py` | `DeepSeek-R1-W8A8-HBM.yaml` |
| [#3908](https://github.com/vllm-project/vllm-ascend/pull/3908) | `test_deepseek_v3_2_w8a8.py` | `DeepSeek-V3.2-W8A8.yaml` |
| [#5682](https://github.com/vllm-project/vllm-ascend/pull/5682) | `test_kimi_k2_thinking.py` | `Kimi-K2-Thinking.yaml` |
| [#4111](https://github.com/vllm-project/vllm-ascend/pull/4111) | `test_mtpx_deepseek_r1_0528_w8a8.py` | `MTPX-DeepSeek-R1-0528-W8A8.yaml` |
| [#3733](https://github.com/vllm-project/vllm-ascend/pull/3733) | `test_prefix_cache_deepseek_r1_0528_w8a8.py` | `Prefix-Cache-DeepSeek-R1-0528-W8A8.yaml` |
| [#6543](https://github.com/vllm-project/vllm-ascend/pull/6543) | `test_qwen3_235b_w8a8.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| [#6543](https://github.com/vllm-project/vllm-ascend/pull/6543) | `test_qwen3_235b_a22b_w8a8_eplb.py` | `Qwen3-235B-A22B-W8A8.yaml` |
| [#3973](https://github.com/vllm-project/vllm-ascend/pull/3973) | `test_qwen3_30b_w8a8.py` | `Qwen3-30B-A3B-W8A8.yaml` |
| [#3541](https://github.com/vllm-project/vllm-ascend/pull/3541) | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8.yaml` |
| [#3757](https://github.com/vllm-project/vllm-ascend/pull/3757) | `test_qwq_32b.py` | `QwQ-32B.yaml` |
| [#5616](https://github.com/vllm-project/vllm-ascend/pull/5616) | `test_qwen3_next_w8a8.py` | `Qwen3-Next-80B-A3B-Instruct-W8A8.yaml` |
| [#3541](https://github.com/vllm-project/vllm-ascend/pull/3541) | `test_qwen2_5_vl_7b.py` | `Qwen2.5-VL-7B-Instruct.yaml` |
| [#5301](https://github.com/vllm-project/vllm-ascend/pull/5301) | `test_qwen2_5_vl_7b_epd.py` | `Qwen2.5-VL-7B-Instruct-EPD.yaml` |
| [#3707](https://github.com/vllm-project/vllm-ascend/pull/3707) | `test_qwen2_5_vl_32b.py` | `Qwen2.5-VL-32B-Instruct.yaml` |
| [#3676](https://github.com/vllm-project/vllm-ascend/pull/3676) | `test_qwen3_32b_int8_a3_feature_stack3.py` | `Qwen3-32B-Int8-A3-Feature-Stack3.yaml` |
| [#3709](https://github.com/vllm-project/vllm-ascend/pull/3709) | `test_prefix_cache_qwen3_32b_int8.py` | `Prefix-Cache-Qwen3-32B-Int8.yaml` |
| [#5395](https://github.com/vllm-project/vllm-ascend/pull/5395) | `test_qwen3_next.py` | `Qwen3-Next-80B-A3B-Instruct-A2.yaml` |
| [#3474](https://github.com/vllm-project/vllm-ascend/pull/3474) | `test_qwen3_32b.py` | `Qwen3-32B.yaml` |
| [#3541](https://github.com/vllm-project/vllm-ascend/pull/3541) | `test_qwen3_32b_int8.py` | `Qwen3-32B-Int8-A2.yaml` |
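A migration like this turns each nightly case into declarative data that can be reviewed and diffed without reading test code. The actual schema lives in the vllm-ascend repository; purely as a hypothetical sketch (every field name below is illustrative, not the real schema), such a config might separate the model, its serving parameters, and the accuracy standard:

```yaml
# Hypothetical sketch only: field names and values are illustrative
# and do NOT reflect the actual vllm-ascend nightly test schema.
model_name: Qwen/Qwen3-32B
server_args:
  tensor_parallel_size: 4
  max_model_len: 8192
accuracy:
  dataset: gsm8k
  metric: exact_match
  baseline: 0.83      # expected score (illustrative value)
  tolerance: 0.02     # fail if |score - baseline| exceeds this
```

One upside of this shape is that tightening an accuracy standard, as the Qwen3Next CI fix above does, becomes a one-line config change rather than a code edit.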
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0
---------
Signed-off-by: MrZ20 <2609716663@qq.com>
2026-03-03 20:13:43 +08:00