### What this PR does / why we need it?
Fix the OOM (Out-of-Memory) error in the single-node-deepseek-v3-2-w8a8
nightly test of vllm-ascend:
- Reduced the value of HCCL_BUFFSIZE
- Lowered the gpu-memory-utilization
Optimize service-side performance:
Updated service-oriented configuration parameters (e.g., max-num-seqs,
cudagraph_capture_sizes, batch_size) to improve the inference
performance,so that the performance is closer to the optimal performance
of the current mainline.
Align performance baseline with main branch:
Updated the performance baseline according to the latest performance
data
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
The test has passed.
https://github.com/vllm-project/vllm-ascend/actions/runs/23734079080/job/69134387320?pr=7793
---------
Signed-off-by: wyh145 <1987244901@qq.com>
### What this PR does / why we need it?
This pr modifies qwen3Next nightly CI config.
(1) Add a nightly CI .
(2) Set a more precise accuracy standard
- vLLM version: v0.18.0
- vLLM main:
6a9cceb219
Signed-off-by: Your Name <you@example.com>
Co-authored-by: Your Name <you@example.com>
### What this PR does / why we need it?
Add acc nightly CI test cases for the GLM-4.7 model.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
through CI
- vLLM version: v0.17.0
- vLLM main:
4034c3d32e
---------
Signed-off-by: zjks98 <zhangjiakang4@huawei.com>
Co-authored-by: zjks98 <zhangjiakang4@huawei.com>
### What this PR does / why we need it?
1. Add nightly test on MiniMax-M2.5 with deployment method on A3
2. Add MiniMax-M2.5 deployment introduction to vllm-ascend docs
- vLLM version: v0.17.0
- vLLM main:
4034c3d32e
---------
Signed-off-by: limuyuan <limuyuan3@huawei.com>
Signed-off-by: SparrowMu <52023119+SparrowMu@users.noreply.github.com>
Co-authored-by: limuyuan <limuyuan3@huawei.com>
### What this PR does / why we need it?
Change recurrent_gated_delta_rule ops from triton to ascend C version
for better performance.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.15.0
- vLLM main:
9562912cea
---------
Signed-off-by: SunnyLee219 <3294305115@qq.com>