Commit Graph

3 Commits

Author SHA1 Message Date
zhangxinyuehfad
bc5ca2c856 [0.18.0][Bugfix] Restore VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to original value for nightly test (#8794)
### What this PR does / why we need it?
PR #8618 renamed `VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to
`VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT` and, in the same change, lowered the
timeout value in the nightly test configs from 300000 to 480 seconds.
A 480s timeout is far too short for heavy multi-node workloads (DeepSeek
V3/R1 under W8A8 + EP) and causes [spurious abort-request
timeouts](https://github.com/vllm-project/vllm-ascend/actions/runs/25067539406/job/73441223206)
in CI.

This PR restores the timeout value to the original 300000 to fix the
nightly test failures introduced by #8618.

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-04-29 14:31:12 +08:00
herizhen
ff76c6780e [releases/v0.18.0][Doc][Misc] Modifying Configuration Parameters (#8618)
### What this PR does / why we need it?
This PR renames the environment variable `VLLM_NIXL_ABORT_REQUEST_TIMEOUT`
to `VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT` to align with the Mooncake
connector naming convention. It also updates the documentation and test
configurations to reflect this change and adjusts the suggested timeout
value in the documentation to 480 seconds for consistency.

### Does this PR introduce _any_ user-facing change?
Yes. The environment variable for configuring the abort request timeout
has been renamed. Users should update their environment settings from
`VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to `VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT`.
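
For deployments that still export the old name, the rename can be sketched as a small migration step applied to the environment before launch. This is only an illustration: `migrate_abort_timeout` is a hypothetical helper and not part of vLLM or vllm-ascend, and the rule that an explicitly set new name takes precedence over the old one is an assumption.

```python
import os
from typing import Mapping

OLD_NAME = "VLLM_NIXL_ABORT_REQUEST_TIMEOUT"
NEW_NAME = "VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT"


def migrate_abort_timeout(env: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of `env` that uses only the new variable name.

    Hypothetical helper for illustration; not a vLLM API.
    """
    migrated = dict(env)
    legacy_value = migrated.pop(OLD_NAME, None)
    # Assumption: a value already set under the new name wins;
    # the legacy value is used only as a fallback.
    if legacy_value is not None and NEW_NAME not in migrated:
        migrated[NEW_NAME] = legacy_value
    return migrated


if __name__ == "__main__":
    updated = migrate_abort_timeout(os.environ)
    print(NEW_NAME, "=", updated.get(NEW_NAME, "<unset>"))
```

The function leaves unrelated variables untouched and never writes to `os.environ` itself, so the caller decides where the migrated mapping is applied (for example, as the `env` argument of a subprocess launch).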

### How was this patch tested?
The changes were verified by updating the corresponding test
configuration files and ensuring consistency across the documentation.

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
2026-04-23 16:23:31 +08:00
starmountain1997
2cb9f76a0f [CI] Ds32 ep aime2025 (#8496)
Backport of #7882 to releases/v0.18.0. Adds an aime2025 accuracy benchmark
test for DeepSeek-V3.2-W8A8 EP with disaggregated prefill on A3 (4 nodes,
16 NPUs per node; accuracy baseline 66.67%).

Signed-off-by: guozr <guozr1997@hotmail.com>
Co-authored-by: guozr <guozr1997@hotmail.com>
2026-04-21 19:49:06 +08:00