xc-llm-ascend

EngineX/xc-llm-ascend

Commit Graph

Author

SHA1

Message

Date

zhangxinyuehfad

bc5ca2c856

[0.18.0][Bugfix] Restore VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to original value for nightly test (#8794)

### What this PR does / why we need it?
PR #8618 renamed `VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to
`VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT` and simultaneously reduced the
timeout value from 300000 to 480 seconds in the nightly test configs.
The 480s value is far too short for heavy multi-node workloads (DeepSeek
V3/R1 under W8A8 + EP), causing [spurious abort-request
timeouts](https://github.com/vllm-project/vllm-ascend/actions/runs/25067539406/job/73441223206)
in CI.

This PR restores the timeout value to the original 300000 to fix the
nightly test failures introduced by #8618.

Signed-off-by: hfadzxy <starmoon_zhang@163.com>

2026-04-29 14:31:12 +08:00

herizhen

ff76c6780e

[releases/v0.18.0][Doc][Misc] Modifying Configuration Parameters (#8618)

### What this PR does / why we need it?
This PR renames the environment variable VLLM_NIXL_ABORT_REQUEST_TIMEOUT
to VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to align with the Mooncake
connector naming convention. It also updates the documentation and test
configurations to reflect this change and adjusts the suggested timeout
value in the documentation to 480 seconds for consistency.

### Does this PR introduce _any_ user-facing change?
Yes. The environment variable for configuring the abort request timeout
has been renamed. Users should update their environment settings from
VLLM_NIXL_ABORT_REQUEST_TIMEOUT to VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT.
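
For deployments that still export the old name, the rename can be applied mechanically. The following is a minimal sketch, not part of this PR: the helper `migrate_abort_timeout` is hypothetical, and it assumes the value itself carries over unchanged between the two names.

```python
import os

# Variable names taken from the PR description above.
OLD_NAME = "VLLM_NIXL_ABORT_REQUEST_TIMEOUT"
NEW_NAME = "VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT"


def migrate_abort_timeout(env):
    """Return a copy of `env` with the old variable renamed to the new one.

    Hypothetical helper for illustration only. If the new name is already
    set, its value wins and the old entry is simply dropped.
    """
    migrated = dict(env)
    old_value = migrated.pop(OLD_NAME, None)
    if old_value is not None and NEW_NAME not in migrated:
        migrated[NEW_NAME] = old_value
    return migrated


if __name__ == "__main__":
    # Apply the rename to the current process environment and report the result.
    updated = migrate_abort_timeout(os.environ)
    print(NEW_NAME, "=", updated.get(NEW_NAME, "<unset>"))
```

Letting an explicitly set new name take precedence avoids silently overriding a value that has already been migrated.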

### How was this patch tested?
The changes were verified by updating the corresponding test
configuration files and ensuring consistency across the documentation.

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>

2026-04-23 16:23:31 +08:00

starmountain1997

2cb9f76a0f

[CI] Ds32 ep aime2025 (#8496)

Backport of #7882 to releases/v0.18.0. Adds aime2025 benchmark test for
DeepSeek-V3.2-W8A8 EP with disaggregated prefill on A3 (4-node, 16 NPUs
per node, accuracy benchmark baseline 66.67%).

Signed-off-by: guozr <guozr1997@hotmail.com>
Co-authored-by: guozr <guozr1997@hotmail.com>

2026-04-21 19:49:06 +08:00

3 Commits