Commit Graph

4 Commits

Author SHA1 Message Date
zhangxinyuehfad
bc5ca2c856 [0.18.0][Bugfix] Restore VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to original value for nightly test (#8794)
### What this PR does / why we need it?
PR #8618 renamed `VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to
`VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT` and simultaneously reduced the
timeout value from 300000 to 480 seconds in the nightly test configs.
The 480s value is far too short for heavy multi-node workloads (DeepSeek
V3/R1 under W8A8 + EP), causing [spurious abort-request
timeouts](https://github.com/vllm-project/vllm-ascend/actions/runs/25067539406/job/73441223206)
in CI.

This PR restores the timeout value to the original 300000 to fix the
nightly test failures introduced by #8618.

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-04-29 14:31:12 +08:00
herizhen
ff76c6780e [releases/v0.18.0][Doc][Misc] Modifying Configuration Parameters (#8618)
### What this PR does / why we need it?
This PR renames the environment variable VLLM_NIXL_ABORT_REQUEST_TIMEOUT
to VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to align with the Mooncake
connector naming convention. It also updates the documentation and test
configurations to reflect this change and adjusts the suggested timeout
value in the documentation to 480 seconds for consistency.

### Does this PR introduce _any_ user-facing change?
Yes. The environment variable for configuring the abort request timeout
has been renamed. Users should update their environment settings from
VLLM_NIXL_ABORT_REQUEST_TIMEOUT to VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT.

### How was this patch tested?
The changes were verified by updating the corresponding test
configuration files and ensuring consistency across the documentation.

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
2026-04-23 16:23:31 +08:00
pz1116
3effc4bc70 [Doc][KV Pool]Revision KV Pool User Guide (#7434)
### What this PR does / why we need it?
Revise the KV Pool user guide:
1. Revise Mooncake environment variables and kvconnector extra configs.
2. Delete `use_ascend_direct` in kv connector extra config as it is
deprecated
3. Delete `kv_buffer_device` and `kv_rank` in P2P mooncake config
4. Unifies default `max-model-len` and `max-num-batch-tokens` in
examples given.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.17.0
- vLLM main:
4497431df6

---------

Signed-off-by: Pz1116 <zpbzpb123123@gmail.com>
Co-authored-by: Chao Lei <leichao139636@163.com>
2026-03-19 10:13:13 +08:00
zhangxinyuehfad
566c367a10 [CI] Add DeepSeek-V3.2 large EP nightly ci (#6378)
### What this PR does / why we need it?

Add DeepSeek-V3.2 nightly ci

Fix PD routing to exclude headless nodes when collecting
prefiller/decoder IPs

- vLLM version: v0.14.1
- vLLM main:
dc917cceb8

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2026-03-04 16:15:56 +08:00