xc-llm-ascend

Author	SHA1	Message	Date
zhangxinyuehfad	bc5ca2c856	[0.18.0][Bugfix] Restore VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to original value for nightly test (#8794) ### What this PR does / why we need it? PR #8618 renamed `VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to `VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT` and simultaneously reduced the timeout value from 300000 to 480 seconds in the nightly test configs. The 480s value is far too short for heavy multi-node workloads (DeepSeek V3/R1 under W8A8 + EP), causing [spurious abort-request timeouts](https://github.com/vllm-project/vllm-ascend/actions/runs/25067539406/job/73441223206) in CI. This PR restores the timeout value to the original 300000 to fix the nightly test failures introduced by #8618. Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2026-04-29 14:31:12 +08:00
herizhen	ff76c6780e	[releases/v0.18.0][Doc][Misc] Modifying Configuration Parameters (#8618) ### What this PR does / why we need it? This PR renames the environment variable `VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to `VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT` to align with the Mooncake connector naming convention. It also updates the documentation and test configurations to reflect this change and adjusts the suggested timeout value in the documentation to 480 seconds for consistency. ### Does this PR introduce _any_ user-facing change? Yes. The environment variable for configuring the abort request timeout has been renamed. Users should update their environment settings from `VLLM_NIXL_ABORT_REQUEST_TIMEOUT` to `VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT`. ### How was this patch tested? The changes were verified by updating the corresponding test configuration files and ensuring consistency across the documentation. --------- Signed-off-by: herizhen <1270637059@qq.com> Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>	2026-04-23 16:23:31 +08:00
pz1116	3effc4bc70	[Doc][KV Pool]Revision KV Pool User Guide (#7434) ### What this PR does / why we need it? Revise the KV Pool user guide: 1. Revise Mooncake environment variables and kv connector extra configs. 2. Delete `use_ascend_direct` in kv connector extra config, as it is deprecated. 3. Delete `kv_buffer_device` and `kv_rank` in the P2P Mooncake config. 4. Unify the default `max-model-len` and `max-num-batch-tokens` in the examples given. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.17.0 - vLLM main: `4497431df6` --------- Signed-off-by: Pz1116 <zpbzpb123123@gmail.com> Co-authored-by: Chao Lei <leichao139636@163.com>	2026-03-19 10:13:13 +08:00
zhangxinyuehfad	566c367a10	[CI] Add DeepSeek-V3.2 large EP nightly ci (#6378) ### What this PR does / why we need it? Add DeepSeek-V3.2 nightly CI. Fix PD routing to exclude headless nodes when collecting prefiller/decoder IPs. - vLLM version: v0.14.1 - vLLM main: `dc917cceb8` Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2026-03-04 16:15:56 +08:00

4 Commits