xc-llm-ascend

Author	SHA1	Message	Date
herizhen	ff76c6780e	[releases/v0.18.0][Doc][Misc] Modifying Configuration Parameters (#8618) ### What this PR does / why we need it? This PR renames the environment variable VLLM_NIXL_ABORT_REQUEST_TIMEOUT to VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT to align with the Mooncake connector naming convention. It also updates the documentation and test configurations to reflect this change and adjusts the suggested timeout value in the documentation to 480 seconds for consistency. ### Does this PR introduce _any_ user-facing change? Yes. The environment variable for configuring the abort request timeout has been renamed. Users should update their environment settings from VLLM_NIXL_ABORT_REQUEST_TIMEOUT to VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT. ### How was this patch tested? The changes were verified by updating the corresponding test configuration files and ensuring consistency across the documentation. --------- Signed-off-by: herizhen <1270637059@qq.com> Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>	2026-04-23 16:23:31 +08:00
dsxsteven	325cb16e3f	[BugFix][CI] Fix DeepSeek-R1-W8A8-longseq nightly CI (#6297) ### What this PR does / why we need it? The precision issue arose because the KV cache of the P-node had not been fetched for an extended period (>6 min) and was forcibly freed. To avoid this problem, the batch size was reduced and the timeout was extended. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.14.1 - vLLM main: `dc917cceb8` Signed-off-by: dsxsteven <dsxsteven@sina.com>	2026-01-28 16:36:24 +08:00
lty	295018ec0f	[Refactor] Refactor of vllm_ascend/distributed module (#5719) ### What this PR does / why we need it? Based on the RFC: https://github.com/vllm-project/vllm-ascend/issues/5604 This PR refactors vllm_ascend/distributed, moving all kv_transfer related code into a dedicated folder, as has already been done in vLLM. ### Does this PR introduce _any_ user-facing change? NA ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` --------- Signed-off-by: lty <linhebiwen@gmail.com>	2026-01-15 08:57:40 +08:00
Nengjun Ma	297f6deb09	[CI] Align multi-node nightly test parameters with corresponding tutorial documents (#5756) ### What this PR does / why we need it? Align multi-node nightly test parameters with the tutorial documents. ### Does this PR introduce _any_ user-facing change? NA ### How was this patch tested? Tested locally and with the nightly e2e multi-node test cases. - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` --------- Signed-off-by: leo-pony <nengjunma@outlook.com>	2026-01-12 09:00:31 +08:00
dsxsteven	129ba9fe1b	[BugFix] Fix Smoke Testing Bug for DSR1 longseq (#5613) ### What this PR does / why we need it? Fix the smoke testing bug for DSR1 longseq. This change is needed because the daily smoke test case throws an error: "max_tokens or max_completion_tokens is too large: 32768.This model's maximum context length is 32768 tokens and your request has 128 input tokens". The error occurs because max-out-len equals max-model-len, which leaves no room for the input tokens. It can be fixed by increasing the max-model-len argument in the script. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: `7157596103` Signed-off-by: daishixun <dsxsteven@sina.com>	2026-01-05 22:40:28 +08:00
dsxsteven	37fd48bee5	[CI] Move longseq Nightly CI (#5577) ### What this PR does / why we need it? Move the longseq nightly CI to the correct path, following #5479 ([1/N] Refactor nightly test structure). Signed-off-by: daishixun <dsxsteven@sina.com>	2026-01-04 15:42:43 +08:00
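
The error quoted in the #5613 commit above comes down to simple arithmetic: the input tokens plus the requested output tokens must fit within the model's maximum context length. The following is a minimal sketch of that check, not vLLM's actual validation code; the function names `fits_context` and `min_model_len` are hypothetical and exist only for illustration.

```python
def fits_context(input_tokens: int, max_tokens: int, max_model_len: int) -> bool:
    """Return True if prompt plus requested output fits in the context window.

    Simplified model of the constraint behind the quoted error message;
    the real check in vLLM may consider additional details.
    """
    return input_tokens + max_tokens <= max_model_len


def min_model_len(input_tokens: int, max_tokens: int) -> int:
    """Smallest max-model-len that admits the request under this simplified model."""
    return input_tokens + max_tokens


if __name__ == "__main__":
    # Values taken from the error message quoted in the commit.
    print(fits_context(input_tokens=128, max_tokens=32768, max_model_len=32768))  # False
    print(min_model_len(input_tokens=128, max_tokens=32768))  # 32896
```

With the values from the error message, the request needs at least 32896 tokens of context, which is why raising max-model-len (rather than leaving it equal to max-out-len) resolves the failure.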

6 Commits
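
The rename described in the #8618 commit means that launch scripts still exporting VLLM_NIXL_ABORT_REQUEST_TIMEOUT need updating. As a rough sketch of one way to migrate an environment mapping, the hypothetical helper below (`migrate_abort_timeout` is not part of vLLM or vllm-ascend) carries a value over from the old name to the new one. It assumes that an explicitly set new variable should take precedence, which the commit message does not state.

```python
import os
import warnings

OLD_NAME = "VLLM_NIXL_ABORT_REQUEST_TIMEOUT"
NEW_NAME = "VLLM_MOONCAKE_ABORT_REQUEST_TIMEOUT"


def migrate_abort_timeout(env: dict) -> dict:
    """Return a copy of env with the timeout stored only under the new name.

    Hypothetical helper for illustration. Assumption: if both names are set,
    the value under the new name wins and the old one is dropped.
    """
    migrated = dict(env)
    old_value = migrated.pop(OLD_NAME, None)
    if old_value is not None:
        warnings.warn(f"{OLD_NAME} has been renamed to {NEW_NAME}")
        # Only copy the old value when the new name is not already set.
        migrated.setdefault(NEW_NAME, old_value)
    return migrated


if __name__ == "__main__":
    # 480 seconds is the value suggested in the documentation per the commit.
    legacy_env = {OLD_NAME: "480"}
    print(migrate_abort_timeout(legacy_env))
    # Applying the same migration to the real process environment:
    updated = migrate_abort_timeout(dict(os.environ))
    print(OLD_NAME in updated)  # False
```

The helper works on a copy rather than mutating `os.environ` directly, so a caller can inspect the result before deciding how to launch the server process.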