xc-llm-ascend

Author	SHA1	Message	Date
lty	dee00d0de3	[Usability]local_buffer_size support for units: GB, MB, KB, B (#4829) What this PR does / why we need it? Improve usability: local_buffer_size now accepts the units GB, MB, KB, and B, for example "2GB". { "local_hostname": "XXX.XXX.XXX.XXX", "metadata_server": "P2PHANDSHAKE", "protocol": "ascend", "device_name": "", "use_ascend_direct": true, "master_server_address": "XXX.XXX.XXX.XXX:50088", "global_segment_size": 60000000000, "local_buffer_size": "2GB" } Does this PR introduce any user-facing change? local_buffer_size now accepts the units GB, MB, KB, and B. How was this patch tested? Mooncake was configured with local_buffer_size values in GB, MB, KB, and B. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: lty <linxianchong1@huawei.com>	2025-12-09 17:52:24 +08:00
baxingpiaochong	dda027e680	[KVPOOl]Support pp (#4761) ### What this PR does / why we need it? Support pp for kv pool - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: baxingpiaochong <771405853@qq.com>	2025-12-09 16:15:26 +08:00
liziyu	688b1332da	[P/D] check kv extra config and del hccl backend (#4547) ### What this PR does / why we need it? check kv extra config & del hccl backend - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: liziyu <liziyu16@huawei.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-07 15:19:42 +08:00
LookAround0301	b32ef53b3b	[long_seq] remove long_seq env (#4660) ### What this PR does / why we need it? remove env VLLM_ASCEND_ENABLE_CONTEXT_PARALLEL - vLLM version: v0.12.0 --------- Signed-off-by: LookAround <lixushi@huawei.com> Signed-off-by: ZhangMingWei716 <2894054457@qq.com> Co-authored-by: ZhangMingWei716 <2894054457@qq.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-05 10:31:49 +08:00
wangxiyuan	7f2673ea2d	upgrade vLLM to main (#4608) 1. fix https://github.com/vllm-project/vllm/pull/28542 The model structure modifications that affect us are: - Qwen2.5-VL (some patches still exist) - Qwen2-VL - Qwen2 - DeepSeek series - Qwen-moe series 2. fix https://github.com/vllm-project/vllm/pull/29121 the output token type has changed from np to `list[list[int]]` 3. fix https://github.com/vllm-project/vllm/pull/29262 the `xformers` backend for multimodal has now been deprecated 4. fix https://github.com/vllm-project/vllm/pull/29342 5. fix https://github.com/vllm-project/vllm/pull/28579 6. fix https://github.com/vllm-project/vllm/pull/28718 7. fix https://github.com/vllm-project/vllm/issues/28665 8. fix https://github.com/vllm-project/vllm/pull/26847 vLLM introduced the `optimization-level`, some default configs have changed, and the param `--enforce-eager` has been deprecated 9. fix http://github.com/vllm-project/vllm/pull/29223 it returns a tuple for the sampler. 10. fix https://github.com/vllm-project/vllm/pull/29471 we'll remove the related patch to avoid this kind of error. Co-authored-by: hfadzxy <starmoon_zhang@163.com> Co-authored-by: wangli <wangli858794774@gmail.com> - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: wangli <wangli858794774@gmail.com> Signed-off-by: hfadzxy <starmoon_zhang@163.com> Co-authored-by: wangli <wangli858794774@gmail.com> Co-authored-by: hfadzxy <starmoon_zhang@163.com>	2025-12-02 22:10:52 +08:00
Slightwind	aa56a0f4b7	[Bugfix] PCP adaptation for VLLM v0.11.2 modifications (#4604) To adapt to the vLLM v0.11.2 image, the method for obtaining PCP size and DCP size has been modified. ___ - vLLM version: v0.11.2 --------- Signed-off-by: SlightwindSec <slightwindsec@gmail.com>	2025-12-01 19:20:32 +08:00
Chao Lei	ff7061317f	[Bugfix] Fix kvpool precision synchronization (#4574) ### What this PR does / why we need it? Fix kvpool precision synchronization Issue https://github.com/vllm-project/vllm-ascend/issues/4412 - vLLM version: v0.11.2 --------- Signed-off-by: LCAIZJ <leichao139636@163.com>	2025-11-30 09:39:07 +08:00
DreamerLeader	4dbe4fd123	[feature]Pooling Features and PCP Adaptation (#4143) This PR lets the pooling kv connector support the pcp feature - vLLM version: v0.11.2 --------- Signed-off-by: fjw <2270923832@qq.com> Signed-off-by: SlightwindSec <slightwindsec@gmail.com> Co-authored-by: SlightwindSec <slightwindsec@gmail.com>	2025-11-29 22:07:45 +08:00
fems14	5447a039b9	[Feature][main]reconstruction kvpool connector to ascend connector (#4438) ### What this PR does / why we need it? 1. In short, we renamed the existing MooncakeStoreConnector to AscendStoreConnector and extracted the storage engine interaction logic into a new Backend class. Associated RFC: https://github.com/vllm-project/vllm-ascend/issues/4329 2. Fixed an incorrect number of input parameters for the connector, an issue introduced in vLLM 0.11.2 ### Does this PR introduce _any_ user-facing change? MooncakeStoreConnector is renamed to AscendStoreConnector ### How was this patch tested? - vLLM version: v0.11.2 --------- Signed-off-by: fems14 <1804143737@qq.com>	2025-11-28 18:08:37 +08:00

9 Commits
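
Commit dee00d0de3 lets local_buffer_size be written with a unit suffix such as "2GB". The log does not show how the value is parsed, so the following is only a minimal sketch of one way such a setting could be converted to bytes. The function name `parse_size` is hypothetical, and the use of binary multiples (1 KB = 1024 B) is an assumption; the real implementation may use decimal multiples or different validation.

```python
import re

# Assumption: binary multiples. The commit log does not say whether
# "GB" means 1024**3 or 10**9 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}

# A number, optionally followed by one of the units named in the commit.
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(GB|MB|KB|B)?\s*$", re.IGNORECASE)


def parse_size(value):
    """Convert a size setting (hypothetical helper) to a byte count.

    Accepts a non-negative integer (already in bytes) or a string
    such as "2GB", "512 mb", or "4096".
    """
    if isinstance(value, bool):
        raise TypeError("size must be an int or str, not bool")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("size must be non-negative")
        return value
    if not isinstance(value, str):
        raise TypeError(f"unsupported size type: {type(value).__name__}")

    match = _SIZE_RE.match(value)
    if match is None:
        raise ValueError(f"invalid size string: {value!r}")

    number, unit = match.groups()
    # A bare number is treated as a byte count.
    multiplier = _UNITS[(unit or "B").upper()]
    return int(float(number) * multiplier)


if __name__ == "__main__":
    config = {"global_segment_size": 60000000000, "local_buffer_size": "2GB"}
    print(parse_size(config["local_buffer_size"]))
    print(parse_size(config["global_segment_size"]))
```

Accepting plain integers alongside unit strings keeps older configurations, where the size is a raw byte count, working unchanged.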