xc-llm-ascend

Author	SHA1	Message	Date
pz1116	6fc190b44a	[Doc][KV Pool]Revision KV Pool User Guide [2/2] (#7456 ) ### What this PR does / why we need it? Revise the KV Pool user guide: 4. Revise parameters for Memcache for better clarity, and add a note that heterogeneous protocol settings are currently not supported (e.g. enabling `device_rdma` and `device_sdma` at the same time; an example scenario would be data transfer by Memcache across different super pods) 5. Modify the condition for Mooncakestore warmup: warmup is now needed only when `ASCEND_BUFFER_POOL` is enabled. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.17.0 - vLLM main: `8a680463fa` --------- Signed-off-by: Pz1116 <zpbzpb123123@gmail.com> Co-authored-by: Chao Lei <leichao139636@163.com>	2026-03-19 16:17:34 +08:00
pz1116	3effc4bc70	[Doc][KV Pool]Revision KV Pool User Guide (#7434 ) ### What this PR does / why we need it? Revise the KV Pool user guide: 1. Revise Mooncake environment variables and kvconnector extra configs. 2. Delete `use_ascend_direct` in kv connector extra config as it is deprecated 3. Delete `kv_buffer_device` and `kv_rank` in P2P mooncake config 4. Unifies default `max-model-len` and `max-num-batch-tokens` in examples given. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.17.0 - vLLM main: `4497431df6` --------- Signed-off-by: Pz1116 <zpbzpb123123@gmail.com> Co-authored-by: Chao Lei <leichao139636@163.com>	2026-03-19 10:13:13 +08:00
pz1116	a7820d20f4	[Doc][KV Pool]Update Memcache local service config example: increase default world size to 256 and update description (#7025 ) ### What this PR does / why we need it? Update Memcache local service config example: increase default world size to 256 and update the description for better clarity. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.16.0 - vLLM main: `4034c3d32e` Signed-off-by: Pz1116 <zpbzpb123123@gmail.com>	2026-03-06 10:23:55 +08:00
fems14	ae394767d4	【main】ADXL/HIXL supports FabricMem Mode (#6806 ) ### What this PR does / why we need it? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.15.0 - vLLM main: `83b47f67b1` --------- Signed-off-by: fems14 <1804143737@qq.com>	2026-03-05 21:04:11 +08:00
wangxiyuan	a95c0b8b82	[Doc] fix the nit in docs (#6826 ) Refresh the doc, fix the nit in the docs - vLLM version: v0.15.0 - vLLM main: `83b47f67b1` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-02-27 11:50:27 +08:00
DreamerLeader	905f0764e0	[DOC]Add Memcache Usage Guide (#6476 ) ### What this PR does / why we need it? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.14.1 - vLLM main: `dc917cceb8` --------- Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local> Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local> Co-authored-by: Pz1116 <zpbzpb123123@gmail.com>	2026-02-09 21:55:00 +08:00
Li Wang	d018aeb5fa	[Image] Bump mooncake version to v0.3.8.post1 (#6428 ) ### What this PR does / why we need it? This patch bumps the Mooncake version to the latest [release](https://github.com/kvcache-ai/Mooncake/releases/tag/v0.3.8.post1) ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Tested locally: >>> from mooncake.engine import TransferEngine - vLLM version: v0.14.1 - vLLM main: `dc917cceb8` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2026-02-06 10:54:03 +08:00
DreamerLeader	2dac18afea	[Bugfix]Fix of Pooling Code and Update of Pooling Usage Guide (#6126 ) ### What this PR does / why we need it? Fix of Pooling Code and Update of Pooling Usage Guide ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? PR [[Bugfix]Fixed precision issues caused by pooled request pooling](https://github.com/vllm-project/vllm-ascend/pull/6049) is ready for review - vLLM version: v0.13.0 - vLLM main: `d68209402d` --------- Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local> Signed-off-by: fangjianwei <f30058701@china.huawei.com> Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com> Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local> Co-authored-by: fangjianwei <f30058701@china.huawei.com>	2026-02-04 16:35:41 +08:00
Shanshan Shen	e3eefdecbd	[Doc] Update `max_tokens` to `max_completion_tokens` in all docs (#6248 ) ### What this PR does / why we need it? Fix: ``` DeprecationWarning: max_tokens is deprecated in favor of the max_completion_tokens field. ``` - vLLM version: v0.14.1 - vLLM main: `d68209402d` Signed-off-by: shen-shanshan <467638484@qq.com>	2026-01-26 11:57:40 +08:00
Li Wang	4d780a8b01	[Misc] Revert "[Misc] Bump mooncake version to v0.3.8.post1 (#6110 )" (#6164 ) ### What this PR does / why we need it? The new version of Mooncake led to an image build failure (see https://github.com/vllm-project/vllm-ascend/actions/runs/21236469259/job/61105443733 ), so we should revert it first ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: `d68209402d` Signed-off-by: wangli <wangli858794774@gmail.com>	2026-01-23 09:53:32 +08:00
Li Wang	37a9cf818a	[Misc] Bump mooncake version to v0.3.8.post1 (#6110 ) ### What this PR does / why we need it? Since the mooncake has the newer [release](https://github.com/kvcache-ai/Mooncake/releases/tag/v0.3.8.post1), we pin the tag to latest release ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: `d68209402d` Signed-off-by: wangli <wangli858794774@gmail.com>	2026-01-22 11:03:16 +08:00
SILONG ZENG	4811ba62e0	[Lint]Style: reformat markdown files via markdownlint (#5884 ) ### What this PR does / why we need it? reformat markdown files via markdownlint - vLLM version: v0.13.0 - vLLM main: `bde38c11df` --------- Signed-off-by: root <root@LAPTOP-VQKDDVMG.localdomain> Signed-off-by: MrZ20 <2609716663@qq.com> Co-authored-by: root <root@LAPTOP-VQKDDVMG.localdomain>	2026-01-15 09:06:01 +08:00
liziyu	451bbdc292	[Doc] add tls check to pd disaggregation readme (#5638 ) ### What this PR does / why we need it? update pd disaggregation multi_node readme, update the environment check command for A3, add tls check ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: `8be6432bda` Signed-off-by: liziyu <liziyu16@huawei.com>	2026-01-12 15:49:18 +08:00
baxingpiaochong	46c2fc6a3c	[KVPOOL]decode save kvcache (#5168 ) ### What this PR does / why we need it? Saving the KV cache from the decode side in KV Pool currently supports only MLA ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: baxingpiaochong <771405853@qq.com> Co-authored-by: Chao Lei <leichao139636@163.com>	2026-01-04 22:22:01 +08:00
fems14	2ef4d1979e	[bugfix][main]KV Pool for KV Transfer in PD Disaggregation Scenarios (#5398 ) ### What this PR does / why we need it? 1.KV Pool for KV Transfer in PD Disaggregation Scenarios Error Resolution 2.Update KV Pool Documentation ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: release/v0.13.0 - vLLM main: `254f6b9867` --------- Signed-off-by: fems14 <1804143737@qq.com>	2025-12-27 09:53:57 +08:00
liziyu	190ae55e9f	Add a Mooncake installation tutorial for kv pool and update Mooncake installation tutorial (#5069 ) ### What this PR does / why we need it? Add a Mooncake installation tutorial for kv pool and update Mooncake installation tutorial - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: liziyu <liziyu16@huawei.com> Co-authored-by: Mengqing Cao <cmq0113@163.com>	2025-12-16 19:53:23 +08:00
Chao Lei	b75bfc58f6	[Doc ] Supplement kvpool user guide (#5013 ) ### What this PR does / why we need it? Supplement detailed descriptions for `ASCEND_CONNECT_TIMEOUT` and `ASCEND_TRANSFER_TIMEOUT` in kvpool. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: LCAIZJ <leichao139636@163.com>	2025-12-15 14:24:39 +08:00
wangxiyuan	42ceaf08a1	add release note for 0.12.0 (#4995 ) Add release note for v0.12.0rc1 Update deepseek3.2 tutorial doc - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-13 22:09:59 +08:00
lilinsiman	fc818f1509	[doc][main] Correct mistakes in doc (#4945 ) ### What this PR does / why we need it? Correct mistakes in doc - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: lilinsiman <lilinsiman@gmail.com>	2025-12-12 19:17:10 +08:00
liziyu	688b1332da	[P/D] check kv extra config and del hccl backend (#4547 ) ### What this PR does / why we need it? check kv extra config & del hccl backend - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: liziyu <liziyu16@huawei.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-07 15:19:42 +08:00
herizhen	bb1610dc25	add hyperlink (#4588 ) ### What this PR does / why we need it? add hyperlink ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? ut - vLLM version: v0.11.2 --------- Signed-off-by: herizhen <you@example.com> Co-authored-by: herizhen <you@example.com>	2025-12-02 14:09:03 +08:00
fems14	5447a039b9	[Feature][main]reconstruction kvpool connector to ascend connector (#4438 ) ### What this PR does / why we need it? 1. In short, we renamed the existing MooncakeStoreConnector to AscendStoreConnector and extracted the storage engine interaction logic into a new Backend class. Associated RFC: https://github.com/vllm-project/vllm-ascend/issues/4329 2. Fixed the issue, introduced in vLLM 0.11.2, where the number of input parameters for the connector was incorrect ### Does this PR introduce _any_ user-facing change? MooncakeStoreConnector is changed to AscendStoreConnector ### How was this patch tested? - vLLM version: v0.11.2 --------- Signed-off-by: fems14 <1804143737@qq.com>	2025-11-28 18:08:37 +08:00

22 Commits