Commit Graph

207 Commits

Author SHA1 Message Date
ming1212
98b9e2e18e Add Qwen3-Next tutorials (#4607)
### What this PR does / why we need it?

This PR provides an introduction to the Qwen3-Next model, details on the
features supported by the model in the current version, the model
deployment process, as well as methods for performance testing and
accuracy testing.

With this document, the deployment and testing of the Qwen3-Next model
can be implemented more easily.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: ming1212 <2717180080@qq.com>
Signed-off-by: ming1212 <104972349+ming1212@users.noreply.github.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-12-15 11:48:22 +08:00
Li Wang
2497bbbaf6 [Misc] Update pooling example (#5002)
### What this PR does / why we need it?
Since the param `task` has been deprecated, we should use the latest
unified standard parameters for pooling models; this should be
clearer.


- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-15 08:36:19 +08:00
wangxiyuan
42ceaf08a1 add release note for 0.12.0 (#4995)
Add release note for v0.12.0rc1
Update deepseek3.2 tutorial doc

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-13 22:09:59 +08:00
lilinsiman
31c94b7e7b [doc][main] Correct more doc mistakes (#4958)
### What this PR does / why we need it?
Correct more doc mistakes

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-12-13 18:36:58 +08:00
lilinsiman
fc818f1509 [doc][main] Correct mistakes in doc (#4945)
### What this PR does / why we need it?
Correct mistakes in doc

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-12-12 19:17:10 +08:00
liziyu
716c4dacfe update qwen2.5vl readme (#4938)
### What this PR does / why we need it?
Fix the Qwen2.5-VL readme: delete the rank-table generation step and add Mooncake installation instructions.


- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: liziyu <liziyu16@huawei.com>
2025-12-12 15:40:07 +08:00
wangxiyuan
e538fa6f9c [Doc] Update tutorial index (#4920)
Update tutorial index and remove useless doc

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-11 20:53:13 +08:00
yangxiaoman8
e1bb6f47ec [doc] Add Qwen2.5 tutorials (#4636)
### What this PR does / why we need it?
Add Qwen2.5 tutorial

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: yangshihao6 <yangshihao6@huawei.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-11 17:30:05 +08:00
wangxiyuan
bb76f7962c cleanup useless torchair logic (#4856)
This PR cleans up useless torchair logic in the model runner. The MoGE
doc is only for torchair, so it can be removed as well.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-12-11 11:21:13 +08:00
SILONG ZENG
ff7d703192 [Doc]Add tutorial document for qwen-VL-Dense (#3516)
### What this PR does / why we need it?
This document uses the qwen3-vl-8b and qwen2.5-vl-32b models to
demonstrate the primary verification steps for the Qwen-VL series dense
models, including supported features, feature configuration, environment
preparation, NPU deployment, and accuracy and performance evaluation.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
2025-12-11 08:55:23 +08:00
Leaf
89a8607b30 add DeepSeek-R1 tutorial. (#4666)
### What this PR does / why we need it?

This PR adds tutorials for the DeepSeek-R1 series models, including the
A2 and A3 series, and provides accuracy validation results.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: Gongdayao <gongdayao@foxmail.com>
2025-12-11 08:52:27 +08:00
wangxiyuan
c77dca54b2 [CI] fix lint (#4888)
Fix lint CI error

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-10 16:57:24 +08:00
wind-all
1a443f2772 add multi_npu_qwen3_dense tutorials (#4543)
### What this PR does / why we need it?

This PR adds tutorials for the Qwen3-Dense series models, including the
A2 and A3 series, and provides accuracy validation results.



- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: wind-all <anyuting@h-partners.com>
2025-12-10 16:09:56 +08:00
Ruri
ce5872705e [Feat] Support native Kimi-K2-Thinking native W4A16 quantized experts weights (#4516)
### What this PR does / why we need it?

Adds W4A16 quantization method for the Kimi-K2-Thinking model and
updates relevant modules to support the new quantization method.

- Implements complete W4A16 quantization method including weight
packing/unpacking, per-group quantization parameter generation,
post-processing logic and MoE method application.
- Adds parameters `use_int4_w4a16`, `w1_offset` and `w2_offset`, adjusts
`with_quant` conditional logic to support W4A16 matrix multiplication.
- Adds `packed_modules_model_mapping` for Kimi-K2-Thinking model and
processing logic for `weight_packed` field.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>
Signed-off-by: Ruri <33858552+zhoux77899@users.noreply.github.com>
Signed-off-by: Ruri <zhouxiang100@huawei.com>
2025-12-10 15:58:52 +08:00
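The W4A16 scheme described above stores two signed int4 weights per byte. A minimal NumPy sketch of such packing/unpacking (hypothetical helper names `pack_int4`/`unpack_int4`, not the PR's actual code):

```python
import numpy as np

def pack_int4(w: np.ndarray) -> np.ndarray:
    """Pack pairs of int4 values (range [-8, 7]) into one uint8 each."""
    assert w.shape[-1] % 2 == 0
    u = (w.astype(np.int8) & 0x0F).astype(np.uint8)  # two's-complement nibbles
    return (u[..., 0::2] | (u[..., 1::2] << 4)).astype(np.uint8)

def unpack_int4(p: np.ndarray) -> np.ndarray:
    """Inverse of pack_int4: recover the signed int4 values."""
    lo = (p & 0x0F).astype(np.int8)
    hi = ((p >> 4) & 0x0F).astype(np.int8)
    # Sign-extend the 4-bit nibbles back to signed values.
    lo = np.where(lo > 7, lo - 16, lo)
    hi = np.where(hi > 7, hi - 16, hi)
    out = np.empty(p.shape[:-1] + (p.shape[-1] * 2,), dtype=np.int8)
    out[..., 0::2] = lo
    out[..., 1::2] = hi
    return out

w = np.array([[-8, 7, 3, -1]], dtype=np.int8)
assert np.array_equal(unpack_int4(pack_int4(w)), w)  # round-trip holds
```

The real implementation additionally carries per-group scales and the `w1_offset`/`w2_offset` parameters mentioned above; this sketch only shows the bit-packing idea.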
wangxiyuan
835b4c8f1d Drop torchair (#4814)
aclgraph is stable and fast now, so let's drop the torchair graph mode.

TODO: some logic to adapt torchair should be cleaned up as well. We'll
do it in the following PR.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-12-10 09:20:40 +08:00
wangxiaoteng888
a77045f355 [P/D][main]Offline the llmdatadist connector related parts of the code and files. (#4780)
### What this PR does / why we need it?
As support for the mooncake connector is now available, the llmdatadist
connector is no longer being maintained, so the llmdatadist-related
files need to be retired.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By ci

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Signed-off-by: liziyu <liziyu16@huawei.com>
Co-authored-by: liziyu <liziyu16@huawei.com>
2025-12-09 22:36:43 +08:00
linfeng-yuan
56f01820e8 [Docs]fix the configuration conflicts in documentation (#4823)
### What this PR does / why we need it?
Fix configuration error in our documentations.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
NA.

Signed-off-by: linfeng-yuan <1102311262@qq.com>
2025-12-09 15:37:38 +08:00
xuyexiong
193dc1703f [Doc] Add Qwen3-235B tutorial (#4358)
### What this PR does / why we need it?
Add Qwen3-235B tutorial including the following examples
- Single-node Online Deployment for 128k context inference
- Multi-node Deployment with MP


- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: xuyexiong <xuyexiong@huawei.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-08 20:06:46 +08:00
liziyu
688b1332da [P/D] check kv extra config and del hccl backend (#4547)
### What this PR does / why we need it?
Check the KV extra config and delete the hccl backend.


- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: liziyu <liziyu16@huawei.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-07 15:19:42 +08:00
mazhixin000
3740b3edfc [main][Doc] Add 2P1D instruction for single node (#4716)
### What this PR does / why we need it?
Add the description for 2P1D, keeping it consistent with the content in
the dev branch.

### Does this PR introduce _any_ user-facing change?
no


- vLLM version: v0.12.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0

Signed-off-by: mazhixin000 <mazhixinkorea@163.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-05 18:35:18 +08:00
1092626063
b84c9afbf5 [doc fix] deepseekv3.1 (#4645)
### What this PR does / why we need it?
Fix the DeepSeek-V3.1 doc to recommend that developers use Mooncake instead of LLMDatadist.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Signed-off-by: AiChiMomo <1092626063@qq.com>
2025-12-02 21:49:13 +08:00
1092626063
eabedf43aa [Doc] Refactor the DeepSeek-V3.1 tutorial. (#4399)
### What this PR does / why we need it?
Refactor the DeepSeek-V3.1 tutorial. 

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: 1092626063 <1092626063@qq.com>
2025-12-02 18:46:30 +08:00
yeyifan
8907010815 [Doc] Add tutorial for Qwen3-Coder-30B-A3B (#4391)
### What this PR does / why we need it?
Add tutorial for Qwen3-Coder-30B-A3B

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: nsdie <yeyifan@huawei.com>
Signed-off-by: herizhen <you@example.com>
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
Signed-off-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Signed-off-by: weijinqian0 <1184188277@qq.com>
Co-authored-by: Li Wang <wangli858794774@gmail.com>
Co-authored-by: herizhen <59841270+herizhen@users.noreply.github.com>
Co-authored-by: herizhen <you@example.com>
Co-authored-by: Yizhou <136800916+yiz-liu@users.noreply.github.com>
Co-authored-by: jiangyunfan1 <jiangyunfan1@h-partners.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: XiaoxinWang <963372609@qq.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
2025-12-02 16:03:37 +08:00
wangxiyuan
cb33b09179 [Doc]clean up ascend scheduler config from doc (#4612)
clean up ascend scheduler config from doc

- vLLM version: v0.11.2

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-02 14:22:56 +08:00
zhangyiming
c097790370 [Doc] Fix DeepSeek-V3.2-Exp doc, add docker command. (#4479)
### What this PR does / why we need it?
Fix DeepSeek-V3.2-Exp doc, add docker command.

- vLLM version: v0.11.2

Signed-off-by: menogrey <1299267905@qq.com>
2025-12-01 22:29:21 +08:00
Mengqing Cao
517fd9272d Revert "drop ascend scheduler" (#4580)
Reverts vllm-project/vllm-ascend#4498
- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
2025-11-29 22:20:48 +08:00
wangxiyuan
f10acddb78 drop ascend scheduler (#4498)
The Ascend scheduler was added for the non-chunked-prefill case, since
the NPU ops didn't work well with chunked prefill.

Now that the ops work better with chunked prefill, it's time to remove
the Ascend scheduler and use the vLLM default scheduler.

- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-29 16:18:34 +08:00
wangxiyuan
8ebbf13c1a Update triton package name (#4563)
Add the `aarch64` suffix to make sure the package name is correct.


- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-29 15:00:40 +08:00
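Wheel platform tags like the `aarch64` suffix above can be checked mechanically; a small sketch (filenames are illustrative, not the actual package names from the PR):

```python
def is_aarch64_wheel(filename: str) -> bool:
    """Check that a wheel's platform tag targets aarch64.

    Wheel naming (PEP 427): {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    platform_tag = filename.removesuffix(".whl").split("-")[-1]
    return "aarch64" in platform_tag

# Illustrative filenames only:
assert is_aarch64_wheel("triton_ascend-3.2.0-cp311-cp311-manylinux_2_27_aarch64.whl")
assert not is_aarch64_wheel("triton-3.2.0-cp311-cp311-manylinux2014_x86_64.whl")
```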
Ting FU
b747c95cfa [Doc] Add single NPU tutorial for Qwen2.5-Omni-7B (#4446)
### What this PR does / why we need it?
Add single NPU tutorial for Qwen2.5-Omni-7B

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: Ting FU <futing10@huawei.com>
2025-11-29 11:57:29 +08:00
wangxiyuan
048d350f9e update triton package url (#4552)
The Triton package URL is not correct. This PR fixes it.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-28 21:00:49 +08:00
wangxiaoteng888
366d2d95e8 [P/D] Add readme for PD separation (#4182)
### What this PR does / why we need it?
Add readme for PD separation

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By ci

- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Signed-off-by: liziyu <liziyu16@huawei.com>
Co-authored-by: liziyu <liziyu16@huawei.com>
2025-11-28 15:17:59 +08:00
SILONG ZENG
ab37a7d5ae [main]Upgrade cann to 8.3rc2 (#4350)
### What this PR does / why we need it?
Upgrade CANN to 8.3.RC2.

### Does this PR introduce _any_ user-facing change?
Yes, docker image will use 8.3.RC2


- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
2025-11-28 14:06:01 +08:00
herizhen
d252e36ae8 Change comment location (#4432)
### What this PR does / why we need it?
When running `python example.py`, connection issues often occur. The
solution is to comment out the first line of the code.
Complete the specific names of the A2 and A3 machines.
Standardize the document format: a space should be added after the
colon.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut

- vLLM version: v0.11.2

---------

Signed-off-by: herizhen <you@example.com>
Co-authored-by: herizhen <you@example.com>
2025-11-26 16:13:31 +08:00
Li Wang
b5f7a83927 [Doc] Upgrade multi-node doc (#4365)
### What this PR does / why we need it?
When using the `Ascend scheduler`, the param `max_num_batched_tokens`
should be larger than `max_model_len`; otherwise, the following error
will be encountered:
```shell
Value error, Ascend scheduler is enabled without chunked prefill feature. Argument max_num_batched_tokens (4096) is smaller than max_model_len (32768). This effectively limits the maximum sequence length to max_num_batched_tokens and makes vLLM reject longer sequences. Please increase max_num_batched_tokens or decrease max_model_len. [type=value_error, input_value=ArgsKwargs((), {'model_co...g': {'enabled': True}}}), input_type=ArgsKwargs]
```

### Does this PR introduce _any_ user-facing change?
Users/developers running the model according to the
[tutorial](https://docs.vllm.ai/projects/ascend/en/latest/tutorials/multi_node.html)
can now specify the parameters correctly.

### How was this patch tested?

- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-11-24 10:57:50 +08:00
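The error quoted above amounts to a simple argument validation; a minimal Python sketch (hypothetical helper mirroring the error text, not vLLM's actual code):

```python
def check_scheduler_args(max_num_batched_tokens: int, max_model_len: int) -> None:
    """Reject configs where the token batch cap would silently truncate sequences."""
    if max_num_batched_tokens < max_model_len:
        raise ValueError(
            f"Argument max_num_batched_tokens ({max_num_batched_tokens}) is "
            f"smaller than max_model_len ({max_model_len}). Please increase "
            "max_num_batched_tokens or decrease max_model_len."
        )

# The fixed tutorial config passes the check:
check_scheduler_args(max_num_batched_tokens=32768, max_model_len=32768)
```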
mazhixin000
ab51fcea4c [Doc]Add single node PD disaggregation instructions (#4337)
### What this PR does / why we need it?
Add single-node PD disaggregation instructions for the Qwen2.5-VL model.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?


- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: mazhixin <mazhixin7@huawei.com>
Signed-off-by: mazhixin000 <mazhixinkorea@163.com>
Co-authored-by: mazhixin <mazhixin7@huawei.com>
2025-11-22 23:33:07 +08:00
liziyu
a30261f779 [P/D] pd proxy support ipv6 (#4161)
### What this PR does / why we need it?
The PD proxy now supports IPv6; the mooncake connector checks whether an
IPv6 address is used and notifies the user.


- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: liziyu <liziyu16@huawei.com>
2025-11-18 11:01:13 +08:00
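The IPv6 handling described above boils down to detecting an IPv6 literal and formatting the endpoint unambiguously; a minimal sketch using the standard library (hypothetical helper name, not the connector's actual code):

```python
import ipaddress

def format_endpoint(host: str, port: int) -> str:
    """Wrap IPv6 literals in brackets so 'host:port' stays unambiguous."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # a hostname, not an IP literal
    return f"[{host}]:{port}" if is_v6 else f"{host}:{port}"

assert format_endpoint("127.0.0.1", 8000) == "127.0.0.1:8000"
assert format_endpoint("::1", 8000) == "[::1]:8000"
```

Bracketing matters because a bare IPv6 literal such as `::1:8000` cannot be split into host and port reliably.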
lilinsiman
adee9dd3b1 [Info][main] Correct the mistake in information documents (#4157)
### What this PR does / why we need it?
Correct the mistake in information documents

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
ut

- vLLM version: v0.11.0
- vLLM main:
2918c1b49c

---------

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-11-13 15:53:58 +08:00
zhangyiming
c9e5b90f53 [Doc] Fix DeepSeek-3.2-Exp doc, remove v0.11.0rc0 outdated infos. (#4095)
### What this PR does / why we need it?
Fix DeepSeek-3.2-Exp doc, remove v0.11.0rc0 outdated infos.

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: menogrey <1299267905@qq.com>
2025-11-12 09:11:31 +08:00
wangxiyuan
f811a24bf0 Remove VLLM_USE_V1 (#4086)
Drop VLLM_USE_V1 usage. This env var has already been removed from vLLM.

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-11 15:43:39 +08:00
22dimensions
e6625bb582 [Doc] add qwen3 w4a4 tutorial (#4076)
### What this PR does / why we need it?
v0.11.0rc1 will introduce the w4a4 quantization feature, so add this
tutorial.

### Does this PR introduce _any_ user-facing change?

No


- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2025-11-10 20:30:07 +08:00
zhangyiming
a74e76b02d [Doc] Remove extra MLAPO installation step for DeepSeek-V3.2. (#4024)
### What this PR does / why we need it?
Remove extra MLAPO installation step for DeepSeek-V3.2.

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: menogrey <1299267905@qq.com>
2025-11-10 09:09:59 +08:00
lilinsiman
a3ff765c65 [Info][main] Corrected the errors in the information (#4055)
### What this PR does / why we need it?
Corrected the errors in the information

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
ut

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-11-08 18:48:59 +08:00
Li Wang
259eb25f88 [CI] Quick fix mooncake for nightly-ci (#4028)
### What this PR does / why we need it?
Since we have upgraded to CANN 8.3rc1, we will no longer use the
privately maintained Mooncake repository, but instead use the official
Mooncake release:
https://github.com/kvcache-ai/Mooncake/releases/tag/v0.3.7.post2

Next step: this is only a temporary solution. We will integrate Mooncake
into the vllm-ascend base image later for easier use; see
https://github.com/vllm-project/vllm-ascend/pull/3989
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-11-06 18:46:00 +08:00
zhangyiming
5f08e07208 [Doc] Refactor the DeepSeek-V3.2-Exp tutorial. (#3871)
### What this PR does / why we need it?
Refactor the DeepSeek-V3.2-Exp tutorial.

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

---------

Signed-off-by: menogrey <1299267905@qq.com>
2025-11-04 18:58:33 +08:00
zxr2333
15bb5098ad [PD Disaggregation]Set adxl engine as default backend and update README (#3761)
### What this PR does / why we need it?
Set the adxl engine as the default Mooncake backend, because Ascend
Transport is no longer maintained.
Update the README to include instructions for installing Mooncake with
the adxl backend.
### Does this PR introduce _any_ user-facing change?
Users need to compile and install the mooncake backend for adxl
according to the revised README instructions.
### How was this patch tested?
By CI.

- vLLM version: v0.11.0
- vLLM main:
83f478bb19

Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
2025-11-04 16:06:39 +08:00
zhangxinyuehfad
789ba4c5c2 [Doc] Update doc (#3836)
### What this PR does / why we need it?

Update doc

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.1

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-10-29 11:03:39 +08:00
Shanshan Shen
3e5ae49160 [MM][Doc] Update online serving tutorials for Qwen2-Audio (#3606)
### What this PR does / why we need it?
Update online serving tutorials for `Qwen2-Audio`.

Part of https://github.com/vllm-project/vllm-ascend/issues/3508.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: shen-shanshan <467638484@qq.com>
2025-10-27 16:58:03 +08:00
zhangyiming
ebfd09a075 [Doc] Update the Pangu Pro MoE tutorials. (#3651)
### What this PR does / why we need it?
Update the Pangu Pro MoE tutorials.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: menogrey <1299267905@qq.com>
2025-10-23 20:41:47 +08:00
Crazyang
f06a6cad1b [Doc] Update the modelslim website from gitee to gitcode. (#3615)
### What this PR does / why we need it?

Because the ModelSlim code repository has migrated from gitee to
gitcode, all relevant links in the repository have been updated.

[migration
notice](https://gitee.com/ascend/msit/tree/master/.%E6%9C%AC%E9%A1%B9%E7%9B%AE%E5%B7%B2%E7%BB%8F%E6%AD%A3%E5%BC%8F%E8%BF%81%E7%A7%BB%E8%87%B3%20Gitcode%20%E5%B9%B3%E5%8F%B0)

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Crazyang <im.crazyang@gmail.com>
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Co-authored-by: weichen <calvin_zhu0210@outlook.com>
2025-10-23 15:38:16 +08:00
Li Wang
ca104ce6f0 [Doc] Upgrade docker run command (#3645)
### What this PR does / why we need it?
Update the docker run command, specifically: add --shm-size=1g
### Does this PR introduce _any_ user-facing change?
For users/developers using Docker to pull vllm-ascend, the shared memory
of the container will be increased from the default 64 MB to 1 GB.

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-10-23 11:17:26 +08:00
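The upgraded `docker run` invocation described above would look roughly like this config fragment (image name, tag, and device path are placeholders, not the exact command from the doc):

```shell
# Raise the container's shared memory from the 64 MB default to 1 GB
# via --shm-size=1g; other flags and the image tag are illustrative.
docker run --rm -it \
  --shm-size=1g \
  --device /dev/davinci0 \
  --network host \
  quay.io/ascend/vllm-ascend:latest \
  bash
```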