xc-llm-ascend

Author	SHA1	Message	Date
Qiu	38cfcd572a	[doc](cp) correct the prefill of GQA and adjust desc of block table. (#5697) ### What this PR does / why we need it? Correct the sequence length of KV for the prefill of GQA and clarify the description of block table distribution in the developer guide. - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` --------- Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>	2026-01-19 18:53:48 +08:00
Qiu	b10ef9b9f3	[docs] Correct image about prefill phase of PCP (#5598) ### What this PR does / why we need it? Remove the incorrectly depicted DCP all_gather operation from the diagram of the PCP prefill stage for GQA. Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>	2026-01-05 20:21:59 +08:00
InSec	7cf65d0581	[Doc]modify the quantization user guide and add a quantization adaptation developer guide (#5554) ### What this PR does / why we need it? This PR makes the following modifications: 1. Delete `user_guide/feature_guide/quantization-llm-compressor.md` and merge it into `user_guide/feature_guide/quantization.md`. 2. Update the content of `user_guide/feature_guide/quantization.md`. 3. Add the guide `developer_guide/feature_guide/quantization.md` on adapting quantization algorithms and quantized models. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: `7157596103` --------- Signed-off-by: IncSec <1790766300@qq.com> Signed-off-by: InSec <1790766300@qq.com>	2026-01-05 09:12:11 +08:00
Qiu	da0b113cf5	[doc]<PCP&DCP> add developer guide for PCP&DCP (#5372) ### What this PR does / why we need it? Add a developer guide for PCP&DCP. - vLLM version: release/v0.13.0 - vLLM main: `bc0a5a0c08` Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>	2025-12-26 16:17:38 +08:00
wangxiaoteng888	b1a00e0512	[docs] [P/D] add feature guide for disaggregated-prefill (#3950) ### What this PR does / why we need it? Add a feature guide for disaggregated-prefill. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? By CI. - vLLM version: v0.11.0 - vLLM main: `83f478bb19` --------- Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com> Signed-off-by: liziyu <liziyu16@huawei.com> Signed-off-by: wangxiaoteng888 <56506195+wangxiaoteng888@users.noreply.github.com> Co-authored-by: liziyu <liziyu16@huawei.com>	2025-11-10 09:31:30 +08:00
offline893	5cff3069f4	[Doc]Add developer guide of eplb. (#3759) ### What this PR does / why we need it? Add a developer guide for eplb. - vLLM version: v0.11.0 - vLLM main: `83f478bb19` --------- Signed-off-by: offline0806 <3337230449@qq.com> Co-authored-by: offline0806 <3337230449@qq.com>	2025-11-05 18:35:41 +08:00
Li Wang	7f73c28a24	[CI][Doc] Optimize multi-node CI (#3565) ### What this PR does / why we need it? This pull request mainly does the following: 1. Add a doc for multi-node CI, covering how the mechanism works and how to contribute 2. Simplify the config YAML to make it more developer-friendly 3. Optimize the mooncake installation script to prevent accidental failures during installation 4. Fix the workflow so that the Kubernetes configuration can be applied correctly 5. Add Qwen3-235B-W8A8 disaggregated_prefill test 6. Add GLM-4.5 multi dp test 7. Add 2p1d 4nodes disaggregated_prefill test 8. Refactor nightly tests ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.11.0rc3 - vLLM main: `17c540a993` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-10-25 09:23:47 +08:00
Li Wang	bf84f2dbfa	[Doc] Support kimi-k2-w8a8 (#2162) ### What this PR does / why we need it? The kimi-k2 model is similar to the deepseek model, so only a few changes are needed to support it. What this PR does: 1. Add kimi-k2-w8a8 deployment doc 2. Update quantization doc 3. Upgrade torchair support list ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.10.0 - vLLM main: `9edd1db02b` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2025-08-06 19:28:47 +08:00
Li Wang	0c4aa2b4f1	[Doc] Add multi node data parallel doc (#1685) ### What this PR does / why we need it? Add a multi-node data parallel doc. ### Does this PR introduce _any_ user-facing change? Adds a multi-node data parallel doc. ### How was this patch tested? - vLLM version: v0.9.1 - vLLM main: `805d62ca88` Signed-off-by: wangli <wangli858794774@gmail.com>	2025-07-10 09:36:37 +08:00

9 Commits