xc-llm-ascend

Author	SHA1	Message	Date
Shanshan Shen	e3eefdecbd	[Doc] Update `max_tokens` to `max_completion_tokens` in all docs (#6248 ) ### What this PR does / why we need it? Fix: ``` DeprecationWarning: max_tokens is deprecated in favor of the max_completion_tokens field. ``` - vLLM version: v0.14.1 - vLLM main: `d68209402d` Signed-off-by: shen-shanshan <467638484@qq.com>	2026-01-26 11:57:40 +08:00
Nengjun Ma	ab676413e6	Default enable MLAPO (#5952 ) ### What this PR does / why we need it? 1) Enable MLAPO by default for DeepSeek MLA Attention W8A8 models on the PD disaggregation D instance, for example: DeepSeekV3-W8A8, DeepSeek-R1-W8A8. 2) Enable MLAPO by default for DeepSeek SFA Attention W8A8 models, currently DeepSeek-V3.2-W8A8. ### Does this PR introduce _any_ user-facing change? Users no longer need to manually set VLLM_ASCEND_ENABLE_MLAPO=1 to enable the MLAPO feature for DeepSeek W8A8 models. Effect of enabling MLAPO on an SFA model deployed on a single A3 node, tested with tests/e2e/nightly/single_node/models/test_deepseek_v3_2_exp_w8a8.py, dataset: gsm8k-lite, without setting MTP, FULL GRAPH: a 19% improvement. When MLAPO is not enabled by default: TTFT 14055.8836 ms, ITL 66.8171 ms, Output Token Throughput 104.9105 token/s. When MLAPO is enabled by default: TTFT 3753.1547 ms, ITL 61.4236 ms, Output Token Throughput 125.2075 token/s. - vLLM version: v0.13.0 - vLLM main: `2c24bc6996` --------- Signed-off-by: leo-pony <nengjunma@outlook.com>	2026-01-22 09:26:39 +08:00
SILONG ZENG	4811ba62e0	[Lint]Style: reformat markdown files via markdownlint (#5884 ) ### What this PR does / why we need it? reformat markdown files via markdownlint - vLLM version: v0.13.0 - vLLM main: `bde38c11df` --------- Signed-off-by: root <root@LAPTOP-VQKDDVMG.localdomain> Signed-off-by: MrZ20 <2609716663@qq.com> Co-authored-by: root <root@LAPTOP-VQKDDVMG.localdomain>	2026-01-15 09:06:01 +08:00
MengLong Chen	b8b5521f5b	[Doc] Update DeepSeek V3.1/R1 2P1D doc (#5387 ) ### What this PR does / why we need it? The PR updates the documentation for DeepSeek-V3.1 and DeepSeek-R1 in the scenario of prefill-decode disaggregation. Updated some PD separation-related setting parameters and optimal configurations. This script has been verified. - vLLM version: release/v0.13.0 - vLLM main: `bc0a5a0c08` Signed-off-by: chenmenglong <chenmenglong1@huawei.com>	2025-12-27 17:28:43 +08:00
Zhu Yi Lin	04104031d0	[Doc] Modify DeepSeek-R1/V3.1 documentation (#5426 ) ### What this PR does / why we need it? Modify DeepSeek-R1/V3.1 documentation. Mainly updates the MTP size and some other configs. Signed-off-by: GDzhu01 <809721801@qq.com>	2025-12-27 17:13:58 +08:00
Zhu Yi Lin	be2a947521	[Doc] delete environment variable HCCL_OP_EXPANSION_MODE in DeepSeekV3.1/R1 (#5419 ) ### What this PR does / why we need it? Currently, HCCL_OP_EXPANSION_MODE="AIV" is causing some freezing issues on A2, so we have temporarily removed it from the documentation. Signed-off-by: GDzhu01 <809721801@qq.com>	2025-12-27 12:44:50 +08:00
Zhu Yi Lin	06732dbf5b	[Doc] update R1/V3.1 doc (#5383 ) ### What this PR does / why we need it? This PR updates the DeepSeek-R1/V3.1 doc to give a simple recipe for reproducing our latest performance on Atlas A3/A2 servers. ### Does this PR introduce any user-facing change? No. Signed-off-by: GDzhu01 <809721801@qq.com>	2025-12-26 17:09:22 +08:00
1092626063	f952de93df	[Doc] Deepseekv3.1/R1 doc enhancement (#4827 ) ### What this PR does / why we need it? Deepseekv3.1 and DeepSeekR1 doc enhancement - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: 1092626063 <1092626063@qq.com>	2025-12-19 10:52:33 +08:00
wangxiyuan	e538fa6f9c	[Doc] Update tutorial index (#4920 ) Update tutorial index and remove useless doc - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-12-11 20:53:13 +08:00
Leaf	89a8607b30	add DeepSeek-R1 tutorial. (#4666 ) ### What this PR does / why we need it? This PR adds tutorials for the DeepSeek-R1 series models, including the A2 and A3 series, and provides accuracy validation results. - vLLM version: v0.12.0 - vLLM main: `ad32e3e19c` --------- Signed-off-by: Gongdayao <gongdayao@foxmail.com>	2025-12-11 08:52:27 +08:00

10 Commits