Commit Graph

10 Commits

Author SHA1 Message Date
Shanshan Shen
e3eefdecbd [Doc] Update max_tokens to max_completion_tokens in all docs (#6248)
### What this PR does / why we need it?

Fix:

```
DeprecationWarning: max_tokens is deprecated in favor of the max_completion_tokens field.
```

- vLLM version: v0.14.1
- vLLM main:
d68209402d

Signed-off-by: shen-shanshan <467638484@qq.com>
2026-01-26 11:57:40 +08:00
Nengjun Ma
ab676413e6 Default enable MLAPO (#5952)
### What this PR does / why we need it?
1) Default enable MLAPO for deepseek MLA Attention W8A8 models on PD
disagregation D Instance, for example: DeepSeekV3-W8A8,
DeepSeek-R1-W8A8.
2) Default enable MLAPO for DeepSeek SFA Attention W8A8 models,
currently is DeepSeek-V3.2-W8A8.

### Does this PR introduce _any_ user-facing change?
Don't need use manully to VLLM_ASCEND_ENABLE_MLAPO=1, to enable MLAPO
feature for deepseek w8a8 model

The effect of enabling MLAPO SFA model deployed on a single A3 Node:
Test
with:tests/e2e/nightly/single_node/models/test_deepseek_v3_2_exp_w8a8.py
dataset: gsm8k-lite,without set MTP, FULL GRAPH, has 19% promote:
未默认开启 MLAPO 时:
├─────────────────────────┤
│                TTFT                      │ 14055.8836 ms   │
├─────────────────────────┤
│                ITL                         │ 66.8171 ms.          │
├─────────────────────────┤
│ Output Token Throughput  │ 104.9105 token/s │
├─────────────────────────┤
默认开启 MLAPO 时:
├─────────────────────────┤
│                TTFT                      │ 3753.1547 ms   │
├─────────────────────────┤
│                ITL.                        │ 61.4236  ms.       │
├─────────────────────────┤
│ Output Token Throughput  │ 125.2075 token/s│
├─────────────────────────┤

- vLLM version: v0.13.0
- vLLM main:
2c24bc6996

---------

Signed-off-by: leo-pony <nengjunma@outlook.com>
2026-01-22 09:26:39 +08:00
SILONG ZENG
4811ba62e0 [Lint]Style: reformat markdown files via markdownlint (#5884)
### What this PR does / why we need it?
reformat markdown files via markdownlint

- vLLM version: v0.13.0
- vLLM main:
bde38c11df

---------

Signed-off-by: root <root@LAPTOP-VQKDDVMG.localdomain>
Signed-off-by: MrZ20 <2609716663@qq.com>
Co-authored-by: root <root@LAPTOP-VQKDDVMG.localdomain>
2026-01-15 09:06:01 +08:00
MengLong Chen
b8b5521f5b [Doc] Update DeepSeek V3.1/R1 2P1D doc (#5387)
### What this PR does / why we need it?
The PR updates the documentation for DeepSeek-V3.1 and DeepSeek-R1 in
the scenario of prefill-decode disaggregation.

Updated some PD separation-related setting parameters and optimal
configurations. This script has been verified.

- vLLM version: release/v0.13.0
- vLLM main:
bc0a5a0c08

Signed-off-by: chenmenglong <chenmenglong1@huawei.com>
2025-12-27 17:28:43 +08:00
Zhu Yi Lin
04104031d0 [Doc] Modify DeepSeek-R1/V3.1 documentation (#5426)
### What this PR does / why we need it?
Modify DeepSeek-R1/V3.1 documentation. Mainly update the mtp size and some other configs.

Signed-off-by: GDzhu01 <809721801@qq.com>
2025-12-27 17:13:58 +08:00
Zhu Yi Lin
be2a947521 [Doc] delete environment variable HCCL_OP_EXPANSION_MODE in DeepSeekV3.1/R1 (#5419)
### What this PR does / why we need it?
Currently, HCCL_OP_EXPANSION_MODE="AIV" is causing some freezing issues
on A2.so we have temporarily removed it from the documentation.

Signed-off-by: GDzhu01 <809721801@qq.com>
2025-12-27 12:44:50 +08:00
Zhu Yi Lin
06732dbf5b [Doc] update R1/V3.1 doc (#5383)
### What this PR does / why we need it?
This PR updates DeepSeek-R1/V3.1 doc to give a simple recipe for
repreducing our latest perfomance on Atlas A3/A2 servers.
### Does this PR introduce any user-facing change?
No.

Signed-off-by: GDzhu01 <809721801@qq.com>
2025-12-26 17:09:22 +08:00
1092626063
f952de93df 【Doc】Deepseekv3.1/R1 doc enhancement (#4827)
### What this PR does / why we need it?

Deepseekv3.1、DeepSeekR1 doc enhancement

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: 1092626063 <1092626063@qq.com>
2025-12-19 10:52:33 +08:00
wangxiyuan
e538fa6f9c [Doc] Update tutorial index (#4920)
Update tutorial index and remove useless doc

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-11 20:53:13 +08:00
Leaf
89a8607b30 add DeepSeek-R1 tutorial. (#4666)
### What this PR does / why we need it?

This PR adds tutorials for the DeepSeeK-R1 series models, including the
A2 and A3 series, and provides accuracy validation results.

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: Gongdayao <gongdayao@foxmail.com>
2025-12-11 08:52:27 +08:00