【doc fix】doc fix: deepseekv3.1 (#4645)

### What this PR does / why we need it?
Fix the DeepSeek-V3.1 doc to recommend that developers use Mooncake instead of LLMDatadist.

### Does this PR introduce _any_ user-facing change?
<!--
Note that it means *any* user-facing change including all aspects such
as API, interface or other behavior changes.
Documentation-only updates are not considered user-facing changes.
-->

### How was this patch tested?
<!--
CI passed with new added/existing test.
If it was tested in a way different from regular unit tests, please
clarify how you tested step by step, ideally copy and paste-able, so
that other reviewers can test and check, and descendants can verify in
the future.
If tests were not added, please describe why they were not added and/or
why it was difficult to add.
-->

Signed-off-by: AiChiMomo <1092626063@qq.com>
1092626063
2025-12-02 21:49:13 +08:00
committed by GitHub
parent 1b5513aa91
commit b84c9afbf5

@@ -254,7 +254,7 @@ vllm serve /weights/DeepSeek-V3.1_w8a8mix_mtp \
### Prefill-Decode Disaggregation
-There are two ways to deploy `Prefill-Decode Disaggregation`: [Llmdatadist](./multi_node_pd_disaggregation_llmdatadist.md) and [Mooncake](./multi_node_pd_disaggregation_mooncake.md). We recommend use Mooncake for deploy.
+We recommend using Mooncake for deployment: [Mooncake](./multi_node_pd_disaggregation_mooncake.md).
Take Atlas 800 A3 (64G × 16) for example, we recommend to deploy 2P1D (4 nodes) rather than 1P1D (2 nodes), because there is no enough NPU memory to serve high concurrency in 1P1D case.
- `DeepSeek-V3.1_w8a8mix_mtp 2P1D Layerwise` require 4 Atlas 800 A3 (64G × 16).