【doc fix】doc fix: deepseekv3.1 (#4645)
### What this PR does / why we need it?
Fix the DeepSeek-V3.1 doc to recommend that developers use Mooncake instead of LLMDatadist.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Signed-off-by: AiChiMomo <1092626063@qq.com>
@@ -254,7 +254,7 @@ vllm serve /weights/DeepSeek-V3.1_w8a8mix_mtp \
### Prefill-Decode Disaggregation
-There are two ways to deploy `Prefill-Decode Disaggregation`: [Llmdatadist](./multi_node_pd_disaggregation_llmdatadist.md) and [Mooncake](./multi_node_pd_disaggregation_mooncake.md). We recommend use Mooncake for deploy.
+We recommend using Mooncake for deployment: [Mooncake](./multi_node_pd_disaggregation_mooncake.md).
Take Atlas 800 A3 (64G × 16) for example, we recommend to deploy 2P1D (4 nodes) rather than 1P1D (2 nodes), because there is no enough NPU memory to serve high concurrency in 1P1D case.
- `DeepSeek-V3.1_w8a8mix_mtp 2P1D Layerwise` require 4 Atlas 800 A3 (64G × 16).