[DOC] Qwen3 PD disaggregation user guide (#2751)
### What this PR does / why we need it?
This PR adds the deployment guide for prefiller/decoder (PD)
disaggregation.
The guide covers the following scenario:
- 3 nodes in total, with 2 NPUs on each node
- Model: Qwen3-30B-A3B
- 1P2D (one prefiller, two decoders)
- Expert Parallel
This deployment can be used to verify the PD disaggregation and Expert
Parallel features with relatively few resources.
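
As a rough illustration of the scenario above, the sketch below models the 1P2D layout as plain data. It is not taken from the guide: the `Instance` class, the `plan_1p2d` helper, and the mapping of one instance per node are all hypothetical assumptions made only to show how the node and NPU counts add up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One serving instance in the sketch (hypothetical structure)."""
    role: str   # "prefill" or "decode"
    node: int   # index of the node hosting the instance
    npus: int   # number of NPUs the instance uses


def plan_1p2d(nodes: int = 3, npus_per_node: int = 2) -> list[Instance]:
    # Assumption: one instance per node, the first node runs the prefiller
    # and every remaining node runs a decoder.
    roles = ["prefill"] + ["decode"] * (nodes - 1)
    return [Instance(role, node, npus_per_node) for node, role in enumerate(roles)]


layout = plan_1p2d()
total_npus = sum(inst.npus for inst in layout)

for inst in layout:
    print(inst)
print("total NPUs:", total_npus)
```

With the defaults this gives one prefiller and two decoders spread over three nodes, using six NPUs in total.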
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
- vLLM version: v0.10.1.1
- vLLM main: e599e2c65e
---------
Signed-off-by: paulyu12 <507435917@qq.com>
@@ -15,4 +15,5 @@ multi_npu_quantization
single_node_300i
multi_node
multi_node_kimi
multi_node_pd_disaggregation
:::