From 3740b3edfc8aa0b23f38a9eddcf6c55f9718e4df Mon Sep 17 00:00:00 2001
From: mazhixin000 <76465098+mazhixin000@users.noreply.github.com>
Date: Fri, 5 Dec 2025 18:35:18 +0800
Subject: [PATCH] =?UTF-8?q?=E3=80=90main=E3=80=91[Doc]add=202P1D=20instruc?=
 =?UTF-8?q?tion=20for=20single=20node=20(#4716)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

### What this PR does / why we need it?
Add the description for 2P1D, keeping it consistent with the content in the dev branch.

### Does this PR introduce _any_ user-facing change?
no

- vLLM version: v0.12.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.12.0

Signed-off-by: mazhixin000
Co-authored-by: wangxiyuan
---
 .../single_node_pd_disaggregation_llmdatadist.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/source/tutorials/single_node_pd_disaggregation_llmdatadist.md b/docs/source/tutorials/single_node_pd_disaggregation_llmdatadist.md
index c2cf93a9..db1834e2 100644
--- a/docs/source/tutorials/single_node_pd_disaggregation_llmdatadist.md
+++ b/docs/source/tutorials/single_node_pd_disaggregation_llmdatadist.md
@@ -45,7 +45,7 @@ bash gen_ranktable.sh --ips 192.0.0.1 \
   --npus-per-node 2 --network-card-name eth0 --prefill-device-cnt 1 --decode-device-cnt 1
 ```
 
-The rank table will be generated at /vllm-workspace/vllm-ascend/examples/disaggregate_prefill_v1/ranktable.json
+If you want to run "2P1D", please set npus-per-node to 3 and prefill-device-cnt to 2. The rank table will be generated at /vllm-workspace/vllm-ascend/examples/disaggregate_prefill_v1/ranktable.json
 
 |Parameter | Meaning |
 | --- | --- |
@@ -137,6 +137,8 @@ vllm serve /model/Qwen2.5-VL-7B-Instruct \
 
 :::::
 
+If you want to run "2P1D", please set ASCEND_RT_VISIBLE_DEVICES, VLLM_ASCEND_LLMDD_RPC_PORT and port to different values for each P process.
+
 ## Example Proxy for Deployment
 
 Run a proxy server on the same node with the prefiller service instance. You can get the proxy program in the repository's examples: [load\_balance\_proxy\_server\_example.py](https://github.com/vllm-project/vllm-ascend/blob/main/examples/disaggregated_prefill_v1/load_balance_proxy_server_example.py)
@@ -151,6 +153,12 @@ python load_balance_proxy_server_example.py \
     --decoder-ports 13701
 ```
 
+|Parameter | Meaning |
+| --- | --- |
+| --port | Port of proxy |
+| --prefiller-port | All ports of prefill |
+| --decoder-ports | All ports of decoder |
+
 ## Verification
 
 Check service health using the proxy server endpoint.