From c08364f761d756f7d37ba42b19b01d199e85001e Mon Sep 17 00:00:00 2001
From: meihanc
Date: Mon, 2 Feb 2026 17:31:21 +0800
Subject: [PATCH] [Bugfix] Fix intermittent kv_port conflict with AscendDirectTransport (#6455)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

### What this PR does / why we need it?

When using Mooncake on Ascend NPU, AscendDirectTransport randomly allocates ports within the range `[20000, 20000 + npu_per_node × 1000)`. Reference: [ascend_direct_transport.cpp#L554](https://github.com/kvcache-ai/Mooncake/blob/v0.3.7.post2/mooncake-transfer-engine/src/transport/ascend_transport/ascend_direct_transport/ascend_direct_transport.cpp#L475)

If `kv_port` overlaps with this range, users may encounter intermittent startup failures:

```bash
zmq.error.ZMQError: Address already in use (addr='tcp://x.x.x.x:30012')
RuntimeError: KV Cache sending/receiving thread failed to start.
```

This PR fixes the intermittent `kv_port` conflict with AscendDirectTransport in `Qwen3-235B-W8A8-EPLB.yaml` and adds a `kv_port Configuration Guide` section to `pd_disaggregation_mooncake_multi_node.md`.

Test results (`tests/e2e/nightly/multi_node/config/Qwen3-235B-W8A8-EPLB.yaml`): https://github.com/vllm-project/vllm-ascend/actions/runs/21540138907/job/62073265259

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.14.1
- vLLM main: https://github.com/vllm-project/vllm/commit/dc917cceb877dfd13f98c538c4c96158047d98bd

Signed-off-by: Meihan-chen
---
 .../pd_disaggregation_mooncake_multi_node.md  | 29 ++++++++++++++-----
 .../config/Qwen3-235B-W8A8-EPLB.yaml          |  4 +--
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/docs/source/tutorials/pd_disaggregation_mooncake_multi_node.md b/docs/source/tutorials/pd_disaggregation_mooncake_multi_node.md
index 797df320..94d4de47 100644
--- a/docs/source/tutorials/pd_disaggregation_mooncake_multi_node.md
+++ b/docs/source/tutorials/pd_disaggregation_mooncake_multi_node.md
@@ -224,6 +224,19 @@ export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/mooncake:$LD_LI
 
 We can run the following scripts to launch a server on the prefiller/decoder node, respectively. Please note that each P/D node will occupy ports ranging from kv_port to kv_port + num_chips to initialize socket listeners. To avoid any issues, port conflicts should be prevented. Additionally, ensure that each node's engine_id is uniquely assigned to avoid conflicts.
 
+### kv_port Configuration Guide
+
+On Ascend NPU, Mooncake uses AscendDirectTransport for RDMA data transfer, which randomly allocates ports within range `[20000, 20000 + npu_per_node × 1000)`. If `kv_port` overlaps with this range, intermittent port conflicts may occur. To avoid this, configure `kv_port` according to the table below:
+
+| NPUs per Node | Reserved Port Range | Recommended kv_port |
+|---------------|---------------------|---------------------|
+| 8             | 20000 - 27999       | >= 28000            |
+| 16            | 20000 - 35999       | >= 36000            |
+
+```{warning}
+If you occasionally see `zmq.error.ZMQError: Address already in use` during startup, it may be caused by kv_port conflicting with randomly allocated AscendDirectTransport ports. Increase your kv_port value to avoid the reserved range.
+```
+
 ### launch_online_dp.py
 
 Use `launch_online_dp.py` to launch external dp vllm servers.
@@ -281,7 +294,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeLayerwiseConnector",
   "kv_role": "kv_producer",
-  "kv_port": "30000",
+  "kv_port": "36000",
   "engine_id": "0",
   "kv_connector_extra_config": {
     "prefill": {
@@ -340,7 +353,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeLayerwiseConnector",
   "kv_role": "kv_producer",
-  "kv_port": "30100",
+  "kv_port": "36100",
   "engine_id": "1",
   "kv_connector_extra_config": {
     "prefill": {
@@ -399,7 +412,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeLayerwiseConnector",
   "kv_role": "kv_consumer",
-  "kv_port": "30200",
+  "kv_port": "36200",
   "engine_id": "2",
   "kv_connector_extra_config": {
     "prefill": {
@@ -457,7 +470,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeLayerwiseConnector",
   "kv_role": "kv_consumer",
-  "kv_port": "30200",
+  "kv_port": "36200",
   "engine_id": "2",
   "kv_connector_extra_config": {
 
@@ -524,7 +537,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeConnectorV1",
   "kv_role": "kv_producer",
-  "kv_port": "30000",
+  "kv_port": "36000",
   "engine_id": "0",
   "kv_connector_extra_config": {
     "prefill": {
@@ -583,7 +596,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeConnectorV1",
   "kv_role": "kv_producer",
-  "kv_port": "30100",
+  "kv_port": "36100",
   "engine_id": "1",
   "kv_connector_extra_config": {
     "prefill": {
@@ -642,7 +655,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeConnectorV1",
   "kv_role": "kv_consumer",
-  "kv_port": "30200",
+  "kv_port": "36200",
   "engine_id": "2",
   "kv_connector_extra_config": {
     "prefill": {
@@ -700,7 +713,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
   --kv-transfer-config \
   '{"kv_connector": "MooncakeConnectorV1",
   "kv_role": "kv_consumer",
-  "kv_port": "30200",
+  "kv_port": "36200",
   "engine_id": "2",
   "kv_connector_extra_config": {
     "prefill": {
diff --git a/tests/e2e/nightly/multi_node/config/Qwen3-235B-W8A8-EPLB.yaml b/tests/e2e/nightly/multi_node/config/Qwen3-235B-W8A8-EPLB.yaml
index b21d8b2c..adf90bb9 100644
--- a/tests/e2e/nightly/multi_node/config/Qwen3-235B-W8A8-EPLB.yaml
+++ b/tests/e2e/nightly/multi_node/config/Qwen3-235B-W8A8-EPLB.yaml
@@ -37,7 +37,7 @@ deployment:
   --kv-transfer-config
   '{"kv_connector": "MooncakeConnectorV1",
   "kv_role": "kv_producer",
-  "kv_port": "30000",
+  "kv_port": "36000",
   "engine_id": "0",
   "kv_connector_extra_config": {
     "prefill": {
@@ -73,7 +73,7 @@ deployment:
   --kv-transfer-config
   '{"kv_connector": "MooncakeConnectorV1",
   "kv_role": "kv_consumer",
-  "kv_port": "30200",
+  "kv_port": "36100",
   "engine_id": "1",
   "kv_connector_extra_config": {
     "prefill": {
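
Note on the port arithmetic: the reserved range quoted in the `kv_port Configuration Guide` and the "kv_port to kv_port + num_chips" listener span mentioned in the tutorial can be checked with a few lines of Python. The following is only a minimal sketch and is not part of this patch, vLLM Ascend, or Mooncake. The names `reserved_range` and `kv_port_conflicts` are hypothetical, and the constants simply restate the numbers given in the guide.

```python
# Illustrative sketch only; names and constants are assumptions taken from
# the numbers quoted in the kv_port Configuration Guide, not from any API.
ASCEND_DIRECT_BASE_PORT = 20000  # lower bound of the reserved range
PORTS_PER_NPU = 1000             # range width contributed by each NPU


def reserved_range(npu_per_node: int) -> range:
    """Half-open range [20000, 20000 + npu_per_node * 1000)."""
    start = ASCEND_DIRECT_BASE_PORT
    return range(start, start + npu_per_node * PORTS_PER_NPU)


def kv_port_conflicts(kv_port: int, num_chips: int, npu_per_node: int) -> bool:
    """True if kv_port .. kv_port + num_chips touches the reserved range."""
    reserved = reserved_range(npu_per_node)
    first, last = kv_port, kv_port + num_chips  # listener span, inclusive
    return first < reserved.stop and last >= reserved.start


if __name__ == "__main__":
    # Old value from the tutorial on a 16-NPU node: inside the reserved range.
    print(kv_port_conflicts(30000, num_chips=16, npu_per_node=16))
    # New value from this patch: just past the end of the reserved range.
    print(kv_port_conflicts(36000, num_chips=16, npu_per_node=16))
```

With these assumptions, the old `kv_port` of 30000 is flagged on a 16-NPU node, while 36000 and above are not, which matches the recommendation table in the guide.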