[Doc][Misc] Improve readability and fix typos in documentation (#8340)
### What this PR does / why we need it?

This PR improves the readability of the documentation by fixing typos, correcting command extensions, and fixing broken links in the Chinese README.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Documentation changes only.

---------

Signed-off-by: sunshine202600 <sunshine202600@163.com>
```diff
@@ -71,10 +71,10 @@ vllm serve "/xxxxx/DeepSeek-V2-Lite-Chat" \
 `HCCL_EXEC_TIMEOUT`, `HCCL_CONNECT_TIMEOUT`, and `HCCL_IF_IP` are hccl-related configurations.<br>
 Set `GLOO_SOCKET_IFNAME`, `TP_SOCKET_IFNAME`, and `HCCL_SOCKET_IFNAME` to the corresponding NIC.<br>
 `ASCEND_RT_VISIBLE_DEVICES` specifies the cards on which the node run resides. The total number of cards equals `dp_size*tp_size`.<br>
-`/xxxxx/DeepSeek-V2-Lite-Chat` is configured as a model that requires run.<br>
+`/xxxxx/DeepSeek-V2-Lite-Chat` is configured as a model that requires running.<br>
 `--host`: indicates the IP address of the node to be started.<br>
-`--port`: indicates the port to be started, which corresponds to the port in step 4.<br>
-`--seed`, --max-model-len, and --max-num-batched-tokens model basic configuration. Set this parameter based on the site requirements.<br>
+`--port`: indicates the port on which the prefill node will listen (e.g., 8100). This port is later referenced in step 3 when configuring the proxy server.<br>
+`--seed`: `--max-model-len`, and `--max-num-batched-tokens` are part of the model's basic configuration. Set this parameter based on the site requirements.<br>
 `--tensor-parallel-size`: specifies the TP size.<br>
 `--data-parallel-size`: indicates the DP size.<br>
 `--data-parallel-address`: indicates the IP address of the DP. Set this parameter to the IP address of the node.--data-parallel-rpc-port: indicates the RPC port for communication in the DP group.<br>
```
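For context, the environment variables and `vllm serve` flags documented in this hunk could be assembled into a node launch script roughly like the sketch below. All IPs, NIC names, timeout values, device lists, and parallel sizes here are placeholder assumptions for illustration, not values from the commit:

```shell
#!/bin/bash
# Sketch of launching one node, assuming the variables documented above.
NODE_IP=192.0.2.10   # placeholder: this node's IP address
NIC=eth0             # placeholder: the NIC bound to NODE_IP

# hccl-related configuration
export HCCL_IF_IP=${NODE_IP}
export HCCL_CONNECT_TIMEOUT=120   # placeholder timeout, seconds
export HCCL_EXEC_TIMEOUT=120      # placeholder timeout, seconds

# Bind gloo/TP/HCCL sockets to the corresponding NIC
export GLOO_SOCKET_IFNAME=${NIC}
export TP_SOCKET_IFNAME=${NIC}
export HCCL_SOCKET_IFNAME=${NIC}

# Cards used by this node; the total count must equal dp_size * tp_size
# (here 2 * 2 = 4 cards, as a placeholder example).
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3

vllm serve "/xxxxx/DeepSeek-V2-Lite-Chat" \
    --host ${NODE_IP} \
    --port 8100 \
    --seed 42 \
    --max-model-len 4096 \
    --max-num-batched-tokens 4096 \
    --tensor-parallel-size 2 \
    --data-parallel-size 2 \
    --data-parallel-address ${NODE_IP} \
    --data-parallel-rpc-port 13389
```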
```diff
@@ -144,7 +144,7 @@ python load_balance_proxy_server_example.py --host localhost --prefiller-hosts h
 `--host`: indicates the active node. The value of localhost in the curl command delivered in step 5 must be the same as the host. The default port number for starting the service proxy is 8000.<br>
 `--prefiller-hosts`: Set this parameter to the IP addresses of all p nodes. In the xpyd scenario, add the IP addresses to the end of this configuration item and leave a blank space between the IP addresses.<br>
-`--prefiller-ports`: Set this parameter to the port number of all p nodes, which is the configuration of the port number for the vllm to start the service in step 3. Write the port number after the configuration in sequence and leave a blank space between the port number and the port number. The sequence must be one-to-one mapping to the IP address of --prefiller-hosts.<br>
+`--prefiller-ports`: Set this parameter to the port numbers of all prefill (P) nodes, which were defined in step 1 when starting the prefill nodes. Write the port number after the configuration in sequence and leave a blank space between the port number and the port number. The sequence must be one-to-one mapping to the IP address of --prefiller-hosts.<br>
 `--decoder-hosts`: Set this parameter to the IP addresses of all d nodes. In the xpyd scenario, add the IP addresses to the end of this configuration item and leave a blank space between the IP addresses.<br>
 `--decoder-ports`: Set this parameter to the port number of all d nodes, which is the configuration of the port number for the vllm to start the service in step 4. Set port to the end of the configuration, and leave a blank space between port and port. The sequence must be one-to-one mapping to the IP address of --decoder-hosts.<br>
```
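The proxy options documented in this hunk could look roughly like the sketch below for a 2P1D (two prefill, one decode) deployment. All IP addresses and ports are placeholder assumptions, and the request body in the curl example assumes the server exposes the standard vLLM OpenAI-compatible `/v1/completions` endpoint:

```shell
#!/bin/bash
# Sketch of starting the load-balance proxy; space-separated lists of
# hosts/ports must map one-to-one between --prefiller-hosts and
# --prefiller-ports (and likewise for the decoder options).
python load_balance_proxy_server_example.py \
    --host localhost \
    --port 8000 \
    --prefiller-hosts 192.0.2.11 192.0.2.12 \
    --prefiller-ports 8100 8100 \
    --decoder-hosts 192.0.2.21 \
    --decoder-ports 8200

# Step 5's request then targets the proxy's default port 8000:
curl -s http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "/xxxxx/DeepSeek-V2-Lite-Chat", "prompt": "Hello", "max_tokens": 16}'
```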