[Doc][Misc] Improve readability and fix typos in documentation (#8340)
### What this PR does / why we need it?

This PR improves the readability of the documentation by fixing typos, correcting command extensions, and fixing broken links in the Chinese README.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Documentation changes only.

---------

Signed-off-by: sunshine202600 <sunshine202600@163.com>
```diff
@@ -71,10 +71,10 @@ vllm serve "/xxxxx/DeepSeek-V2-Lite-Chat" \
 `HCCL_EXEC_TIMEOUT`, `HCCL_CONNECT_TIMEOUT`, and `HCCL_IF_IP` are hccl-related configurations.<br>
 Set `GLOO_SOCKET_IFNAME`, `TP_SOCKET_IFNAME`, and `HCCL_SOCKET_IFNAME` to the corresponding NIC.<br>
 `ASCEND_RT_VISIBLE_DEVICES` specifies the cards on which the node run resides. The total number of cards equals `dp_size*tp_size`.<br>
-`/xxxxx/DeepSeek-V2-Lite-Chat` is configured as a model that requires run.<br>
+`/xxxxx/DeepSeek-V2-Lite-Chat` is configured as a model that requires running.<br>
 `--host`: indicates the IP address of the node to be started.<br>
-`--port`: indicates the port to be started, which corresponds to the port in step 4.<br>
-`--seed`, --max-model-len, and --max-num-batched-tokens model basic configuration. Set this parameter based on the site requirements.<br>
+`--port`: indicates the port on which the prefill node will listen (e.g., 8100). This port is later referenced in step 3 when configuring the proxy server.<br>
+`--seed`: `--max-model-len`, and `--max-num-batched-tokens` are part of the model's basic configuration. Set this parameter based on the site requirements.<br>
 `--tensor-parallel-size`: specifies the TP size.<br>
 `--data-parallel-size`: indicates the DP size.<br>
 `--data-parallel-address`: indicates the IP address of the DP. Set this parameter to the IP address of the node.--data-parallel-rpc-port: indicates the RPC port for communication in the DP group.<br>
```
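For context, the environment variables and `vllm serve` flags documented in this hunk could be assembled into a node launch script roughly like the sketch below. All IPs, NIC names, timeout values, device lists, and parallel sizes here are placeholder assumptions for illustration, not values from the commit:

```shell
#!/bin/bash
# Sketch of launching one node, assuming the variables documented above.
NODE_IP=192.0.2.10   # placeholder: this node's IP address
NIC=eth0             # placeholder: the NIC bound to NODE_IP

# hccl-related configuration
export HCCL_IF_IP=${NODE_IP}
export HCCL_CONNECT_TIMEOUT=120   # placeholder timeout, seconds
export HCCL_EXEC_TIMEOUT=120      # placeholder timeout, seconds

# Bind gloo/TP/HCCL sockets to the corresponding NIC
export GLOO_SOCKET_IFNAME=${NIC}
export TP_SOCKET_IFNAME=${NIC}
export HCCL_SOCKET_IFNAME=${NIC}

# Cards used by this node; the total count must equal dp_size * tp_size
# (here 2 * 2 = 4 cards, as a placeholder example).
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3

vllm serve "/xxxxx/DeepSeek-V2-Lite-Chat" \
    --host ${NODE_IP} \
    --port 8100 \
    --seed 42 \
    --max-model-len 4096 \
    --max-num-batched-tokens 4096 \
    --tensor-parallel-size 2 \
    --data-parallel-size 2 \
    --data-parallel-address ${NODE_IP} \
    --data-parallel-rpc-port 13389
```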
```diff
@@ -144,7 +144,7 @@ python load_balance_proxy_server_example.py --host localhost --prefiller-hosts h
 `--host`: indicates the active node. The value of localhost in the curl command delivered in step 5 must be the same as the host. The default port number for starting the service proxy is 8000.<br>
 `--prefiller-hosts`: Set this parameter to the IP addresses of all p nodes. In the xpyd scenario, add the IP addresses to the end of this configuration item and leave a blank space between the IP addresses.<br>
-`--prefiller-ports`: Set this parameter to the port number of all p nodes, which is the configuration of the port number for the vllm to start the service in step 3. Write the port number after the configuration in sequence and leave a blank space between the port number and the port number. The sequence must be one-to-one mapping to the IP address of --prefiller-hosts.<br>
+`--prefiller-ports`: Set this parameter to the port numbers of all prefill (P) nodes, which were defined in step 1 when starting the prefill nodes. Write the port number after the configuration in sequence and leave a blank space between the port number and the port number. The sequence must be one-to-one mapping to the IP address of --prefiller-hosts.<br>
 `--decoder-hosts`: Set this parameter to the IP addresses of all d nodes. In the xpyd scenario, add the IP addresses to the end of this configuration item and leave a blank space between the IP addresses.<br>
 `--decoder-ports`: Set this parameter to the port number of all d nodes, which is the configuration of the port number for the vllm to start the service in step 4. Set port to the end of the configuration, and leave a blank space between port and port. The sequence must be one-to-one mapping to the IP address of --decoder-hosts.<br>
```
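The proxy options documented in this hunk could look roughly like the sketch below for a 2P1D (two prefill, one decode) deployment. All IP addresses and ports are placeholder assumptions, and the request body in the curl example assumes the server exposes the standard vLLM OpenAI-compatible `/v1/completions` endpoint:

```shell
#!/bin/bash
# Sketch of starting the load-balance proxy; space-separated lists of
# hosts/ports must map one-to-one between --prefiller-hosts and
# --prefiller-ports (and likewise for the decoder options).
python load_balance_proxy_server_example.py \
    --host localhost \
    --port 8000 \
    --prefiller-hosts 192.0.2.11 192.0.2.12 \
    --prefiller-ports 8100 8100 \
    --decoder-hosts 192.0.2.21 \
    --decoder-ports 8200

# Step 5's request then targets the proxy's default port 8000:
curl -s http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "/xxxxx/DeepSeek-V2-Lite-Chat", "prompt": "Hello", "max_tokens": 16}'
```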