Change comment location (#4432)
### What this PR does / why we need it?

When running `python example.py`, connection issues often occur. The solution is to comment out the first line of the code.
Complete the specific names of machines A2 and A3.
Standardize the document format: a space should be added after the colon.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

ut

- vLLM version: v0.11.2

---------

Signed-off-by: herizhen <you@example.com>
Co-authored-by: herizhen <you@example.com>
@@ -261,8 +261,14 @@ for output in outputs:
 Then run:

 ```bash
-# Try `export VLLM_USE_MODELSCOPE=true` and `pip install modelscope`
-# to speed up download if huggingface is not reachable.
 python example.py
 ```
+
+If you encounter a connection error with Hugging Face (e.g., `We couldn't connect to 'https://huggingface.co' to load the files, and couldn't find them in the cached files.`), run the following commands to use ModelScope as an alternative:
+
+```bash
+export VLLM_USE_MODELSCOPE=true
+pip install modelscope
+python example.py
+```

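The ModelScope fallback in the hunk above can also be applied from Python instead of the shell. The following is a minimal sketch and not part of the PR: the helper names `modelscope_env` and `run_example` are hypothetical, and it assumes `pip install modelscope` has already been run and that `example.py` is in the current directory.

```python
import os
import subprocess
import sys


def modelscope_env(base_env=None):
    """Return a copy of the environment with the ModelScope switch set.

    Hypothetical helper: it mirrors `export VLLM_USE_MODELSCOPE=true`
    without modifying the environment of the calling process.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["VLLM_USE_MODELSCOPE"] = "true"
    return env


def run_example(script="example.py"):
    """Run the script in a child process with ModelScope enabled.

    Returns the exit code of the child process.
    """
    result = subprocess.run([sys.executable, script], env=modelscope_env())
    return result.returncode


# Usage (assumes example.py exists in the working directory):
#     exit_code = run_example()
```

Passing a copied environment to the child process keeps the setting scoped to that one run, which is the same effect as exporting the variable in a throwaway shell session.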
@@ -5,7 +5,7 @@
 Refer to [multi_node.md](https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_node.html#verification-process).

 ## Run with Docker
-Assume you have two Atlas 800 A3 (64G*16) or four A2 nodes, and want to deploy the `Kimi-K2-Instruct-W8A8` quantitative model across multiple nodes.
+Assume you have two Atlas 800 A3 (64G*16) or four A2 nodes, and want to deploy the `Kimi-K2-Instruct-W8A8` quantized model across multiple nodes.

 ```{code-block} bash
 :substitutions:
@@ -21,10 +21,10 @@
 Also, you need to set an environment variable to point to them (`export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib64/python3.11/site-packages/mooncake`), or copy the .so files to the `/usr/local/lib64` directory after compilation.

 ### KV Pooling Parameter Description
-**kv_connector_extra_config**:Additional Configurable Parameters for Pooling.
-**mooncake_rpc_port**:Port for RPC Communication Between Pooling Scheduler Process and Worker Process: Each Instance Requires a Unique Port Configuration.
-**load_async**:Whether to Enable Asynchronous Loading. The default value is false.
-**register_buffer**:Whether to Register Video Memory with the Backend. Registration is Not Required When Used with MooncakeConnectorV1; It is Required in All Other Cases. The Default Value is false.
+**kv_connector_extra_config**: Additional configurable parameters for pooling.
+**mooncake_rpc_port**: Port for RPC communication between the pooling scheduler process and the worker process. Each instance requires a unique port.
+**load_async**: Whether to enable asynchronous loading. The default value is false.
+**register_buffer**: Whether to register video memory with the backend. Registration is not required when used with MooncakeConnectorV1; it is required in all other cases. The default value is false.

 ## Run Mooncake Master

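The KV pooling parameters described in the last hunk can be pictured as a small mapping per instance. The sketch below is illustrative only: the helper names `pooling_extra_config` and `check_unique_ports` are hypothetical, the port numbers are made-up examples, the integer type of the port is an assumption, and how the resulting mapping is passed to the server is not shown in this diff.

```python
import json


def pooling_extra_config(mooncake_rpc_port, load_async=False, register_buffer=False):
    """Build a kv_connector_extra_config mapping for one instance.

    Defaults follow the parameter description: load_async and
    register_buffer are both false unless set explicitly.
    """
    return {
        "mooncake_rpc_port": mooncake_rpc_port,
        "load_async": load_async,
        "register_buffer": register_buffer,
    }


def check_unique_ports(configs):
    """Raise ValueError if two instances share the same RPC port."""
    ports = [cfg["mooncake_rpc_port"] for cfg in configs]
    if len(ports) != len(set(ports)):
        raise ValueError(f"mooncake_rpc_port must be unique per instance: {ports}")


# Two instances with hypothetical ports; the second one registers its buffer.
instances = [
    pooling_extra_config(50051),
    pooling_extra_config(50052, register_buffer=True),
]
check_unique_ports(instances)
print(json.dumps({"kv_connector_extra_config": instances[0]}, indent=2))
```

Checking port uniqueness up front reflects the requirement that each instance has its own `mooncake_rpc_port`; a collision is easier to diagnose here than after the processes have started.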