diff --git a/docs/source/installation.md b/docs/source/installation.md
index 9cde789c..757752be 100644
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -261,8 +261,14 @@ for output in outputs:
 Then run:
 ```bash
-# Try `export VLLM_USE_MODELSCOPE=true` and `pip install modelscope`
-# to speed up download if huggingface is not reachable.
+python example.py
+```
+
+If you encounter a connection error with Hugging Face (e.g., `We couldn't connect to 'https://huggingface.co' to load the files, and couldn't find them in the cached files.`), run the following commands to use ModelScope as an alternative:
+
+```bash
+export VLLM_USE_MODELSCOPE=true
+pip install modelscope
 python example.py
 ```
diff --git a/docs/source/tutorials/multi_node_kimi.md b/docs/source/tutorials/multi_node_kimi.md
index 59bfa8f6..cb28bca9 100644
--- a/docs/source/tutorials/multi_node_kimi.md
+++ b/docs/source/tutorials/multi_node_kimi.md
@@ -5,7 +5,7 @@ Refer to [multi_node.md](https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_node.html#verification-process).
 ## Run with Docker
-Assume you have two Atlas 800 A3 (64G*16) or four A2 nodes, and want to deploy the `Kimi-K2-Instruct-W8A8` quantitative model across multiple nodes.
+Assume you have two Atlas 800 A3 (64G*16) or four A2 nodes, and want to deploy the `Kimi-K2-Instruct-W8A8` quantized model across multiple nodes.
 ```{code-block} bash
 :substitutions:
diff --git a/docs/source/user_guide/feature_guide/kv_pool_mooncake.md b/docs/source/user_guide/feature_guide/kv_pool_mooncake.md
index fcab156f..9188d7d1 100644
--- a/docs/source/user_guide/feature_guide/kv_pool_mooncake.md
+++ b/docs/source/user_guide/feature_guide/kv_pool_mooncake.md
@@ -21,10 +21,10 @@ Also, you need to set environment variables to point to them `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib64/python3.11/site-packages/mooncake`, or copy the .so files to the `/usr/local/lib64` directory after compilation
 ### KV Pooling Parameter Description
-**kv_connector_extra_config**:Additional Configurable Parameters for Pooling.
-**mooncake_rpc_port**:Port for RPC Communication Between Pooling Scheduler Process and Worker Process: Each Instance Requires a Unique Port Configuration.
-**load_async**:Whether to Enable Asynchronous Loading. The default value is false.
-**register_buffer**:Whether to Register Video Memory with the Backend. Registration is Not Required When Used with MooncakeConnectorV1; It is Required in All Other Cases. The Default Value is false.
+**kv_connector_extra_config**: Additional configurable parameters for pooling.
+**mooncake_rpc_port**: Port for RPC communication between the pooling scheduler process and the worker process. Each instance requires a unique port.
+**load_async**: Whether to enable asynchronous loading. The default value is false.
+**register_buffer**: Whether to register video memory with the backend. Registration is not required when used with MooncakeConnectorV1; it is required in all other cases. The default value is false.
 ## Run Mooncake Master
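
Reviewer note on the KV pooling parameters in `kv_pool_mooncake.md`: the sketch below shows one way the three documented keys could be assembled and sanity-checked before being passed as `kv_connector_extra_config`. It is only an illustration. The helper names `build_kv_connector_extra_config` and `check_unique_rpc_ports` are hypothetical, the port numbers are arbitrary, and the diff does not say whether the port should be an integer or a string, so an integer is assumed here.

```python
import json


def build_kv_connector_extra_config(mooncake_rpc_port, load_async=False, register_buffer=False):
    """Hypothetical helper: collect the documented keys into one dict.

    Defaults mirror the documented ones (both flags default to false).
    """
    port = int(mooncake_rpc_port)
    if not 1 <= port <= 65535:
        raise ValueError(f"mooncake_rpc_port out of range: {port}")
    return {
        "mooncake_rpc_port": port,  # assumption: passed as an integer
        "load_async": bool(load_async),
        "register_buffer": bool(register_buffer),
    }


def check_unique_rpc_ports(configs):
    """Raise if two instances share a port, since each instance needs its own."""
    ports = [cfg["mooncake_rpc_port"] for cfg in configs]
    duplicates = sorted({p for p in ports if ports.count(p) > 1})
    if duplicates:
        raise ValueError(f"duplicate mooncake_rpc_port values: {duplicates}")


if __name__ == "__main__":
    # Two illustrative instances with distinct, arbitrary ports.
    instances = [
        build_kv_connector_extra_config(50051),
        build_kv_connector_extra_config(50052, load_async=True),
    ]
    check_unique_rpc_ports(instances)
    print(json.dumps(instances[0]))
```

Per the parameter description, `register_buffer` would stay at its default when the pool is used with MooncakeConnectorV1 and be set to true in all other cases.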