[Doc] Update doc (#3836)
### What this PR does / why we need it?

Update doc.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/releases/v0.11.1

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
@@ -28,7 +28,7 @@ docker run --rm \
   -it $IMAGE bash
 ```

-Setup environment variables:
+Set up environment variables:

 ```bash
 # Set `max_split_size_mb` to reduce memory fragmentation and avoid out of memory
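The first hunk touches a step that sets `max_split_size_mb` to reduce allocator fragmentation. A minimal sketch of what such a step typically looks like, assuming the NPU backend reads a PyTorch-style allocator-config variable (the variable name `PYTORCH_NPU_ALLOC_CONF` and the value `256` are assumptions, not shown in this diff):

```shell
# Assumption: PYTORCH_NPU_ALLOC_CONF is the allocator-config variable read by
# the NPU PyTorch backend; 256 MB is an illustrative threshold.
# Blocks larger than max_split_size_mb are not split by the caching allocator,
# which reduces fragmentation and helps avoid out-of-memory errors.
export PYTORCH_NPU_ALLOC_CONF="max_split_size_mb:256"
```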
@@ -44,7 +44,7 @@ git clone https://gitcode.com/ascend-tribe/pangu-pro-moe-model.git

 ### Online Inference on Multi-NPU

-Run the following script to start the vLLM server on Multi-NPU:
+Run the following script to start the vLLM server on multi-NPU:

 ```bash
 vllm serve /path/to/pangu-pro-moe-model \
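The second hunk's serve command continues past this diff's context. A hedged sketch of a multi-NPU invocation: `--tensor-parallel-size` is a standard vLLM CLI flag, but the parallel size of 4 is an illustrative assumption, and the model path is kept as the placeholder from the doc:

```shell
# Build the multi-NPU serve command as an array so it can be inspected before
# launching. The parallel size (4) is an illustrative assumption; the model
# path is the placeholder used in the tutorial being edited.
SERVE_CMD=(vllm serve /path/to/pangu-pro-moe-model
           --tensor-parallel-size 4)
# To launch the server: "${SERVE_CMD[@]}"
echo "${SERVE_CMD[*]}"
```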