[Doc] Sensitive word modification (#8303)
### What this PR does / why we need it?

This PR updates the documentation to replace specific hardware terms (e.g., HBM, 910B, 310P) with more generic or branded terms (e.g., on-chip memory, Atlas inference products) to comply with sensitive word requirements.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
@@ -48,7 +48,7 @@ docker run --rm \
 ```
 
 :::{note}
-The 310P device is supported from version 0.15.0rc1. You need to select the corresponding image for installation.
+The Atlas 300 inference products are supported from version 0.15.0rc1. You need to select the corresponding image for installation.
 :::
 
 ## Deployment
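The note above says a version-specific image must be selected. As a purely hypothetical illustration of what that selection might look like, the sketch below pulls a device-specific tag; the registry path and tag name are assumptions, not taken from this PR, so check the project's release notes for the actual image names.

```shell
#!/bin/sh
# Hypothetical example only: registry, repository, and tag are assumptions.
# Pick the image tag that matches your device and the 0.15.0rc1 release.
IMAGE=quay.io/ascend/vllm-ascend:v0.15.0rc1-310p

docker pull "${IMAGE}"
```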
@@ -57,7 +57,7 @@ The 310P device is supported from version 0.15.0rc1. You need to select the corr
 
 #### Single NPU (PaddleOCR-VL)
 
-PaddleOCR-VL supports single-node single-card deployment on the 910B4 and 310P platform. Follow these steps to start the inference service:
+PaddleOCR-VL supports single-node single-card deployment on the 910B4 and Atlas 300 inference products platforms. Follow these steps to start the inference service:
 
 1. Prepare model weights: Ensure the downloaded model weights are stored in the `PaddleOCR-VL` directory.
 2. Create and execute the deployment script (save as `deploy.sh`):
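The guide's actual `deploy.sh` appears in the following hunk (it invokes `vllm serve ${MODEL_PATH}`); as a minimal self-contained sketch of the two steps above, the script below is an assumption about its shape — the weight path, served model name, and port are illustrative placeholders, not values from this document.

```shell
#!/bin/sh
# Minimal sketch of a deploy.sh; all paths and flag values are assumptions.
# Step 1 from the guide: weights are expected in the PaddleOCR-VL directory.
MODEL_PATH=./PaddleOCR-VL

# Step 2: start the inference service against those weights.
vllm serve "${MODEL_PATH}" \
    --served-model-name PaddleOCR-VL \
    --port 8000
```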
@@ -90,10 +90,10 @@ vllm serve ${MODEL_PATH} \
 ```
 
 ::::
-::::{tab-item} 310P
-:sync: 310P
+::::{tab-item} Atlas 300 inference products
+:sync: Atlas 300 inference products
 
-Run the following script to start the vLLM server on single 310P:
+Run the following script to start the vLLM server on a single Atlas 300 inference product:
 
 ```shell
 #!/bin/sh
@@ -112,7 +112,7 @@ vllm serve ${MODEL_PATH} \
 ```
 
 :::{note}
-The `--max_model_len` option is added to prevent errors when generating the attention operator mask on the 310P device.
+The `--max_model_len` option is added to prevent errors when generating the attention operator mask on the Atlas 300 inference products.
 :::
 
 ::::
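To make the note above concrete, here is a hedged fragment showing where the `--max_model_len` option sits in a serve command; the value `8192` and the `MODEL_PATH` variable are illustrative assumptions, not values from this guide, so size the limit to your model and memory budget.

```shell
#!/bin/sh
# Illustrative fragment only; the length value 8192 is an assumption.
# Capping the context length keeps the attention mask within what the
# device-side operator can generate, per the note above.
MODEL_PATH=./PaddleOCR-VL

vllm serve "${MODEL_PATH}" \
    --max_model_len 8192
```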
@@ -260,7 +260,7 @@ The 910B4 device supports inference using the PaddlePaddle framework.
 ::::{tab-item} OM inference
 :sync: om
 
-The 310P device supports only the OM model inference. For details about the process, see the guide provided in [ModelZoo](https://gitcode.com/Ascend/ModelZoo-PyTorch/tree/master/ACL_PyTorch/built-in/ocr/PP-DocLayoutV2).
+The Atlas 300 inference products support only OM model inference. For details about the process, see the guide provided in [ModelZoo](https://gitcode.com/Ascend/ModelZoo-PyTorch/tree/master/ACL_PyTorch/built-in/ocr/PP-DocLayoutV2).
 
 ::::
 :::::