[Doc] Sensitive word modification (#8303)

### What this PR does / why we need it?
This PR updates the documentation to replace specific hardware terms
(e.g., HBM, 910B, 310P) with more generic or branded terms (e.g.,
on-chip memory, Atlas inference products) to comply with sensitive word
requirements.
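Substitutions like these are mechanical, so they could be applied with a small script. The sketch below is illustrative only (the word list and in-place `sed` usage are assumptions, not part of this PR, which touched 11 doc files); it demonstrates the idea on a throwaway file:

```shell
#!/bin/sh
# Sketch: apply the sensitive-word substitutions mechanically.
# Word list is illustrative; the real PR also needed manual grammar review.
tmp=$(mktemp)
printf 'The 310P device needs more HBM.\n' > "$tmp"
sed -i \
    -e 's/310P device/Atlas 300 inference products/g' \
    -e 's/HBM/on-chip memory/g' \
    "$tmp"
cat "$tmp"   # -> The Atlas 300 inference products needs more on-chip memory.
rm -f "$tmp"
```

In practice a recursive `grep -rl` over the docs tree would feed such replacements, but each hit still needs manual review for agreement (e.g. singular "device" vs. plural "products").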

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
This commit is contained in:
herizhen
2026-04-17 16:30:00 +08:00
committed by GitHub
parent 9c1d58f4d2
commit 76cc2204bd
11 changed files with 31 additions and 31 deletions


@@ -48,7 +48,7 @@ docker run --rm \
```
:::{note}
-The 310P device is supported from version 0.15.0rc1. You need to select the corresponding image for installation.
+The Atlas 300 inference products are supported from version 0.15.0rc1. You need to select the corresponding image for installation.
:::
## Deployment
@@ -57,7 +57,7 @@ The 310P device is supported from version 0.15.0rc1. You need to select the corr
#### Single NPU (PaddleOCR-VL)
-PaddleOCR-VL supports single-node single-card deployment on the 910B4 and 310P platform. Follow these steps to start the inference service:
+PaddleOCR-VL supports single-node single-card deployment on the 910B4 and Atlas 300 inference products platform. Follow these steps to start the inference service:
1. Prepare model weights: Ensure the downloaded model weights are stored in the `PaddleOCR-VL` directory.
2. Create and execute the deployment script (save as `deploy.sh`):
@@ -90,10 +90,10 @@ vllm serve ${MODEL_PATH} \
```
::::
-::::{tab-item} 310P
-:sync: 310P
+::::{tab-item} Atlas 300 inference products
+:sync: Atlas 300 inference products
-Run the following script to start the vLLM server on single 310P:
+Run the following script to start the vLLM server on single Atlas 300 inference products:
```shell
#!/bin/sh
@@ -112,7 +112,7 @@ vllm serve ${MODEL_PATH} \
```
:::{note}
-The `--max_model_len` option is added to prevent errors when generating the attention operator mask on the 310P device.
+The `--max_model_len` option is added to prevent errors when generating the attention operator mask on the Atlas 300 inference products.
:::
::::
@@ -260,7 +260,7 @@ The 910B4 device supports inference using the PaddlePaddle framework.
::::{tab-item} OM inference
:sync: om
-The 310P device supports only the OM model inference. For details about the process, see the guide provided in [ModelZoo](https://gitcode.com/Ascend/ModelZoo-PyTorch/tree/master/ACL_PyTorch/built-in/ocr/PP-DocLayoutV2).
+The Atlas 300 inference products support only the OM model inference. For details about the process, see the guide provided in [ModelZoo](https://gitcode.com/Ascend/ModelZoo-PyTorch/tree/master/ACL_PyTorch/built-in/ocr/PP-DocLayoutV2).
::::
:::::