[Doc] Sensitive word modification (#8303)

### What this PR does / why we need it?
This PR updates the documentation to replace specific hardware terms
(e.g., HBM, 910B, 310P) with more generic or branded terms (e.g.,
on-chip memory, Atlas inference products) to comply with sensitive word
requirements.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
This commit is contained in:
herizhen
2026-04-17 16:30:00 +08:00
committed by GitHub
parent 9c1d58f4d2
commit 76cc2204bd
11 changed files with 31 additions and 31 deletions


@@ -36,7 +36,7 @@ On multisocket ARM systems, the OS scheduler may place vLLM threads on CPUs f
- Read cpuset from /proc/self/status.
- Read topo affinity from `npu-smi info -t topo`.
4. **Build CPU pools**:
-  - Use **global_slice** for A3 devices; **topo_affinity** for A2 and 310P.
+  - Use **global_slice** for A3 devices; **topo_affinity** for A2 and Atlas 300 inference products.
- If topo affinity is missing, fall back to global_slice.
- Ensure each NPU has at least 5 CPUs.
5. **Allocate per-role CPUs**:
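For illustration, the cpuset lookup described in the diff above (reading from `/proc/self/status`) could be sketched as follows. This is a minimal sketch, not code from the patch: the function names and the empty-list fallback are assumptions, and real allocation logic would also consult the topology affinity from `npu-smi`.

```python
def parse_cpu_list(cpus_allowed_list: str) -> list[int]:
    """Expand a kernel cpulist string such as "0-3,8,10-11" into CPU ids."""
    cpus = []
    for part in cpus_allowed_list.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_allowed_cpus(status_path: str = "/proc/self/status") -> list[int]:
    """Read the Cpus_allowed_list field exposed by the kernel for this process.

    Returns an empty list if the field is absent (hypothetical fallback,
    not taken from the patch).
    """
    with open(status_path) as f:
        for line in f:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1].strip())
    return []
```

The `Cpus_allowed_list` format ("0-3,8") is the kernel's standard cpulist notation, so the same parser applies to cpuset files under `/sys/fs/cgroup` as well.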