[Doc] Sensitive word modification (#8303)
### What this PR does / why we need it?

This PR updates the documentation to replace specific hardware terms (e.g., HBM, 910B, 310P) with more generic or branded terms (e.g., on-chip memory, Atlas inference products) to comply with sensitive word requirements.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
@@ -36,7 +36,7 @@ On multi‑socket ARM systems, the OS scheduler may place vLLM threads on CPUs f
 - Read cpuset from /proc/self/status.
 - Read topo affinity from `npu‑smi info -t topo`.
 4. **Build CPU pools**:
-- Use **global_slice** for A3 devices; **topo_affinity** for A2 and 310P.
+- Use **global_slice** for A3 devices; **topo_affinity** for A2 and Atlas 300 inference products.
 - If topo affinity is missing, fall back to global_slice.
 - Ensure each NPU has at least 5 CPUs.
 5. **Allocate per‑role CPUs**: