xc-llm-ascend/requirements.txt
Frank Chen 3da2ba22eb [Platform] Enable ARM-only CPU binding with NUMA-balanced A3 policy and update docs/tests (#6686)
### What this PR does / why we need it?

- Keeps enable_cpu_binding default on, but skips binding on non‑ARM CPUs
inside bind_cpus, with a clear log.
- Uses a table-driven binding policy: A3 uses NUMA‑balanced binding;
other device types use NUMA‑affinity binding.
- Updates docs to reflect the exact behavior and adds/updates unit tests
for the new logic.
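
A minimal sketch of how the ARM check and the table-driven policy described above could fit together. The names `BINDING_MODE_BY_DEVICE`, `DEFAULT_BINDING_MODE`, `is_arm_cpu`, and `select_binding_mode`, and the mode strings, are hypothetical and chosen for illustration; they are not taken from the PR's code, and the real `bind_cpus` may be structured differently.

```python
import platform

# Hypothetical policy table: device type -> binding mode.
# Only A3 is listed; every other device type falls back to the default.
BINDING_MODE_BY_DEVICE = {
    "A3": "numa_balanced",
}
DEFAULT_BINDING_MODE = "numa_affinity"


def is_arm_cpu(machine=None):
    """Return True if the machine string looks like an ARM architecture.

    `machine` defaults to platform.machine(); passing it explicitly
    makes the check easy to test.
    """
    arch = (machine if machine is not None else platform.machine()).lower()
    return arch.startswith(("aarch64", "arm"))


def select_binding_mode(device_type, machine=None):
    """Pick a binding mode, or None when binding should be skipped."""
    if not is_arm_cpu(machine):
        # Non-ARM CPUs: skip binding (the PR logs a message at this point).
        return None
    return BINDING_MODE_BY_DEVICE.get(device_type, DEFAULT_BINDING_MODE)
```

Keeping the policy in a lookup table means a new device type needs only a new table entry rather than another branch in the binding logic.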

### Does this PR introduce _any_ user-facing change?

- Yes. CPU binding is now enabled by default via additional_config, and
documented in the user guide.
- CPU binding behavior differs by device type (A3 vs. others).

### How was this patch tested?

Added/updated unit tests:

test_cpu_binding.py
1. test_binding_mode_table covers A2 vs A3 binding mode mapping.
2. test_build_cpu_pools_fallback_to_numa_balanced covers fallback when
affinity info is missing.
3. TestBindingSwitch.test_is_arm_cpu covers ARM/x86/unknown arch
detection.
4. test_bind_cpus_skip_non_arm covers non‑ARM skip path in bind_cpus.

test_worker_v1.py
1. Updated mocks for enable_cpu_binding default True to align with new
config default.

- vLLM version: v0.14.1
- vLLM main: d7de043

---------

Signed-off-by: chenchuw886 <chenchuw@huawei.com>
Co-authored-by: chenchuw886 <chenchuw@huawei.com>
2026-02-25 11:15:14 +08:00

40 lines
652 B
Plaintext

# Should be mirrored in pyproject.toml
cmake>=3.26
decorator
einops
numpy<2.0.0
packaging
pip
pybind11
pyyaml
scipy
pandas
psutil
setuptools>=64
setuptools-scm>=8
torch==2.9.0
torchvision
torchaudio
wheel
xgrammar>=0.1.30
pandas-stubs
opencv-python-headless<=4.11.0.86 # Required to avoid numpy version conflict with vllm
compressed_tensors>=0.11.0
# requirements for disaggregated prefill
msgpack
quart
# Required for N-gram speculative decoding
numba
# Install torch_npu
#--pre
#--extra-index-url https://mirrors.huaweicloud.com/ascend/repos/pypi
torch-npu==2.9.0
arctic-inference==0.1.1
transformers>=4.57.4
fastapi<0.124.0
triton-ascend==3.2.0