[Platform] Enable ARM-only CPU binding with NUMA-balanced A3 policy and update docs/tests (#6686)
### What this PR does / why we need it?
- Keeps enable_cpu_binding default on, but skips binding on non-ARM CPUs inside bind_cpus, with a clear log.
- Uses a table-driven binding policy: A3 uses NUMA-balanced binding; other device types use NUMA-affinity binding.
- Updates docs to reflect the exact behavior and adds/updates unit tests for the new logic.

### Does this PR introduce _any_ user-facing change?
- Yes. CPU binding is now enabled by default via additional_config, and documented in the user guide.
- CPU binding behavior differs by device type (A3 vs. others).

### How was this patch tested?
Added/updated unit tests:

test_cpu_binding.py
1. test_binding_mode_table covers A2 vs A3 binding mode mapping.
2. test_build_cpu_pools_fallback_to_numa_balanced covers fallback when affinity info is missing.
3. TestBindingSwitch.test_is_arm_cpu covers ARM/x86/unknown arch detection.
4. test_bind_cpus_skip_non_arm covers non-ARM skip path in bind_cpus.

test_worker_v1.py
1. Updated mocks for enable_cpu_binding default True to align with new config default.

- vLLM version: v0.14.1
- vLLM main: d7de043

---------

Signed-off-by: chenchuw886 <chenchuw@huawei.com>
Co-authored-by: chenchuw886 <chenchuw@huawei.com>
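The behavior described above (ARM-only binding, a table-driven policy, and a fallback when affinity information is missing) can be illustrated with a minimal sketch. This is not the code from this PR: `BindingMode`, `BINDING_MODE_TABLE`, `select_binding_mode`, and the signatures of `is_arm_cpu` and `bind_cpus` shown here are assumptions chosen for illustration, and the real implementation may be structured differently.

```python
import logging
import platform
from enum import Enum

logger = logging.getLogger(__name__)


class BindingMode(Enum):
    """Hypothetical names for the two policies mentioned in the PR text."""
    NUMA_AFFINITY = "numa_affinity"
    NUMA_BALANCED = "numa_balanced"


# Assumed table layout: device types listed here override the default policy.
BINDING_MODE_TABLE = {"A3": BindingMode.NUMA_BALANCED}
DEFAULT_BINDING_MODE = BindingMode.NUMA_AFFINITY


def is_arm_cpu(machine=None):
    """Return True if the machine string looks like an ARM architecture.

    `machine` defaults to platform.machine(); passing it explicitly makes
    the check easy to test on any host.
    """
    arch = (machine if machine is not None else platform.machine()).lower()
    return arch.startswith(("aarch64", "arm"))


def select_binding_mode(device_type, has_affinity_info=True):
    """Look up the policy for a device type, falling back to NUMA-balanced
    binding when affinity binding is selected but no affinity info exists."""
    mode = BINDING_MODE_TABLE.get(device_type, DEFAULT_BINDING_MODE)
    if mode is BindingMode.NUMA_AFFINITY and not has_affinity_info:
        return BindingMode.NUMA_BALANCED
    return mode


def bind_cpus(device_type, machine=None, has_affinity_info=True):
    """Return the policy that would be applied, or None when binding is skipped.

    The actual binding step (setting CPU affinity) is intentionally omitted.
    """
    if not is_arm_cpu(machine):
        logger.info("Skipping CPU binding on non-ARM CPU")
        return None
    return select_binding_mode(device_type, has_affinity_info)
```

Keeping the policy in a lookup table means a new device type needs only a new table entry, and the architecture check inside `bind_cpus` lets the config default stay on without affecting non-ARM hosts.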
@@ -70,7 +70,7 @@ class TestNPUWorker(TestBase):
 # Setup mock behavior
 mock_ops.register_dummy_fusion_op.return_value = None
 mock_ascend_config = MagicMock()
-mock_ascend_config.enable_cpu_binding = False
+mock_ascend_config.enable_cpu_binding = True
 mock_get_ascend_config.return_value = mock_ascend_config

 # Import and create NPUWorker instance
@@ -125,7 +125,7 @@ class TestNPUWorker(TestBase):
 self.model_config_mock.trust_remote_code = True
 mock_ops.register_dummy_fusion_op.return_value = None
 mock_ascend_config = MagicMock()
-mock_ascend_config.enable_cpu_binding = False
+mock_ascend_config.enable_cpu_binding = True
 mock_get_ascend_config.return_value = mock_ascend_config

 # Create NPUWorker instance
@@ -168,7 +168,7 @@ class TestNPUWorker(TestBase):
 self.cache_config_mock.cache_dtype = "float32"
 mock_ops.register_dummy_fusion_op.return_value = None
 mock_ascend_config = MagicMock()
-mock_ascend_config.enable_cpu_binding = False
+mock_ascend_config.enable_cpu_binding = True
 mock_get_ascend_config.return_value = mock_ascend_config

 # Create NPUWorker instance