[Refactor] Modify the binding logic and add memory migration and interrupt core binding functions. (#6785)

### What this PR does / why we need it?
Places memory on a closer NUMA node to lower memory access latency, and
binds interrupts to separate CPU cores to prevent them from interrupting
the inference process.
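
As a rough illustration of the two mechanisms described above, here is a minimal sketch that assumes a Linux host. It uses the standard `/proc/irq/<n>/smp_affinity_list` file for interrupt binding and the `migratepages` tool from the numactl package for memory migration. The helper names, the IRQ number, and the PID are hypothetical, and the PR itself may implement these steps differently.

```python
def format_id_list(ids):
    """Render ids as a kernel-style list string, e.g. [0, 1, 2, 5] -> '0-2,5'."""
    runs = []
    for i in sorted(set(ids)):
        if runs and i == runs[-1][1] + 1:
            runs[-1][1] = i          # extend the current run
        else:
            runs.append([i, i])      # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


def bind_irq(irq, cpus, dry_run=True):
    """Pin one IRQ to the given CPUs (hypothetical helper).

    With dry_run=False this writes to procfs and requires root.
    """
    path = f"/proc/irq/{irq}/smp_affinity_list"
    value = format_id_list(cpus)
    if not dry_run:
        with open(path, "w") as f:
            f.write(value)
    return path, value


def migrate_command(pid, from_nodes, to_node):
    """Build a `migratepages pid from-nodes to-nodes` command line (hypothetical helper)."""
    return ["migratepages", str(pid), format_id_list(from_nodes), str(to_node)]


# Dry run: show what would be written or executed without touching the system.
print(bind_irq(42, [0, 1]))
print(migrate_command(1234, [1, 2], 0))
```

The dry-run default keeps the sketch safe to execute without root; passing the command list to `subprocess.run` would perform the actual migration.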

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?


b8eaaa073b

Signed-off-by: rowzwel_dx <1392851715@qq.com>

Signed-off-by: Rozwel-dx <1392851715@qq.com>
- vLLM version: v0.15.0
- vLLM main: 9562912cea

Rozwel-dx
2026-02-26 08:49:50 +08:00
committed by GitHub
parent 3a4292e5b7
commit a9cca0c5c4
2 changed files with 86 additions and 5 deletions

@@ -162,11 +162,11 @@ class TestCpuAlloc(unittest.TestCase):
     @patch('vllm_ascend.cpu_binding.execute_command')
     def test_allocate(self, mock_execute_command):
         self.cpu_alloc.device_info.running_npu_list = [0]
-        self.cpu_alloc.npu_cpu_pool = {0: [0, 1, 2]}
+        self.cpu_alloc.npu_cpu_pool = {0: [0, 1, 2, 3, 4]}
         self.cpu_alloc.allocate()
-        self.assertEqual(self.cpu_alloc.assign_main[0], [0])
-        self.assertEqual(self.cpu_alloc.assign_acl[0], [1])
-        self.assertEqual(self.cpu_alloc.assign_rel[0], [2])
+        self.assertEqual(self.cpu_alloc.assign_main[0], [2])
+        self.assertEqual(self.cpu_alloc.assign_acl[0], [3])
+        self.assertEqual(self.cpu_alloc.assign_rel[0], [4])
         self.cpu_alloc.npu_cpu_pool = {0: [0, 1]}
         with self.assertRaises(RuntimeError):
             self.cpu_alloc.allocate()
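
The updated assertions are consistent with an allocator that sets aside the first two CPUs of each NPU's pool (presumably for interrupt binding) and assigns the remaining cores to the main, ACL, and release threads. The sketch below is one possible reading of that behavior, not the actual `allocate()` implementation. `split_pool` and its `irq_cores` parameter are hypothetical names, and other splits (for example, always taking the last three cores) would satisfy the same test.

```python
def split_pool(pool, irq_cores=2):
    """Hypothetical split of one NPU's CPU pool.

    Assumption: the first `irq_cores` CPUs are reserved, and the rest must
    cover at least one main, one ACL, and one release core.
    """
    usable = pool[irq_cores:]
    if len(usable) < 3:
        raise RuntimeError(f"CPU pool {pool} is too small to allocate")
    return {
        "main": usable[:-2],    # everything except the last two cores
        "acl": [usable[-2]],    # second-to-last core
        "rel": [usable[-1]],    # last core
    }


result = split_pool([0, 1, 2, 3, 4])
print(result)
```

With the five-core pool from the test this yields `main=[2]`, `acl=[3]`, `rel=[4]`, and a two-core pool raises `RuntimeError`, matching the assertions in the diff.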