[1/N][Refactor] torchair model runner refactor (#2205)

There is lot of torchair code in model runner leading the code hard for
maintenance. We'll create new torchair_model_runner to split torchair
related logic. Following the workflow #2203, this is the first PR.

What this PR does:

create the new torchair model runner, more function will be added later


- vLLM version: v0.10.0
- vLLM main:
586f286789

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
This commit is contained in:
wangxiyuan
2025-08-05 18:43:04 +08:00
committed by GitHub
parent 458ab2db12
commit 292fb8f696
4 changed files with 50 additions and 9 deletions

View File

@@ -130,17 +130,19 @@ class NPUWorker(WorkerBase):
self.cache_config.num_gpu_blocks = num_gpu_blocks
self.cache_config.num_cpu_blocks = num_cpu_blocks
def init_device(self):
def _init_device(self):
device = torch.device(f"npu:{self.local_rank}")
NPUPlatform.set_device(device)
NPUPlatform.empty_cache()
self.init_npu_memory = NPUPlatform.mem_get_info()[0]
# Initialize the distributed environment.
self._init_worker_distributed_environment()
# Set random seed.
NPUPlatform.seed_everything(self.model_config.seed)
return device
def init_device(self):
device = self._init_device()
# Init ModelRunner here, so that we have access to self.device.
self.model_runner = NPUModelRunner(self.vllm_config, device)