[1/N][Refactor] torchair model runner refactor (#2205)

The model runner contains a lot of torchair code, which makes it hard to maintain. This series creates a new torchair_model_runner to split out the torchair-related logic. Following the workflow in #2203, this is the first PR.

What this PR does: creates the new torchair model runner. More functions will be moved into it in later PRs.

- vLLM version: v0.10.0
- vLLM main: 586f286789

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-08-05 18:43:04 +08:00
parent 458ab2db12
commit 292fb8f696
4 changed files with 50 additions and 9 deletions
--- a/vllm_ascend/torchair/torchair_worker.py
+++ b/vllm_ascend/torchair/torchair_worker.py
@@ -17,6 +17,7 @@ import torch
 from vllm.logger import logger

 import vllm_ascend.envs as envs_ascend
+from vllm_ascend.torchair.torchair_model_runner import NPUTorchairModelRunner
 from vllm_ascend.torchair.utils import (check_kv_cache_bytes_cache_exist,
                                        check_torchair_cache_exist,
                                        delete_torchair_cache_file,
@@ -52,3 +53,9 @@ class NPUTorchairWorker(NPUWorker):
        self.model_runner.new_kv_cache_bytes = available_kv_cache_memory

        return available_kv_cache_memory
+
+    def init_device(self):
+        """Override init_device to init torchair model runner"""
+        device = self._init_device()
+        # Init ModelRunner here, so that we have access to self.device.
+        self.model_runner = NPUTorchairModelRunner(self.vllm_config, device)
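
The override above can be illustrated in isolation. The following is a minimal sketch, not the vllm-ascend implementation: `Worker`, `ModelRunner`, `TorchairWorker`, `TorchairModelRunner`, the `"npu:0"` device string, and the shape of the base `init_device` are all hypothetical stand-ins. It only assumes, as the diff suggests, that the base worker exposes an `_init_device()` helper returning the device, so a subclass can reuse device setup and swap in a different runner class.

```python
class ModelRunner:
    """Hypothetical stand-in for the generic model runner."""

    def __init__(self, config, device):
        self.config = config
        self.device = device


class TorchairModelRunner(ModelRunner):
    """Hypothetical stand-in; torchair-specific logic would live here."""


class Worker:
    """Hypothetical stand-in for the base worker."""

    def __init__(self, config):
        self.config = config
        self.model_runner = None

    def _init_device(self):
        # Assumed helper: performs device setup and returns the device.
        return "npu:0"

    def init_device(self):
        device = self._init_device()
        self.model_runner = ModelRunner(self.config, device)


class TorchairWorker(Worker):
    def init_device(self):
        # Reuse the base device setup, but build the specialized runner.
        device = self._init_device()
        self.model_runner = TorchairModelRunner(self.config, device)


worker = TorchairWorker({"model": "example"})
worker.init_device()
```

Keeping device setup in a separate helper means the subclass only replaces the runner construction instead of duplicating the whole initialization path.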