[1/N][Refactor] torchair model runner refactor (#2205)
There is a lot of torchair code in the model runner, which makes the code
hard to maintain. We'll create a new torchair_model_runner to split out the
torchair-related logic. Following the workflow in #2203, this is the first PR.
What this PR does:
Create the new torchair model runner; more functions will be added later.
- vLLM version: v0.10.0
- vLLM main: 586f286789
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@@ -17,6 +17,7 @@ import torch
 from vllm.logger import logger

 import vllm_ascend.envs as envs_ascend
+from vllm_ascend.torchair.torchair_model_runner import NPUTorchairModelRunner
 from vllm_ascend.torchair.utils import (check_kv_cache_bytes_cache_exist,
                                         check_torchair_cache_exist,
                                         delete_torchair_cache_file,
@@ -52,3 +53,9 @@ class NPUTorchairWorker(NPUWorker):
         self.model_runner.new_kv_cache_bytes = available_kv_cache_memory

         return available_kv_cache_memory
+
+    def init_device(self):
+        """Override init_device to init torchair model runner"""
+        device = self._init_device()
+        # Init ModelRunner here, so that we have access to self.device.
+        self.model_runner = NPUTorchairModelRunner(self.vllm_config, device)
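
The init_device override in this diff follows a common pattern: the subclassed worker reuses the parent's device setup and only swaps in a specialised model runner. The sketch below illustrates that pattern in isolation. All class names and the "npu:0" device string are hypothetical stand-ins, not the actual vllm-ascend classes, and the parent worker's real init_device may do more than shown here.

```python
class ModelRunner:
    """Hypothetical stand-in for a generic model runner."""

    def __init__(self, config: dict, device: str) -> None:
        self.config = config
        self.device = device

    def name(self) -> str:
        return "base"


class TorchairModelRunner(ModelRunner):
    """Hypothetical stand-in for a runner holding torchair-specific logic."""

    def name(self) -> str:
        return "torchair"


class Worker:
    """Hypothetical stand-in for the parent worker."""

    def __init__(self, config: dict) -> None:
        self.config = config
        self.model_runner = None

    def _init_device(self) -> str:
        # Assumption: device setup lives in a private helper that subclasses reuse.
        return "npu:0"

    def init_device(self) -> None:
        device = self._init_device()
        self.model_runner = ModelRunner(self.config, device)


class TorchairWorker(Worker):
    """Overrides init_device only to choose a different runner class."""

    def init_device(self) -> None:
        device = self._init_device()
        # Build the runner here so that the device is already known.
        self.model_runner = TorchairModelRunner(self.config, device)


worker = TorchairWorker({"model": "demo"})
worker.init_device()
print(type(worker.model_runner).__name__)  # prints: TorchairModelRunner
```

Keeping the runner choice inside a single overridden method means further torchair logic can move into the runner subclass later without touching the parent worker.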