[BugFix] Fix ascend config check (#1092)

Fix the ascend config check logic:
1. refactor check_ascend_config to make it clear:
    1. torchair graph should not work with enforce_eager=True
    2. aclgraph should not work with torchair graph
3. add refresh config for rlhf case
4. fix a typo in model runner
5. change expert_tensor_parallel_size default to 0 to keep the same as before

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-06 18:54:37 +08:00
parent 973f993a13
commit dab19d5dca
5 changed files with 136 additions and 42 deletions
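The two mutual-exclusion rules named in the commit message can be illustrated with a minimal sketch. This is not the actual check_ascend_config from vllm_ascend: the classes, the `check_graph_modes` function, the `aclgraph_enabled` flag, and the error messages are all hypothetical stand-ins, assuming only what the message states (torchair graph conflicts with enforce_eager=True and with aclgraph, and expert_tensor_parallel_size defaults to 0).

```python
from dataclasses import dataclass, field


@dataclass
class TorchairGraphConfig:
    # Hypothetical stand-in for the torchair graph options.
    enabled: bool = False
    use_cached_graph: bool = False


@dataclass
class AscendConfig:
    # Hypothetical stand-in; not the real vllm_ascend class.
    torchair_graph_config: TorchairGraphConfig = field(
        default_factory=TorchairGraphConfig)
    # Default of 0, as stated in the commit message.
    expert_tensor_parallel_size: int = 0


def check_graph_modes(ascend_config: AscendConfig,
                      enforce_eager: bool,
                      aclgraph_enabled: bool) -> None:
    """Reject graph-mode combinations the commit message rules out.

    The function name and the aclgraph_enabled flag are assumptions
    made for this illustration.
    """
    torchair_enabled = ascend_config.torchair_graph_config.enabled

    # Rule 1: torchair graph should not work with enforce_eager=True.
    if torchair_enabled and enforce_eager:
        raise ValueError(
            "torchair graph cannot be combined with enforce_eager=True")

    # Rule 2: aclgraph should not work with torchair graph.
    if torchair_enabled and aclgraph_enabled:
        raise ValueError("aclgraph cannot be combined with torchair graph")
```

Under these assumptions, a default `AscendConfig()` passes with any eager setting, while enabling torchair graph together with `enforce_eager=True` or with aclgraph raises `ValueError`. Raising early at configuration time keeps an invalid combination from reaching the model runner.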
--- a/vllm_ascend/worker/model_runner_v1.py
+++ b/vllm_ascend/worker/model_runner_v1.py
@@ -323,7 +323,7 @@ class NPUModelRunner(LoRAModelRunnerMixin):

        ascend_config = get_ascend_config()
        self.torchair_graph_enabled = ascend_config.torchair_graph_config.enabled and self.vllm_config.model_config.use_mla
-        self.torchair_graph_use_cached_npu_graph = ascend_config.torchair_graph_config.use_cached_graph
+        self.use_cached_npu_graph = ascend_config.torchair_graph_config.use_cached_graph
        self.torchair_graph_batch_sizes = ascend_config.torchair_graph_config.graph_batch_sizes

        if ascend_config.torchair_graph_config.graph_batch_sizes_init: