[Chore] Prevents use of ASCEND_LAUNCH_BLOCKING with ACL Graph (#3574)

### What this PR does / why we need it?
Adds a validation check to prevent running with an incompatible configuration. The `ASCEND_LAUNCH_BLOCKING=1` environment variable, used for debugging, enforces synchronous execution, which is incompatible with ACL Graph. This change raises an explicit error to inform the user about the conflict and how to resolve it, preventing a more obscure failure later.

### Does this PR introduce _any_ user-facing change?
None.

### How was this patch tested?
None needed.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
2025-10-21 20:17:33 +08:00
parent 220df60c61
commit ef3fabf399
1 changed files with 11 additions and 0 deletions
--- a/vllm_ascend/platform.py
+++ b/vllm_ascend/platform.py
@@ -270,6 +270,17 @@ class NPUPlatform(Platform):
            compilation_config.cudagraph_mode = CUDAGraphMode.NONE
            compilation_config.level = CompilationLevel.NO_COMPILATION

+        # TODO: Remove this check when ACL Graph supports ASCEND_LAUNCH_BLOCKING=1
+        # Then, we will have to discuss the error handling strategy and user experience
+        if compilation_config.cudagraph_mode != CUDAGraphMode.NONE and \
+            os.environ.get("ASCEND_LAUNCH_BLOCKING", "0") == "1":
+            raise ValueError(
+                "ACL graph is incompatible with ASCEND_LAUNCH_BLOCKING=1. "
+                "Please unset ASCEND_LAUNCH_BLOCKING or set it to 0. If you "
+                "need ASCEND_LAUNCH_BLOCKING for debugging, consider other methods — "
+                "for example, check the plog files (default: $HOME/ascend/log/debug) "
+                "for more information about runtime errors.")
+
        if parallel_config and parallel_config.worker_cls == "auto":
            # TODO: this is a tricky way to disable `use_sequence_parallel_moe` in vllm.
            os.environ["VLLM_ALL2ALL_BACKEND"] = "flashinfer_all2allv"
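
The check added in this hunk can be exercised in isolation. The following is a minimal sketch, not the code in `vllm_ascend/platform.py`: `GraphMode` is a hypothetical stand-in for vLLM's `CUDAGraphMode`, and `check_launch_blocking` with its `environ` parameter is an assumed helper introduced only so the condition can be tested without touching the real process environment.

```python
import os
from enum import Enum


class GraphMode(Enum):
    """Hypothetical stand-in for CUDAGraphMode; only NONE vs. not-NONE matters."""
    NONE = 0
    PIECEWISE = 1


def check_launch_blocking(graph_mode, environ=None):
    """Raise ValueError if graph capture is enabled while ASCEND_LAUNCH_BLOCKING=1.

    `environ` is an assumption of this sketch (any mapping); it defaults to
    os.environ, which is what the patch reads directly.
    """
    env = os.environ if environ is None else environ
    # Same condition as the patch: graph mode is active AND the variable is exactly "1".
    if graph_mode != GraphMode.NONE and env.get("ASCEND_LAUNCH_BLOCKING", "0") == "1":
        raise ValueError(
            "ACL graph is incompatible with ASCEND_LAUNCH_BLOCKING=1. "
            "Please unset ASCEND_LAUNCH_BLOCKING or set it to 0.")


# Graph mode disabled: the variable is ignored and nothing is raised.
check_launch_blocking(GraphMode.NONE, {"ASCEND_LAUNCH_BLOCKING": "1"})

# Graph mode enabled with blocking launches: the conflict is reported up front.
try:
    check_launch_blocking(GraphMode.PIECEWISE, {"ASCEND_LAUNCH_BLOCKING": "1"})
except ValueError as exc:
    print(f"rejected: {exc}")
```

As in the patch, the comparison is an exact string match against `"1"`, so an unset variable or a value of `"0"` passes the check.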