[Chore] Prevents use of ASCEND_LAUNCH_BLOCKING with ACL Graph (#3574)
### What this PR does / why we need it?
Adds a validation check to prevent running with an incompatible configuration. The `ASCEND_LAUNCH_BLOCKING=1` environment variable, used for debugging, enforces synchronous execution which is incompatible with ACL Graph. This change raises an explicit error to inform the user about the conflict and how to resolve it, preventing a more obscure failure later.

### Does this PR introduce _any_ user-facing change?
None.

### How was this patch tested?
None needed.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
@@ -270,6 +270,17 @@ class NPUPlatform(Platform):
             compilation_config.cudagraph_mode = CUDAGraphMode.NONE
             compilation_config.level = CompilationLevel.NO_COMPILATION

+        # TODO: Remove this check when ACL Graph supports ASCEND_LAUNCH_BLOCKING=1
+        # Then, we will have to discuss the error handling strategy and user experience
+        if compilation_config.cudagraph_mode != CUDAGraphMode.NONE and \
+            os.environ.get("ASCEND_LAUNCH_BLOCKING", "0") == "1":
+            raise ValueError(
+                "ACL graph is incompatible with ASCEND_LAUNCH_BLOCKING=1. "
+                "Please unset ASCEND_LAUNCH_BLOCKING or set it to 0. If you "
+                "need ASCEND_LAUNCH_BLOCKING for debugging, consider other methods — "
+                "for example, check the plog files (default: $HOME/ascend/log/debug) "
+                "for more information about runtime errors.")
+
         if parallel_config and parallel_config.worker_cls == "auto":
             # TODO: this is a tricky way to disable `use_sequence_parallel_moe` in vllm.
             os.environ["VLLM_ALL2ALL_BACKEND"] = "flashinfer_all2allv"
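
The check in this hunk can be exercised in isolation. The sketch below is an illustration under stated assumptions, not the code in `NPUPlatform`. `GraphMode` is a hypothetical stand-in for vLLM's `CUDAGraphMode`, and `check_launch_blocking` is a hypothetical helper that takes the environment as a parameter so it can be tested without touching the real process environment. The error message is shortened.

```python
import os
from enum import Enum


class GraphMode(Enum):
    """Hypothetical stand-in for vLLM's CUDAGraphMode (assumption)."""
    NONE = 0
    PIECEWISE = 1


def check_launch_blocking(graph_mode, environ=None):
    """Raise ValueError when graph mode is enabled and ASCEND_LAUNCH_BLOCKING=1.

    `environ` defaults to os.environ; passing a plain dict makes the
    check easy to test.
    """
    env = os.environ if environ is None else environ
    blocking = env.get("ASCEND_LAUNCH_BLOCKING", "0") == "1"
    if graph_mode != GraphMode.NONE and blocking:
        raise ValueError(
            "ACL graph is incompatible with ASCEND_LAUNCH_BLOCKING=1. "
            "Please unset ASCEND_LAUNCH_BLOCKING or set it to 0.")


# Graph mode disabled: the variable is allowed.
check_launch_blocking(GraphMode.NONE, {"ASCEND_LAUNCH_BLOCKING": "1"})

# Graph mode enabled and variable unset or "0": allowed.
check_launch_blocking(GraphMode.PIECEWISE, {})
check_launch_blocking(GraphMode.PIECEWISE, {"ASCEND_LAUNCH_BLOCKING": "0"})
```

As in the hunk, only the exact string `"1"` triggers the error; any other value, or an unset variable, passes. In the real code the check sits after the branch that sets `cudagraph_mode` to `CUDAGraphMode.NONE`, so configurations that have already fallen back to eager execution pass it.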