[feat] support dispatch_v2/combine_v2 hierarchy communication (#7698)

### What this PR does / why we need it?

This PR adds hierarchical communication support for the `dispatch_v2`
and `combine_v2` MoE operations through a new configuration option,
`enable_mc2_hierarchy_comm`. When it is enabled, the communication
algorithm is set to "hierarchy", which supports MC2 op communication
between two super pods.

The changes include:
- Adding `enable_mc2_hierarchy_comm` to `AscendConfig`.
- Modifying `TokenDispatcherWithMC2` to pass `comm_alg: "hierarchy"` to
the underlying `torch_npu` ops when the new config is enabled.
- Adding validation to ensure that this feature is only used with
compatible PTA/CANN versions and is not used with the conflicting
`fused_mc2` op.
- Updating `is_hierarchical_communication_enabled` to respect the new
configuration flag.
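
The validation and dispatch behavior listed above can be sketched roughly as follows. This is a simplified, self-contained illustration and not the code from this PR: `SketchAscendConfig`, `check_mc2_hierarchy_compat`, and `build_mc2_kwargs` are hypothetical names, and the PTA/CANN version check is omitted. Only the flag name `enable_mc2_hierarchy_comm`, the `comm_alg: "hierarchy"` argument, and the conflict with `VLLM_ASCEND_ENABLE_FUSED_MC2` are taken from the description.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SketchAscendConfig:
    """Hypothetical stand-in for AscendConfig; only the new flag is modeled."""

    enable_mc2_hierarchy_comm: bool = False


def check_mc2_hierarchy_compat(config: SketchAscendConfig, fused_mc2_enabled: bool) -> None:
    """Reject the combination of hierarchy communication and the fused mc2 op."""
    if config.enable_mc2_hierarchy_comm and fused_mc2_enabled:
        raise ValueError(
            "fused mc2 op cannot be used with hierarchy communication. "
            "Please disable VLLM_ASCEND_ENABLE_FUSED_MC2 by setting it to 0."
        )


def build_mc2_kwargs(config: SketchAscendConfig, base_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the op kwargs, adding comm_alg only when the flag is on."""
    kwargs = dict(base_kwargs)  # copy so the caller's dict is not mutated
    if config.enable_mc2_hierarchy_comm:
        kwargs["comm_alg"] = "hierarchy"
    return kwargs
```

With the flag off, the kwargs pass through unchanged, so existing deployments would keep their current communication algorithm.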

### Does this PR introduce _any_ user-facing change?

Yes, this PR introduces a new user-facing configuration option
`enable_mc2_hierarchy_comm` in `additional_config` to enable
hierarchical communication for MoE.
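
As an illustration, the snippet below builds the `additional_config` payload with the new option turned on. It is a minimal sketch that assumes the flag takes a boolean value and that the payload is handed to vLLM either as the `additional_config` argument or as a JSON string on the command line; neither detail is spelled out in this description.

```python
import json

# Assumption: the flag is a boolean inside additional_config.
additional_config = {"enable_mc2_hierarchy_comm": True}

# JSON form, e.g. for passing the config as a command-line string.
cli_value = json.dumps(additional_config)
print(cli_value)  # {"enable_mc2_hierarchy_comm": true}
```

Because the fused mc2 op conflicts with this feature, `VLLM_ASCEND_ENABLE_FUSED_MC2` must be set to 0 when the option is enabled.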

### How was this patch tested?

- vLLM version: v0.18.0

Signed-off-by: zzzzwwjj <1183291235@qq.com>
zzzzwwjj
2026-03-27 09:20:16 +08:00
committed by GitHub
parent 0bab629f90
commit a40eee2ba1
5 changed files with 30 additions and 3 deletions


@@ -30,6 +30,7 @@ from vllm.platforms import Platform, PlatformEnum
# todo: please remove it when solve cuda hard code in vllm
os.environ["VLLM_DISABLE_SHARED_EXPERTS_STREAM"] = "1"
import vllm_ascend.envs as envs_ascend
from vllm_ascend.ascend_config import init_ascend_config
# isort: off
@@ -511,6 +512,12 @@ class NPUPlatform(Platform):
):
speculative_config.enforce_eager = False
if ascend_config.enable_mc2_hierarchy_comm and envs_ascend.VLLM_ASCEND_ENABLE_FUSED_MC2:
raise ValueError(
"fused mc2 op cannot be used with hierarchy communication. "
"Please disable VLLM_ASCEND_ENABLE_FUSED_MC2 by setting it to 0."
)
@classmethod
def import_kernels(cls) -> None:
# Directly importing vllm_ascend_C prevents ASCEND_RT_VISIBLE_DEVICES