[bugfix] Fix some bugs that may cause runs to fail (#896)

### What this PR does / why we need it?
Fixes a bug in which the graph mode was the same for p and d, along with
some other bugs.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Tested with the end-to-end tests.

Signed-off-by: ningbenzhe1 <ningbenzhe@huawei.com>
NINGBENZHE
2025-06-03 11:07:33 +08:00
committed by GitHub
parent 92bc5576d8
commit 6ec64a3f96
7 changed files with 15 additions and 7 deletions

@@ -66,8 +66,7 @@ def fused_experts_with_mc2(
local_rank = torch.distributed.get_rank(group=ep_group)
all_to_all_group_size = torch.distributed.get_world_size(ep_group)
-    world_szie = torch.distributed.get_world_size()
-    tp_size = world_szie // all_to_all_group_size
+    tp_size = get_etp_group().world_size
tp_rank = rank % tp_size
stage1_kwargs = {