Files
xc-llm-ascend/vllm_ascend/eplb/core
LI SHENGYONG bdc721d35a [smoke][bugfix] moe_init_routing_v2 active_expert_range use int type (#5521)
### What this PR does / why we need it?
The float kernel of `moe_init_routing_v2` in the dispatch-allgather
path does not accept a tensor for `active_expert_range`; it only
accepts plain ints.
PR #5311 unified the variables `local_num_experts` and
`self.local_num_experts` by using `self.local_num_experts`
consistently, which caused the downstream integer-typed parameter to be
passed as a tensor. This PR restores the int type for that argument.
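The fix boils down to coercing possibly tensor-typed bounds back to plain Python ints before the kernel call. The sketch below is a hypothetical illustration (the helper name `as_int_range` and the duck-typed `.item()` check are assumptions, not the actual vllm-ascend code):

```python
def as_int_range(start, end):
    """Coerce possibly tensor-like bounds to plain ints.

    Hypothetical helper: anything exposing .item() (e.g. a 0-d
    torch.Tensor) is unwrapped, since the NPU kernel only accepts
    int-typed active_expert_range bounds.
    """
    def to_int(v):
        return int(v.item()) if hasattr(v, "item") else int(v)
    return [to_int(start), to_int(end)]
```

With this guard in place, the range can be built from either `local_num_experts` (an int) or a tensor-valued `self.local_num_experts` without changing the kernel-facing type.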

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
gsm8k | exact_match,strict-match: ground_truth=0.89 | measured=0.8939 | success=
gsm8k | exact_match,flexible-extract: ground_truth=0.85 | measured=0.856 | success=
ceval-valid | acc,none: ground_truth=0.84 | measured=0.8373 | success=
Model Parameters:
{'pretrained': 'Qwen/Qwen3-30B-A3B', 'tensor_parallel_size': 2, 'dtype':
'auto', 'trust_remote_code': False, 'max_model_len': 4096,
'gpu_memory_utilization': 0.6, 'enable_expert_parallel': True}

- vLLM version: v0.13.0
- vLLM main:
45c1ca1ca1

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
2025-12-31 09:19:04 +08:00