xc-llm-ascend

Files

huangxialu 6881c19458 [main] convert the format of gmm to nz (#2474)

### What this PR does / why we need it?
Convert the format of GMM (grouped matmul) to NZ.

### Does this PR introduce _any_ user-facing change?
No user-facing change.

### How was this patch tested?
Unit test: test_fused_ops.py; e2e test: test_fused_moe.py

**performance**:
(qwen3 30B, 2k->20k)

base:
Total Token throughput (tok/s):          719.93

gmm nz:
Total Token throughput (tok/s):          728.52
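
To put the two throughput figures above in relative terms, here is a minimal sketch of the arithmetic. The helper name `relative_gain_percent` is hypothetical and not part of this repository; only the two numbers come from the benchmark above.

```python
# Figures copied from the benchmark above (qwen3 30B, 2k->20k).
base_tok_s = 719.93
gmm_nz_tok_s = 728.52


def relative_gain_percent(baseline: float, candidate: float) -> float:
    """Return the throughput change of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


gain = relative_gain_percent(base_tok_s, gmm_nz_tok_s)
print(f"gmm nz vs base: {gain:+.2f}%")  # prints "gmm nz vs base: +1.19%"
```

This is a single-run comparison, so the roughly 1.2% difference should be read as indicative rather than as a measured confidence bound.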


- vLLM version: v0.10.1.1
- vLLM main: bfc1edc9f5

Signed-off-by: huangxialu <huangxialu1@huawei.com>

2025-08-27 11:25:02 +08:00

expert_map.json

Add unit test local cpu guide and enable base testcase (#1566)

2025-07-06 10:42:27 +08:00

test_activation.py

[1/N][CustomOp] Register activation customop instead of overwrite forward_oot (#1841)

2025-07-18 23:07:14 +08:00

test_expert_load_balancer.py

Add unit test local cpu guide and enable base testcase (#1566)

2025-07-06 10:42:27 +08:00

test_fused_ops.py

[main] convert the format of gmm to nz (#2474)