xc-llm-ascend

Files

zzzzwwjj 052e472453 [bugfix] fix w8a8dynamic fused_moe trans nz (#5199)

### What this PR does / why we need it?
Currently, `torch_npu.npu_grouped_matmul_swiglu_quant` only supports
weights in NZ format, so `w13_weight` and `w2_weight` must always be
converted to NZ.
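
A minimal sketch of the unconditional conversion described above, not the code from this PR. It assumes a hypothetical injected `format_cast(tensor, fmt)` callable standing in for the device format-cast call, a hypothetical helper name `force_moe_weights_to_nz`, and an assumed NZ format id of 29. Only `w13_weight` and `w2_weight` come from the description.

```python
from types import SimpleNamespace

# Assumption: numeric id used for the NZ layout; not stated in the PR text.
ACL_FORMAT_FRACTAL_NZ = 29


def force_moe_weights_to_nz(layer, format_cast):
    """Convert both fused-MoE weights to NZ, with no opt-out condition.

    `format_cast(tensor, fmt)` is a hypothetical stand-in for the device
    format-cast call; injecting it lets the sketch run without NPU hardware.
    """
    for name in ("w13_weight", "w2_weight"):
        converted = format_cast(getattr(layer, name), ACL_FORMAT_FRACTAL_NZ)
        setattr(layer, name, converted)
    return layer


def fake_cast(tensor, fmt):
    # Test double: records the requested format instead of changing layout.
    return (tensor, fmt)


layer = SimpleNamespace(w13_weight="w13", w2_weight="w2")
force_moe_weights_to_nz(layer, fake_cast)
```

On real hardware the injected callable would be replaced by the actual format-cast operation; the point of the sketch is only that both weights are converted regardless of any configuration switch.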

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: zzzzwwjj <1183291235@qq.com>

2025-12-22 17:45:34 +08:00

test_quant_config.py

[2/N][Pangu][MoE] Remove Pangu Related Code (#5130)

2025-12-19 09:00:07 +08:00

test_utils.py

[2/N][Pangu][MoE] Remove Pangu Related Code (#5130)

2025-12-19 09:00:07 +08:00

test_w4a4_flatquant_dynamic.py

[refactor] refactor weight trans nz and transpose (#4878)

2025-12-19 14:27:24 +08:00

test_w4a8_dynamic.py

[refactor] refactor weight trans nz and transpose (#4878)