Files
xc-llm-ascend/tests/ut/quantization
zzzzwwjj 052e472453 [bugfix] fix w8a8dynamic fused_moe trans nz (#5199)
### What this PR does / why we need it?
Currently, `torch_npu.npu_grouped_matmul_swiglu_quant` can only support
weight nz, so we need to trans w13_weight, w2_weight to nz forcely.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

Signed-off-by: zzzzwwjj <1183291235@qq.com>
2025-12-22 17:45:34 +08:00
..