[main][bugfix] Fix MatmulNZ format bug on some machines (#2549)

### What this PR does / why we need it?
This PR fixes a bug where quantmatmul failed to run with the NZ format
on some machines. It sets `torch.npu.config.allow_internal_format = True`
so that tensors can be initialized and cast with internal formats such
as NZ.

### How was this patch tested?
CI passed with the existing tests.

- vLLM version: v0.10.1.1
- vLLM main: b5d34af328

Signed-off-by: rjg-lyh <1318825571@qq.com>
rjg-lyh
2025-08-27 09:08:17 +08:00
committed by GitHub
parent 042605f4b2
commit 358ba68994

@@ -112,6 +112,9 @@ import torch_npu
import vllm_ascend.envs as envs_ascend
# if true, allow tensor initialization and casting with internal format (e.g., NZ)
torch.npu.config.allow_internal_format = True
if is_310p():
    torch_npu.npu.set_compile_mode(jit_compile=False)
    ACL_FORMAT = ACL_FORMAT_FRACTAL_NZ
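
The hunk above is cut off after the `if` branch, so the following is only a minimal sketch of the pattern it appears to follow, not the repository's actual code. The helper names `select_acl_format` and `enable_internal_format` are hypothetical, the numeric format values are assumed to mirror the ACL format enum, and the ND fallback for other devices is an assumption because the `else` branch is not visible in the diff.

```python
# Stand-in constants; the values are an assumption about the ACL format enum.
ACL_FORMAT_FRACTAL_ND = 2
ACL_FORMAT_FRACTAL_NZ = 29


def select_acl_format(on_310p: bool) -> int:
    """Hypothetical helper: choose NZ on 310P devices, ND otherwise (assumed fallback)."""
    return ACL_FORMAT_FRACTAL_NZ if on_310p else ACL_FORMAT_FRACTAL_ND


def enable_internal_format() -> bool:
    """Set the flag shown in the diff when torch_npu is importable.

    Returns True if the flag was set, False if torch or torch_npu is missing.
    """
    try:
        import torch
        import torch_npu  # noqa: F401  (importing it provides the torch.npu namespace)
    except ImportError:
        return False
    # Same assignment as in the diff: allow tensors to use internal formats such as NZ.
    torch.npu.config.allow_internal_format = True
    return True


if __name__ == "__main__":
    print("internal format enabled:", enable_internal_format())
    print("format on 310P:", select_acl_format(True))
    print("format elsewhere:", select_acl_format(False))
```

The import is guarded so the sketch can run on a machine without Ascend hardware. In the diff the flag is set unconditionally at the point where it appears.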