[BugFix] Fixes Qwen3-Next enable nz accuracy problem (#4058)
### What this PR does / why we need it?
- Fixes the Qwen3-Next accuracy problem that occurs when nz is enabled
### Does this PR introduce _any_ user-facing change?
N/A
- vLLM version: v0.11.0
- vLLM main: 83f478bb19
---------
Signed-off-by: Icey <1790571317@qq.com>
Signed-off-by: wxsIcey <1790571317@qq.com>
@@ -260,6 +260,7 @@ class TestAscendW4A8DynamicFusedMoEMethod(TestBase):
requires_grad=False)
return layer

@patch('vllm_ascend.utils._ENABLE_NZ', False)
@patch('torch_npu.npu_format_cast')
@patch('torch_npu.npu_quantize')
@patch('torch.Tensor.npu')
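The decorator stack in this hunk mixes two forms of `unittest.mock.patch`: one with an explicit replacement value (`False`) and several without. A minimal sketch of how the two forms behave, assuming a hypothetical stand-in module `fake_utils` with an illustrative `format_cast` function and `prepare_weight` helper (none of these names come from vllm-ascend):

```python
import sys
import types
import unittest
from unittest.mock import patch

# Hypothetical stand-in module; it only mimics a module-level flag
# such as _ENABLE_NZ and a cast function that tests want to mock.
fake_utils = types.ModuleType("fake_utils")
fake_utils._ENABLE_NZ = True
fake_utils.format_cast = lambda x: x
sys.modules["fake_utils"] = fake_utils


def prepare_weight(weight):
    """Illustrative logic: cast the weight only when the flag is on."""
    if fake_utils._ENABLE_NZ:
        return fake_utils.format_cast(weight)
    return weight


class TestFlagPatch(unittest.TestCase):

    # patch() with an explicit value replaces the attribute and does
    # NOT inject an extra argument into the test method.
    @patch("fake_utils._ENABLE_NZ", False)
    # patch() without a value creates a MagicMock and passes it in.
    @patch("fake_utils.format_cast")
    def test_cast_skipped_when_flag_off(self, mock_cast):
        self.assertEqual(prepare_weight("w"), "w")
        mock_cast.assert_not_called()

    @patch("fake_utils.format_cast")
    def test_cast_called_when_flag_on(self, mock_cast):
        prepare_weight("w")
        mock_cast.assert_called_once_with("w")
```

Because the flag patch supplies its own value, adding it to an existing decorator stack leaves the test method's argument list unchanged, and the original flag value is restored when the test returns.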