[main][bugfix] bugfix for qwen3 moe quantization (#4599)
### What this PR does / why we need it?
Fix the issue where the qwen3 moe service cannot be started after upgrading the vLLM version.

Error info: `AttributeError: 'AscendFusedMoE' object has no attribute 'use_dp_chunking'`

### Does this PR introduce _any_ user-facing change?
no

- vLLM version: v0.11.2

---------

Signed-off-by: Wang Kunpeng <1289706727@qq.com>
```diff
@@ -408,11 +408,10 @@ class AscendFusedMoEMethod(FusedMoEMethodBase):
         quant_config: The Ascend quantization config.
     """
 
-    def __init__(self,
-                 quant_config: AscendQuantConfig,
-                 prefix: str,
-                 packed_modules_mapping: Dict[str, Any],
-                 layer: torch.nn.Module = None):
+    def __init__(self, quant_config: AscendQuantConfig, prefix: str,
+                 packed_modules_mapping: Dict[str,
+                                              Any], layer: torch.nn.Module):
+        super().__init__(layer.moe_config)
         self.quant_method = get_quant_method(quant_config.quant_description,
                                              prefix,
                                              "moe",
```
||||
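The failure mode this diff fixes can be sketched in isolation: when a subclass skips the base-class `__init__`, attributes the base class would have set (such as the `use_dp_chunking` flag introduced by the vLLM upgrade) never exist, so base-class code that reads them raises `AttributeError`. The class and attribute names below are simplified stand-ins for `AscendFusedMoE`/`FusedMoEMethodBase`, not the real vLLM classes.

```python
# Minimal sketch (hypothetical names) of the bug: a subclass that never
# calls super().__init__() lacks the attributes the base class sets.

class FusedMoEBase:
    def __init__(self, moe_config):
        self.moe_config = moe_config
        self.use_dp_chunking = False  # attribute added in a newer base-class version

    def plan(self):
        # Base-class code that reads the new attribute.
        return "chunked" if self.use_dp_chunking else "dense"


class BrokenFusedMoE(FusedMoEBase):
    def __init__(self, moe_config):
        # Bug: super().__init__() is skipped, so use_dp_chunking is never set.
        self.moe_config = moe_config


class FixedFusedMoE(FusedMoEBase):
    def __init__(self, moe_config):
        # Fix (the pattern in the diff): forward the config to the base class.
        super().__init__(moe_config)


if __name__ == "__main__":
    try:
        BrokenFusedMoE({}).plan()
    except AttributeError as e:
        print("broken:", e)  # ... object has no attribute 'use_dp_chunking'

    print("fixed:", FixedFusedMoE({}).plan())
```

Making `layer` a required parameter (dropping the `= None` default) in the real patch serves the same goal: it guarantees `layer.moe_config` is available when `super().__init__()` runs.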