Files
xc-llm-ascend/vllm_ascend/quantization
zhangxinyuehfad 0d094531b4 [bugfix] Fixed the bug in retrieving the quantization method for mlp.… (#4797)
When retrieving the quantization method for MOE (e.g., the quantization
file of DeepSeek v3.2 exp do not match the model's naming convention in
eager mode), a KeyError is raised: "model.layers.3.mlp.experts.weight
not in self.quant_description". However the quantization file is like :
```bash
  "model.layers.3.mlp.experts.255.gate_proj.weight": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.gate_proj.weight_scale": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.gate_proj.weight_offset": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.down_proj.weight": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.down_proj.weight_scale": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.down_proj.weight_offset": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.up_proj.weight": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.up_proj.weight_scale": "W8A8_DYNAMIC",
  "model.layers.3.mlp.experts.255.up_proj.weight_offset": "W8A8_DYNAMIC",
```

Co-Authored-By: yangqinghao-cmss <yangqinghao_yewu@cmss.chinamobile.com>

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: yangqinghao-cmss <yangqinghao_yewu@cmss.chinamobile.com>
2025-12-09 08:47:19 +08:00
..