[v0.18.0][BugFix] fix the weightsmapper bug of qwen3-vl (#7868)
### What this PR does / why we need it?
This PR fixes a weight-loading error in the Qwen3-VL model. The bug was introduced by vLLM: in vLLM's `qwen3_vl.py`, the prefix of the `lm_head` layer is hardcoded as `"lm_head"`, but `hf_to_vllm_mapper` remaps the `lm_head` weight name from `lm_head` to `language_model.lm_head`. This causes a mismatch between the keys in the weight file and the prefix of the `lm_head` layer, resulting in an error.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
- [x] Run the Qwen3-VL dense model with the fusion operator and verify correct output.

Signed-off-by: betta18 <jiangmengyu1@huawei.com>
Co-authored-by: betta18 <jiangmengyu1@huawei.com>
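For context, here is a minimal sketch of the mismatch. `remap_prefix` and the mapping dict below are illustrative simplifications, not vLLM's actual `WeightsMapper` API:

```python
# Hypothetical simplification of prefix remapping; vLLM's real WeightsMapper
# lives in vllm.model_executor.models.utils and works on the same principle.
def remap_prefix(name: str, orig_to_new_prefix: dict[str, str]) -> str:
    """Rewrite a checkpoint weight name whose prefix matches a mapping entry."""
    for old, new in orig_to_new_prefix.items():
        if name.startswith(old):
            return new + name[len(old):]
    return name

# For qwen3-vl, checkpoint keys such as "lm_head.weight" get remapped:
mapped = remap_prefix("lm_head.weight", {"lm_head.": "language_model.lm_head."})
assert mapped == "language_model.lm_head.weight"

# But the lm_head layer is constructed with the hardcoded prefix "lm_head",
# so a quantization config that looks up layers by prefix never sees the
# remapped key. The patch in this PR aligns the two names:
prefix = "lm_head"
if prefix == "lm_head":
    prefix = "language_model.lm_head"
assert mapped.startswith(prefix)
```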
@@ -455,6 +455,10 @@ class AscendModelSlimConfig(QuantizationConfig):
         parts = parts[: exp_idx + 1]
         prefix = ".".join(parts)
 
+        # TODO: remove it when vllm fixes the WeightsMapper bug of qwen3-vl.
+        if model_type in ["qwen3_vl"] and prefix == "lm_head":
+            prefix = "language_model.lm_head"
+
         if model_type in packed_modules_model_mapping:
             self.packed_modules_mapping = packed_modules_model_mapping[model_type]
         prefix = self.quant_prefix_mapper(model_type, prefix)