[BugFix] Fix quantized weight loading for the Qwen3VL dense model (#6100)

### What this PR does / why we need it?
Fix an error when loading weights for the quantized Qwen3VL dense model.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
The Qwen3VL quantized model service initialized successfully. Inference requests are processed correctly, and valid responses are returned.

- vLLM version: v0.13.0
- vLLM main: d68209402d

Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
2026-01-22 20:05:25 +08:00
parent 7f91ac2649
commit 176bfc36bc
1 changed files with 5 additions and 0 deletions
--- a/vllm_ascend/quantization/quant_config.py
+++ b/vllm_ascend/quantization/quant_config.py
@@ -210,6 +210,11 @@ QUANT_MODEL_PREFIX_MAPPINGS = {
        "language_model.lm_head.": "lm_head.",
        "language_model.model.": "model.language_model.",
    },
+    "qwen3_vl_text": {
+        "visual.": "model.visual.",
+        "language_model.lm_head.": "lm_head.",
+        "language_model.model.": "model.language_model.",
+    },
 }

 packed_modules_model_mapping = {
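
Note on the new `"qwen3_vl_text"` entry: the sketch below illustrates how a prefix table like this could be applied to a parameter name. It assumes that keys are the original leading prefix and values are the replacement. The helper `remap_prefix` and the sample parameter names are hypothetical and are not taken from `quant_config.py`; only the three dictionary entries are copied from the diff.

```python
# Entries copied from the "qwen3_vl_text" block added in this diff.
QWEN3_VL_TEXT_PREFIXES = {
    "visual.": "model.visual.",
    "language_model.lm_head.": "lm_head.",
    "language_model.model.": "model.language_model.",
}


def remap_prefix(name: str, mapping: dict[str, str]) -> str:
    """Hypothetical helper: rewrite the first matching leading prefix.

    Assumption: keys are original prefixes, values are their replacements.
    Names that match no key are returned unchanged.
    """
    for old, new in mapping.items():
        if name.startswith(old):
            return new + name[len(old):]
    return name


if __name__ == "__main__":
    # Illustrative parameter names, not read from a real checkpoint.
    for sample in (
        "visual.blocks.0.attn.qkv.weight",
        "language_model.lm_head.weight",
        "language_model.model.layers.0.mlp.down_proj.weight",
    ):
        print(sample, "->", remap_prefix(sample, QWEN3_VL_TEXT_PREFIXES))
```

Under that assumption, a model type with no entry in the table would get no remapping at all, which is consistent with the dense model needing its own entry here.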