[0.11.0][Perf] Add padding vision tower for Qwen2_5_Omni (#4041)
### What this PR does / why we need it?
This PR replaces the vision tower in the Qwen2.5-Omni-Thinker model, Qwen2_5_VisionTransformer, with AscendQwen2_5_VisionTransformer, which uses QKV padding for better performance.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: Ting FU <futing10@huawei.com>
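The QKV-padding idea behind this change can be illustrated with a minimal sketch. Note that the function names and the alignment value of 128 below are illustrative assumptions, not the actual vllm-ascend implementation:

```python
# Illustrative sketch of QKV padding (not the actual vllm-ascend code).
# NPU attention kernels tend to run fastest when the per-head hidden size
# is aligned to a hardware-friendly multiple, so a padded vision tower can
# widen the fused QKV projection and slice the padding off after attention.

def pad_to_multiple(dim: int, align: int) -> int:
    """Round `dim` up to the nearest multiple of `align`."""
    return (dim + align - 1) // align * align

def padded_qkv_width(num_heads: int, head_dim: int, align: int = 128) -> int:
    """Width of the fused Q/K/V projection after padding each head."""
    padded_head = pad_to_multiple(head_dim, align)
    return 3 * num_heads * padded_head  # Q, K and V share one projection

# Example: heads of size 80 are padded up to 128-wide heads.
print(pad_to_multiple(80, 128))   # 128
print(padded_qkv_width(16, 80))   # 3 * 16 * 128 = 6144
```

The extra padded columns carry zeros, so the attention result over the original dimensions is unchanged while the kernel operates on aligned shapes.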
```diff
@@ -23,6 +23,10 @@ def register_model():
         "Qwen2_5_VLForConditionalGeneration",
         "vllm_ascend.models.qwen2_5_vl:AscendQwen2_5_VLForConditionalGeneration"
     )
+    ModelRegistry.register_model(
+        "Qwen2_5OmniModel",
+        "vllm_ascend.models.qwen2_5_omni_thinker:AscendQwen2_5OmniThinkerForConditionalGeneration"
+    )
 else:
     ModelRegistry.register_model(
         "Qwen2_5_VLForConditionalGeneration",
```