[Bugfix] Correctly handle the output shape in multimodal attention (#5443)
### What this PR does / why we need it?
Fixes https://github.com/vllm-project/vllm-ascend/issues/5297: in the `AscendMMEncoderAttention` forward pass, the output shape should stay consistent with the input shape (a rough sketch of the idea follows below).
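A minimal sketch of the idea, not the actual patch: the forward pass records the query's original shape and reshapes the attention output back to it before returning. The attribute names (`num_heads`, `head_dim`) and the `_run_attention` helper are assumptions for illustration; the real module uses the Ascend NPU attention kernel.

```python
import torch


class AscendMMEncoderAttentionSketch(torch.nn.Module):
    """Illustrative only: shows the output-shape handling, not the real op."""

    def __init__(self, num_heads: int, head_dim: int) -> None:
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = head_dim

    def _run_attention(self, q, k, v):
        # Placeholder for the actual Ascend attention kernel call.
        return torch.nn.functional.scaled_dot_product_attention(
            q.transpose(0, 1), k.transpose(0, 1), v.transpose(0, 1)
        ).transpose(0, 1)

    def forward(self, query, key, value):
        # Remember the caller's layout, e.g. (num_tokens, hidden_size).
        orig_shape = query.shape

        # Split the hidden dimension into heads for the attention computation.
        q = query.view(-1, self.num_heads, self.head_dim)
        k = key.view(-1, self.num_heads, self.head_dim)
        v = value.view(-1, self.num_heads, self.head_dim)

        out = self._run_attention(q, k, v)

        # The point of the fix: return the output in the same shape as the
        # input instead of leaving it in the flattened per-head layout.
        return out.reshape(orig_shape)
```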
- vLLM version: release/v0.13.0
- vLLM main:
81786c8774
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
```diff
@@ -781,6 +781,11 @@ PROMPT_CONFIGS = {
             "fps": 1,
         },
     },
+    "hunyuan-vl": {
+        "model": "Tencent-Hunyuan/HunyuanOCR",
+        "prompt_fn": hunyuan_prompt,
+        "mm_processor_kwargs": {},
+    },
 }
```
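For context, a hypothetical sketch of how such a `PROMPT_CONFIGS` entry might be consumed; the field meanings are assumptions based on the entry's keys, and this driver code is not part of the commit.

```python
# Hypothetical usage sketch for the new "hunyuan-vl" entry.
config = PROMPT_CONFIGS["hunyuan-vl"]
model_name = config["model"]                       # "Tencent-Hunyuan/HunyuanOCR"
build_prompt = config["prompt_fn"]                 # hunyuan_prompt
processor_kwargs = config["mm_processor_kwargs"]   # empty for this model
```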