xc-llm-ascend/vllm_ascend
zouyida2052 ba9714ccee Optimize qwen2_vl and qwen2_5_vl (#701)
### What this PR does / why we need it?
Optimize qwen2_vl and qwen2_5_vl.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested this PR on a 1080p picture with tp=1, bs=1 on Qwen2-VL and
Qwen2.5-VL. The duration of each FA op dropped from 11 ms to 9 ms,
a roughly 22% perf boost.
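
As a quick check of the arithmetic, the sketch below assumes 11 ms and 9 ms are the per-op durations before and after the change. The helper names are hypothetical and are not part of this PR. Read as a throughput gain the change is about 22%, and read as a latency reduction it is about 18%.

```python
def speedup_percent(before_ms: float, after_ms: float) -> float:
    """Throughput gain (in percent) when per-op time drops from before_ms to after_ms."""
    return (before_ms / after_ms - 1.0) * 100.0


def latency_reduction_percent(before_ms: float, after_ms: float) -> float:
    """Relative drop (in percent) in per-op time."""
    return (1.0 - after_ms / before_ms) * 100.0


# Durations quoted in the test description above (assumed to be per-op averages).
print(round(speedup_percent(11.0, 9.0), 1))            # 22.2
print(round(latency_reduction_percent(11.0, 9.0), 1))  # 18.2
```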

---------

Signed-off-by: zouyida2052 <zouyida@huawei.com>
Signed-off-by: zouyida2052 <zouyida2002@gmail.com>
Co-authored-by: zouyida2052 <zouyida@huawei.com>
2025-04-30 14:22:38 +08:00