8 Commits

Author SHA1 Message Date
JohnJan
cfdd45ed00 [Bug] Fix duplicate 'torch.' prefix in qwen-vl (#1986)
Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>

### What this PR does / why we need it?
Fix duplicate 'torch.' prefix in qwen2-vl and qwen2.5-vl

- vLLM version: v0.9.2
- vLLM main: dde295a934
2025-07-24 20:16:00 +08:00
JohnJan
fa76a9b7bb [Bug] Add prefix parameter to parent class initialization (#1934)
Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>

### What this PR does / why we need it?
Add prefix parameter to parent class initialization to avoid parameter
naming conflicts
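
As a rough illustration of why forwarding `prefix` matters, here is a minimal sketch in plain Python. The classes `BaseBlock` and `AscendBlock` and the `registry` dict are hypothetical stand-ins, not the actual vLLM or vllm-ascend classes. The assumption is that a parent constructor registers itself under its `prefix`, so instances that fall back to the default prefix would collide.

```python
# Hypothetical stand-ins; not the real vLLM / vllm-ascend classes.
registry: dict[str, object] = {}


class BaseBlock:
    def __init__(self, prefix: str = "") -> None:
        # Assumed behaviour: the parent registers itself under its prefix.
        if prefix in registry:
            raise ValueError(f"duplicate name: {prefix!r}")
        registry[prefix] = self
        self.prefix = prefix


class AscendBlock(BaseBlock):
    def __init__(self, prefix: str = "") -> None:
        # Forward the prefix; calling super().__init__() without it would
        # register every instance under the default "" and collide.
        super().__init__(prefix=prefix)


blocks = [AscendBlock(prefix=f"visual.blocks.{i}") for i in range(2)]
```

With the prefix forwarded, each instance ends up under its own name in `registry`; dropping the argument in `AscendBlock.__init__` would make the second construction raise `ValueError` in this sketch.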

### Does this PR introduce _any_ user-facing change?
No


- vLLM version: v0.9.2
- vLLM main: 32142b3c62
2025-07-24 10:28:40 +08:00
zouyida2052
ba9714ccee Optimize qwen2_vl and qwen2_5_vl (#701)
### What this PR does / why we need it?
Optimize qwen2_vl and qwen2_5_vl.

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
Tested this PR on a 1080p picture with tp=1, bs=1 on Qwen2-VL and
Qwen2.5-VL: the duration of each FA op dropped from 11 ms to 9 ms,
roughly a 22% perf boost.
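
The 22% figure is consistent with reading the boost as a speedup rather than as a reduction in latency. A quick check, assuming the 11 ms and 9 ms values above are per-op durations before and after the change (the variable names are illustrative):

```python
before_ms = 11.0  # per-op duration before the change (from the text above)
after_ms = 9.0    # per-op duration after the change

# Relative speedup: how much more work fits in the same time.
speedup_pct = (before_ms / after_ms - 1.0) * 100.0
# Relative latency reduction: how much shorter each op is.
latency_reduction_pct = (before_ms - after_ms) / before_ms * 100.0

print(f"speedup: {speedup_pct:.1f}%")
print(f"latency reduction: {latency_reduction_pct:.1f}%")
```

The two numbers differ (about 22% versus about 18%), so it helps to state which one a reported boost refers to.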

---------

Signed-off-by: zouyida2052 <zouyida@huawei.com>
Signed-off-by: zouyida2052 <zouyida2002@gmail.com>
Co-authored-by: zouyida2052 <zouyida@huawei.com>
2025-04-30 14:22:38 +08:00
Yikun Jiang
2e20797934 [BUILD] Upgrade torch-npu to 2.5.1 (#661)
### What this PR does / why we need it?
torch-npu 2.5.1 has been published:
https://pypi.org/project/torch-npu/2.5.1/
It's time to remove all torch-npu dev versions from the vllm-ascend code base.

### Does this PR introduce _any_ user-facing change?
Yes, using torch-npu 2.5.1

### How was this patch tested?
- [ ] CI passed
- [ ] Manual test
- [ ] Grep all `dev2025`

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-04-27 17:28:29 +08:00
hfadzxy
9935d45728 [CI]Add model basic accuracy test(Qwen2.5-0.5B-Instruct) (#460)
### What this PR does / why we need it?
Add a basic model accuracy test (Qwen2.5-0.5B-Instruct).

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-04-17 14:59:56 +08:00
BAI Fan
122505208f FastPatch: Optimized Patch Embedding for Qwen2VL (#345)
### What this PR does / why we need it?
We propose the FastPatch method, which optimizes the patch embedding
(Conv3D) for Qwen2VL.


### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
We tested it on a benchmark; the results are satisfactory and better
than the original patch_embed layer.


---------

Signed-off-by: baifanxxx <baifanxxx@gmail.com>
Signed-off-by: zouyida <zouyida@huawei.com>
Co-authored-by: zouyida <zouyida@huawei.com>
2025-03-26 14:28:20 +08:00
zouyida2002
12aa7115b5 bugfix for qwen2_vl (#301)
### What this PR does / why we need it?
This PR fixes an error during Qwen2_VL inference.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
We tested it on a benchmark; the results are satisfactory and equal to
those on GPU.
---------

Signed-off-by: zouyida <zouyida@huawei.com>
2025-03-12 08:39:50 +08:00
zouyida2002
faf8cd89cb register qwen2_vl to rewrite qwen2_vl forward (#241)
Add qwen2-vl Ascend implementation.

---------
Signed-off-by: zouyida <zouyida@huawei.com>
2025-03-07 15:41:47 +08:00