Commit Graph

18 Commits

Author SHA1 Message Date
Xinyuan Tong
cf9815ba69 [Refactor] Multimodal data processing for VLM (#6659)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-06-04 11:22:33 -07:00
Xinyuan Tong
681fdc264b Refactor vlm embedding routine to use precomputed feature (#6543)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-05-24 18:39:21 -07:00
Mick
01dd39bac1 refactor: minor refactors regarding multimodal processing (#6187) 2025-05-17 22:53:20 -07:00
Yury Sulsky
f19a9204cd Support precomputed multimodal features for Qwen-VL and Gemma3 models. (#6136)
Co-authored-by: Yury Sulsky <ysulsky@tesla.com>
2025-05-16 12:26:15 -07:00
Zhu Chen
fa7d7fd9e5 [Feature] Add FlashAttention3 as a backend for VisionAttention (#5764)
Co-authored-by: othame <chenzhu_912@zju.edu.cn>
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: Yi Zhang <1109276519@qq.com>
2025-05-08 10:01:19 -07:00
Mick
c998d04b46 vlm: enable radix cache for qwen-vl models (#5349)
Co-authored-by: Xinyuan Tong <justinning0323@outlook.com>
2025-04-23 20:35:05 -07:00
yhyang201
072df75354 Support for Qwen2.5-VL Model in bitsandbytes Format (#5003) 2025-04-14 02:03:40 -07:00
Mick
34ef6c8135 [VLM] Adopt fast image processor by default (#5065) 2025-04-11 21:46:58 -07:00
Mick
e53a0b3d5b [fix] fix mrope positions not picked up (#5265) 2025-04-11 01:29:45 -07:00
Mick
5cb552b1d4 refactor: multimodal data (#4754) 2025-03-31 09:57:51 -07:00
Mick
1e86457c90 model: Minicpmo (#3023) 2025-03-24 20:08:40 -07:00
Mick
11577cedb7 refactor: bug fixes and refactor for vlm (#4661) 2025-03-22 22:48:49 -07:00
Adarsh Shirawalmath
f8f9244a61 [Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 (#3984)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
2025-03-22 14:27:39 -07:00
Mick
d373a48c98 fix: second_per_grid_ts should be used to get mrope position (#3682) 2025-03-17 18:12:38 -07:00
Mick
ff2ce0b86f refactor: move image processors to separate files (#4229) 2025-03-11 12:35:35 -07:00
shimin
ac69885056 fix the input_ids is None error (#4144) 2025-03-10 01:38:37 -07:00
Qubitium-ModelCloud
56a724eba3 [QUANT] Add GPTQModel Dynamic Quantization + lm_head Quantization (#3790)
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
2025-03-05 01:11:00 -08:00
Mick
bcc213df61 Model: Support Qwen 2.5 vl (#3258) 2025-02-16 00:58:53 -08:00