sglang

Author	SHA1	Message	Date
Xinyuan Tong	cf9815ba69	[Refactor] Multimodal data processing for VLM (#6659) Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>	2025-06-04 11:22:33 -07:00
Xinyuan Tong	681fdc264b	Refactor vlm embedding routine to use precomputed feature (#6543) Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>	2025-05-24 18:39:21 -07:00
Mick	01dd39bac1	refactor: minor refactors regarding multimodal processing (#6187)	2025-05-17 22:53:20 -07:00
Yury Sulsky	f19a9204cd	Support precomputed multimodal features for Qwen-VL and Gemma3 models. (#6136) Co-authored-by: Yury Sulsky <ysulsky@tesla.com>	2025-05-16 12:26:15 -07:00
Zhu Chen	fa7d7fd9e5	[Feature] Add FlashAttention3 as a backend for VisionAttention (#5764) Co-authored-by: othame <chenzhu_912@zju.edu.cn> Co-authored-by: Mick <mickjagger19@icloud.com> Co-authored-by: Yi Zhang <1109276519@qq.com>	2025-05-08 10:01:19 -07:00
Mick	c998d04b46	vlm: enable radix cache for qwen-vl models (#5349) Co-authored-by: Xinyuan Tong <justinning0323@outlook.com>	2025-04-23 20:35:05 -07:00
yhyang201	072df75354	Support for Qwen2.5-VL Model in bitsandbytes Format (#5003)	2025-04-14 02:03:40 -07:00
Mick	34ef6c8135	[VLM] Adopt fast image processor by default (#5065)	2025-04-11 21:46:58 -07:00
Mick	e53a0b3d5b	[fix] fix mrope positions not picked up (#5265)	2025-04-11 01:29:45 -07:00
Mick	5cb552b1d4	refactor: multimodal data (#4754)	2025-03-31 09:57:51 -07:00
Mick	1e86457c90	model: Minicpmo (#3023)	2025-03-24 20:08:40 -07:00
Mick	11577cedb7	refactor: bug fixes and refactor for vlm (#4661)	2025-03-22 22:48:49 -07:00
Adarsh Shirawalmath	f8f9244a61	[Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 (#3984) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-03-22 14:27:39 -07:00
Mick	d373a48c98	fix: second_per_grid_ts should be used to get mrope position (#3682)	2025-03-17 18:12:38 -07:00
Mick	ff2ce0b86f	refactor: move image processors to separate files (#4229)	2025-03-11 12:35:35 -07:00
shimin	ac69885056	fix the input_ids is None error (#4144)	2025-03-10 01:38:37 -07:00
Qubitium-ModelCloud	56a724eba3	[QUANT] Add GPTQModel Dynamic Quantization + `lm_head` Quantization (#3790) Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>	2025-03-05 01:11:00 -08:00
Mick	bcc213df61	Model: Support Qwen 2.5 vl (#3258)	2025-02-16 00:58:53 -08:00

18 Commits