Commit Graph

21 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Yuhong Guo | e5afb88b1c | Support weight loading without mmap (#7469) | 2025-06-23 15:13:59 -07:00 |
| Baizhou Zhang | 2a5f0100e0 | Fix GGuf and add back test_gguf.py (#7067) | 2025-06-10 21:07:20 -07:00 |
| fzyzcjy | 73187152a4 | Reland tiny refactor DefaultModelLoader.Source (#6041) | 2025-05-17 17:11:20 -07:00 |
| fzyzcjy | 6450c1228c | Tiny refactor weight loading logic (#5232) | 2025-05-08 01:02:56 -07:00 |
| Lianmin Zheng | 693723d1f7 | Revert "Tiny refactor DefaultModelLoader.Source" (#5825) | 2025-04-28 01:18:57 -07:00 |
| fzyzcjy | 644ed409d1 | Tiny refactor DefaultModelLoader.Source (#5482) | 2025-04-28 00:35:51 -07:00 |
| ryang | bc24205b32 | Support BNB quantization for llama/mllama (#5038) (Co-authored-by: Yuhao Yang <yyh073@foxmail.com>) | 2025-04-15 18:00:31 -07:00 |
| yhyang201 | 072df75354 | Support for Qwen2.5-VL Model in bitsandbytes Format (#5003) | 2025-04-14 02:03:40 -07:00 |
| HandH1998 | 4065248214 | Support Llama4 fp8 inference (#5194) (Co-authored-by: laixinn <xielx@shanghaitech.edu.cn>, sleepcoo <sleepcoo@gmail.com>, zhyncs <me@zhyncs.com>) | 2025-04-09 20:14:34 +08:00 |
| inkcherry | 7ed77d6b9e | fix dummy-load deepseekv2 (#4535) | 2025-04-04 15:22:37 -07:00 |
| Juwan Yoo | 188105a21b | deps: lazy import optional dependencies gguf and torchvision (#4826) | 2025-03-27 14:35:36 -07:00 |
| huiwq1990 | 5cbd709ea1 | Fix: modelscope env comment (#4474) (Signed-off-by: huiwq1990 <huiwq1990@163.com>) | 2025-03-16 18:11:33 -07:00 |
| wangyu | 1ce4878d31 | feat(remote_model): support variable remote backend for model loader (#3964) (Signed-off-by: wangyu <wangyu.steph@bytedance.com>) | 2025-03-14 00:40:44 -07:00 |
| Lianmin Zheng | 45de89719c | Revert "[XPU][CPU] Enable the native path of DeepSeek" (#4367) | 2025-03-12 23:45:52 -07:00 |
| Meng, Hengyu | 71046fcd71 | [XPU][CPU] Enable the native path of DeepSeek (#4086) (Co-authored-by: Zhang, Liangang <liangang.zhang@intel.com>) | 2025-03-12 22:26:29 -07:00 |
| Lianmin Zheng | aa957102a9 | Simplify tests & Fix trtllm custom allreduce registration (#4252) | 2025-03-10 01:24:22 -07:00 |
| Mick | 583d6af71b | example: add vlm to token in & out example (#3941) (Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>) | 2025-03-04 22:18:26 -08:00 |
| Ke Wen | 862bcff833 | Support loading of larger models with on-the-fly quantization (#3061) | 2025-01-22 21:33:17 -08:00 |
| Yineng Zhang | 5dc54f1a62 | feat: remove vllm distributed (#2907) (Co-authored-by: Zhangyi <1109276519@qq.com>) | 2025-01-17 22:31:51 +08:00 |
| Sangchun Ha (Patrick) | 08effbff35 | Error occurs when loading the gemma model in bitsandbytes format. (#2557) | 2024-12-26 05:10:37 -08:00 |
| Yineng Zhang | 85e1a6f3aa | Update model_loader deps and qqq quantization deps (#2220) (#2318) (Co-authored-by: HandH1998 <1335248067@qq.com>) | 2024-12-02 23:22:13 +08:00 |