sglang

Author	SHA1	Message	Date
Qubitium-ModelCloud	56a724eba3	[QUANT] Add GPTQModel Dynamic Quantization + `lm_head` Quantization (#3790) Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>	2025-03-05 01:11:00 -08:00
Lianmin Zheng	641b7d0ae0	[Minor] Improve code style (#2422)	2024-12-09 06:30:35 -08:00
Yineng Zhang	85e1a6f3aa	Update model_loader deps and qqq quantization deps (#2220) (#2318) Co-authored-by: HandH1998 <1335248067@qq.com>	2024-12-02 23:22:13 +08:00
Xuehai Pan	62a4a339eb	docs: fix module docstrings and copyright headers (#2077)	2024-11-22 22:16:53 +08:00
Lianmin Zheng	59a5ba9be0	[Minor] Remove unused imports (#2006)	2024-11-11 15:36:14 -08:00
aqweteddy	4ade15dd32	Adjust reward model's score module and pooler module order for reducing computation (#1956)	2024-11-08 00:10:54 -08:00
Lianmin Zheng	2ce32db6fb	Let reward model take text inputs instead of message lists (#1907) Co-authored-by: Kyle Corbitt <kyle@corbt.com>	2024-11-03 13:27:12 -08:00
Byron Hsu	56503d9bc9	[1/N] Remove `CacheConfig` import in all model files (#1658)	2024-10-14 09:06:34 -07:00
Lianmin Zheng	36d5acfca5	Rename InputMetadata -> ForwardBatch (#1543)	2024-09-30 02:41:11 -07:00
Ying Sheng	9aa6553d2a	[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525)	2024-09-27 23:32:11 -07:00