sglang

Author	SHA1	Message	Date
Lianmin Zheng	03d5fbfd44	Release 0.4.1.post3 - upload the config.json to PyPI (#2647)	2024-12-29 14:25:53 -08:00
Yineng Zhang	3ccf566b0d	chore: bump v0.4.1.post2 (#2643)	2024-12-30 00:11:46 +08:00
Yineng Zhang	ef5b0ff90b	chore: bump v0.4.1.post1 (#2616)	2024-12-28 00:11:06 +08:00
HandH1998	6e5305158c	update sgl_moe_align_block_size usage (#2617)	2024-12-28 00:01:13 +08:00
yudian0504	531d6ea968	fix: package data missing (#2521)	2024-12-26 08:16:48 -08:00
Yineng Zhang	635a042623	docs: update deepseek v3 example (#2592)	2024-12-26 17:43:37 +08:00
Yineng Zhang	efc52f85e2	chore: bump v0.4.1 (#2582)	2024-12-26 07:14:51 +08:00
Yineng Zhang	60e2fdcf4f	use sgl-kernel moe_align_block_size (#2581) Co-authored-by: ispobock <ispobaoke@163.com> Co-authored-by: HandH1998 <1335248067@qq.com>	2024-12-26 06:29:08 +08:00
Yineng Zhang	8f4d04e540	chore: bump v0.4.0.post2 (#2525)	2024-12-21 21:16:34 +08:00
Jerry Zhang	feb2b768ba	Add integration with gemlite weight only quant (#2528)	2024-12-21 00:25:25 +08:00
Yineng Zhang	4b83db24f1	fix: continue to use flashinfer 0.1.6 temporarily (#2517)	2024-12-19 14:03:24 +08:00
Yineng Zhang	626a99ac13	chore: update ao v0.7.0 (#2447)	2024-12-11 04:44:28 -08:00
Lianmin Zheng	641b7d0ae0	[Minor] Improve code style (#2422)	2024-12-09 06:30:35 -08:00
SangBin Cho	1f09e84b9a	nit: Remove busy waiting on scheduler (#2382)	2024-12-08 01:06:15 -08:00
Yineng Zhang	aaac33fd8d	fix: update xgrammar v0.1.6 (#2390)	2024-12-07 21:09:16 +08:00
Lianmin Zheng	e5f227c0ee	Release v0.4.0.post1 (#2375)	2024-12-06 06:08:19 -08:00
Yineng Zhang	2db4469808	minor: limit the range of vllm versions (#2350)	2024-12-05 02:00:34 +08:00
Yineng Zhang	f8b0326934	chore: bump v0.4.0 (#2338)	2024-12-03 11:55:41 -08:00
Yineng Zhang	fae4e5e99a	chore: bump v0.3.6.post3 (#2259)	2024-11-30 01:41:16 +08:00
Lianmin Zheng	fed4c6946a	Release v0.3.6.post2 (#2214) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-11-27 03:35:30 -08:00
Yineng Zhang	bc1f6fda0d	fix: add cuda-python for xgrammar (#2199)	2024-11-26 17:24:18 +08:00
Lianmin Zheng	ac5a0f0488	Release v0.3.6.post1 (#2189)	2024-11-25 17:31:37 -08:00
Lianmin Zheng	1605ae121e	[CI] Minor fix for CI (#2187)	2024-11-25 16:38:43 -08:00
Yixin Dong	7f076c2ce6	Update XGrammar to the latest API (#2176) Co-authored-by: Ben Gitter <gitterbd@gmail.com>	2024-11-25 15:58:30 -08:00
Ankur Neog	865233e256	Add initial support for intel Gaudi accelerators (#2121)	2024-11-22 20:22:23 -08:00
Yineng Zhang	2797bc3422	fix: add xgrammar dependency (#2126)	2024-11-22 20:53:11 +08:00
Yineng Zhang	9a00e6f453	chore: bump v0.3.6 (#2120)	2024-11-22 19:27:30 +08:00
Lianmin Zheng	dfec7fca06	Rename sglang.bench_latency to sglang.bench_one_batch (#2118)	2024-11-21 20:07:48 -08:00
Yineng Zhang	766192610e	feat: update torch 2.5.1 (#2069)	2024-11-18 21:29:13 +08:00
Lianmin Zheng	c1f401fc58	Revert "chore: update torch v2.5.1" (#2063)	2024-11-17 15:29:38 -08:00
Yineng Zhang	3b878863f7	chore: update torch v2.5.1 (#1849)	2024-11-18 00:06:00 +08:00
Lianmin Zheng	32c9a7ec11	Release v0.3.5.post2 (#2046)	2024-11-15 06:54:00 -08:00
Lianmin Zheng	a10d530943	Fix outlines version (#2036)	2024-11-14 12:52:40 -08:00
Lianmin Zheng	f407fcf9ef	Release v0.3.5.post1 (#2022)	2024-11-13 10:27:12 -08:00
Yineng Zhang	b3523af8eb	fix: update pyzmq version (#1983)	2024-11-10 21:33:23 +08:00
Huanzhi (Hans) Mao	ed53ac84b4	Specify `zmq` Version Requirement (#1982)	2024-11-10 01:32:07 -08:00
Yudi Xue	95a4ed129a	Fix metrics (#1963)	2024-11-08 23:21:11 -08:00
Lianmin Zheng	a509552087	[minor] Improve code style and compatibility (#1961)	2024-11-08 02:19:41 -08:00
Lianmin Zheng	65859754f1	Release v0.3.5 (#1908)	2024-11-03 13:48:11 -08:00
HAI	d8e9d61f86	[Build, ROCm] Dockerfile.rocm for Instinct GPUs, with package updates (#1861)	2024-10-31 16:38:16 -07:00
Chayenne	539df95d2c	Imporve openai api documents (#1827) Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>	2024-10-30 00:39:41 -07:00
Lianmin Zheng	30643fed7f	Release v0.3.4.post2 (#1796) Co-authored-by: DarkSharpness <76582120+DarkSharpness@users.noreply.github.com>	2024-10-25 11:07:19 -07:00
Lianmin Zheng	1f26e8b8e4	Release v0.3.4.post1 (#1749)	2024-10-21 21:16:43 -07:00
Liangsheng Yin	94cde10920	Llama3.2 vision model support (#1551)	2024-10-21 15:01:21 -07:00
Yineng Zhang	8bee20f80b	Update vllm to 0.6.3 (#1711) (#1720) Co-authored-by: Ke Bao <ISPObaoke@163.com>	2024-10-19 20:45:41 -07:00
Lianmin Zheng	087257ea03	Release v0.3.4 (#1714)	2024-10-19 08:17:41 -07:00
Michael Feil	b0facb3316	add orjson for jsonresponse (#1688)	2024-10-16 18:14:30 -07:00
Ke Bao	d10b933a36	Fix srt dependency (#1685)	2024-10-16 08:21:20 -07:00
Zhang, Liangang	5d638c92f5	[Feature, Hardware] Enable SGLang on XPU GPUs via PyTorch (#1480)	2024-10-12 18:10:32 +00:00
Lianmin Zheng	00c7e6368b	Release v0.3.3.post1 (#1636)	2024-10-11 07:56:16 -07:00

142 Commits