sglang

Author	SHA1	Message	Date
Yineng Zhang	496dde8491	bump sgl-kernel 0.0.8 (#5089)	2025-04-05 14:28:04 -07:00
Yineng Zhang	d7954b7682	bump sgl-kernel v0.0.7 (#5046)	2025-04-03 13:38:13 -07:00
Yineng Zhang	6384d31776	bump sgl-kernel v0.0.6 (#4950)	2025-03-31 11:24:09 -07:00
Yineng Zhang	92941ce7b5	bump sgl-kernel 0.0.5.post4 (#4768)	2025-03-28 14:40:53 -07:00
Yineng Zhang	8bf6d7f406	support cmake for sgl-kernel (#4706) Co-authored-by: hebiao064 <hebiaobuaa@gmail.com> Co-authored-by: yinfan98 <1106310035@qq.com>	2025-03-27 01:42:28 -07:00
Yineng Zhang	988ab646ec	bump v0.0.5.post3 (#4520)	2025-03-17 13:05:59 -07:00
Lianmin Zheng	3db35c1af4	Release sgl-kernel v0.0.5.post2 (#4469)	2025-03-16 01:01:53 -07:00
Yineng Zhang	862fe52241	bump v0.0.5.post1 (#4437)	2025-03-14 15:00:26 -07:00
Yineng Zhang	4ff1264201	Update pyproject.toml	2025-03-13 02:16:51 -07:00
Yineng Zhang	2a4cbad8e9	bump 0.0.5 sgl-kernel (#4377)	2025-03-13 02:08:35 -07:00
Yineng Zhang	6e7239f912	release 0.0.4.post3 sgl-kernel (#4331)	2025-03-12 01:05:16 -07:00
Yineng Zhang	cd90945518	bump sgl-kernel 0.0.4.post2 (#4288)	2025-03-11 00:09:47 -07:00
Lianmin Zheng	1a5023e05d	Release sgl-kernel v0.0.4.post1 (#4255)	2025-03-10 02:39:50 -07:00
laixin	c553e1604c	DeepGemm integrate to sgl-kernel (#4165) Co-authored-by: sleepcoo <sleepcoo@gmail.com> Co-authored-by: HandH1998 <1335248067@qq.com> Co-authored-by: shuaills <shishuaiuoe@gmail.com> Co-authored-by: yinfan98 <1106310035@qq.com> Co-authored-by: Yineng Zhang <me@zhyncs.com>	2025-03-10 00:35:07 -07:00
Lianmin Zheng	eb06dbcbf8	Move rope and bmm into sgl-kernel (#4241)	2025-03-09 18:38:15 -07:00
Yineng Zhang	5c7dd14ba1	chore: bump v0.0.4 for sgl-kernel (#4223)	2025-03-08 23:01:59 -08:00
Lianmin Zheng	8abf74e3c9	Rename files in sgl kernel to avoid nested folder structure (#4213) Co-authored-by: zhyncs <me@zhyncs.com>	2025-03-08 22:54:51 -08:00
Yineng Zhang	96263f275c	chore: bump v0.0.3.post7 for sgl-kernel (#4176)	2025-03-07 01:15:34 -08:00
Chayenne	18bb216c28	Revert "[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)" (#3982)	2025-02-28 23:57:17 -08:00
yiakwy-xpu-ml-framework-team	1c96fa86cf	[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x) (#3613)	2025-02-27 19:42:48 -08:00
Yineng Zhang	e082142519	chore: bump 0.0.3.post6 sgl-kernel (#3555)	2025-02-14 08:55:15 +08:00
Yineng Zhang	4430c0a513	chore: bump 0.0.3.post5 sgl-kernel (#3530)	2025-02-13 01:51:46 +08:00
Yineng Zhang	b96e92e6e6	chore: bump 0.0.3.post4 sgl-kernel (#3523)	2025-02-12 17:28:36 +08:00
Yineng Zhang	6239d0b2e7	chore: bump sgl-kernel v0.0.3.post3 (#3440)	2025-02-10 04:00:52 +08:00
Yineng Zhang	f9905d59a8	support speculative decoding kernel in sgl-kernel (#3373) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2025-02-07 20:29:51 +08:00
Yineng Zhang	c38b5fb4f4	update 3rdparty and rms norm for sgl-kernel (#3213)	2025-01-30 19:32:21 +08:00
Yineng Zhang	8a96f74988	chore: bump 0.0.3 for sgl-kernel (#3178) Co-authored-by: ispobock <ispobaoke@hotmail.com> Co-authored-by: BBuf <35585791+BBuf@users.noreply.github.com> Co-authored-by: HandH1998 <007aabbcc411@gmail.com> Co-authored-by: yizhang2077 <1109276519@qq.com> Co-authored-by: ByronHsu <byronhsu1230@gmail.com>	2025-01-27 20:29:28 +08:00
Byron Hsu	514f37c32b	[kernel] Fix position ids in rope (#3173)	2025-01-27 17:09:51 +08:00
Byron Hsu	741fccd7bf	Bump sgl kernel to 0.0.2.post19 (#3167)	2025-01-27 15:36:07 +08:00
Yineng Zhang	318260c0fa	chore: bump 0.0.2.post18 for sgl-kernel (#3149)	2025-01-26 19:00:34 +08:00
Yineng Zhang	896c07441e	update installation doc for sgl-kernel (#3129)	2025-01-26 00:00:13 +08:00
Yineng Zhang	14e754a868	chore: bump v0.0.2.post17 for sgl-kernel (#3125)	2025-01-25 20:43:02 +08:00
Yineng Zhang	54bac8af0b	chore: bump sgl-kernel 0.0.2.post16 (#3087)	2025-01-24 01:57:48 +08:00
Yineng Zhang	1f6cf0d4b9	fix build error for sgl-kernel (#3078)	2025-01-23 19:16:35 +08:00
Lianmin Zheng	553f5a3ffe	Remove torch dependency in sgl-kernel (#3074)	2025-01-23 17:23:37 +08:00
Byron Hsu	b5caa22dfb	[kernel] port rope cuda kernel to sgl-kernel (#2993) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2025-01-20 20:58:51 +08:00
Yineng Zhang	a53454c55e	fix: sgl-kernel link cuda (#2906)	2025-01-16 04:53:23 +08:00
yizhang2077	6cb3974e77	optimize custom allreduce kernel (#2904)	2025-01-16 03:04:25 +08:00
Xiaoyu Zhang	e2b16c4716	add sampling_scaling_penalties kernel (#2846)	2025-01-12 19:38:17 -08:00
yizhang2077	3900a94afe	Support twoshot kernel (#2688)	2025-01-06 00:47:16 +08:00
HandH1998	77d1210b36	fix moe_align_block_size (#2615)	2024-12-27 23:32:53 +08:00
Yineng Zhang	2dccecf432	fix: only enable moe_align_block_size for now (#2590)	2024-12-26 16:56:59 +08:00
Yineng Zhang	d7c0e872b0	chore: bump 0.0.2.post8 for sgl-kernel (#2580)	2024-12-26 06:11:39 +08:00
Yineng Zhang	31548116a8	fix moe_align_block_size_kernel for shared memory issue (#2579) Co-authored-by: ispobock <ispobaoke@163.com>	2024-12-26 05:31:04 +08:00
yizhang2077	e04d3f2897	adapt tensorrt llm custom all reduce to sgl-kernel (#2481) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-12-15 13:15:59 +08:00
Yineng Zhang	2673fa29d4	fix: set runtime path (#2466)	2024-12-12 18:05:48 +08:00
Yineng Zhang	56fcd8e8a5	feat: support sgl-kernel PyPI (#2433) Co-authored-by: Zhangyi <1109276519@qq.com>	2024-12-11 06:06:19 +08:00
Yineng Zhang	47eb139f81	feat: use warp reduce as a simple example (#2304)	2024-12-01 22:43:50 +08:00
Yineng Zhang	5c91a315d7	feat: support sgl-kernel pypi (#2302)	2024-12-01 20:11:21 +08:00

49 Commits