sglang

Author	SHA1	Message	Date
Yineng Zhang	96263f275c	chore: bump v0.0.3.post7 for sgl-kernel (#4176)	2025-03-07 01:15:34 -08:00
Chayenne	18bb216c28	Revert "[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)" (#3982)	2025-02-28 23:57:17 -08:00
yiakwy-xpu-ml-framework-team	1c96fa86cf	[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x) (#3613)	2025-02-27 19:42:48 -08:00
Yineng Zhang	e082142519	chore: bump 0.0.3.post6 sgl-kernel (#3555)	2025-02-14 08:55:15 +08:00
Yineng Zhang	4430c0a513	chore: bump 0.0.3.post5 sgl-kernel (#3530)	2025-02-13 01:51:46 +08:00
Yineng Zhang	b96e92e6e6	chore: bump 0.0.3.post4 sgl-kernel (#3523)	2025-02-12 17:28:36 +08:00
Yineng Zhang	6239d0b2e7	chore: bump sgl-kernel v0.0.3.post3 (#3440)	2025-02-10 04:00:52 +08:00
Yineng Zhang	f9905d59a8	support speculative decoding kernel in sgl-kernel (#3373) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2025-02-07 20:29:51 +08:00
Yineng Zhang	c38b5fb4f4	update 3rdparty and rms norm for sgl-kernel (#3213)	2025-01-30 19:32:21 +08:00
Yineng Zhang	8a96f74988	chore: bump 0.0.3 for sgl-kernel (#3178) Co-authored-by: ispobock <ispobaoke@hotmail.com> Co-authored-by: BBuf <35585791+BBuf@users.noreply.github.com> Co-authored-by: HandH1998 <007aabbcc411@gmail.com> Co-authored-by: yizhang2077 <1109276519@qq.com> Co-authored-by: ByronHsu <byronhsu1230@gmail.com>	2025-01-27 20:29:28 +08:00
Byron Hsu	514f37c32b	[kernel] Fix position ids in rope (#3173)	2025-01-27 17:09:51 +08:00
Byron Hsu	741fccd7bf	Bump sgl kernel to 0.0.2.post19 (#3167)	2025-01-27 15:36:07 +08:00
Yineng Zhang	318260c0fa	chore: bump 0.0.2.post18 for sgl-kernel (#3149)	2025-01-26 19:00:34 +08:00
Yineng Zhang	896c07441e	update installation doc for sgl-kernel (#3129)	2025-01-26 00:00:13 +08:00
Yineng Zhang	14e754a868	chore: bump v0.0.2.post17 for sgl-kernel (#3125)	2025-01-25 20:43:02 +08:00
Yineng Zhang	54bac8af0b	chore: bump sgl-kernel 0.0.2.post16 (#3087)	2025-01-24 01:57:48 +08:00
Yineng Zhang	1f6cf0d4b9	fix build error for sgl-kernel (#3078)	2025-01-23 19:16:35 +08:00
Lianmin Zheng	553f5a3ffe	Remove torch dependency in sgl-kernel (#3074)	2025-01-23 17:23:37 +08:00
Byron Hsu	b5caa22dfb	[kernel] port rope cuda kernel to sgl-kernel (#2993) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2025-01-20 20:58:51 +08:00
Yineng Zhang	a53454c55e	fix: sgl-kernel link cuda (#2906)	2025-01-16 04:53:23 +08:00
yizhang2077	6cb3974e77	optimize custom allreduce kernel (#2904)	2025-01-16 03:04:25 +08:00
Xiaoyu Zhang	e2b16c4716	add sampling_scaling_penalties kernel (#2846)	2025-01-12 19:38:17 -08:00
yizhang2077	3900a94afe	Support twoshot kernel (#2688)	2025-01-06 00:47:16 +08:00
HandH1998	77d1210b36	fix moe_align_block_size (#2615)	2024-12-27 23:32:53 +08:00
Yineng Zhang	2dccecf432	fix: only enable moe_align_block_size for now (#2590)	2024-12-26 16:56:59 +08:00
Yineng Zhang	d7c0e872b0	chore: bump 0.0.2.post8 for sgl-kernel (#2580)	2024-12-26 06:11:39 +08:00
Yineng Zhang	31548116a8	fix moe_align_block_size_kernel for shared memory issue (#2579) Co-authored-by: ispobock <ispobaoke@163.com>	2024-12-26 05:31:04 +08:00
yizhang2077	e04d3f2897	adapt tensorrt llm custom all reduce to sgl-kernel (#2481) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-12-15 13:15:59 +08:00
Yineng Zhang	2673fa29d4	fix: set runtime path (#2466)	2024-12-12 18:05:48 +08:00
Yineng Zhang	56fcd8e8a5	feat: support sgl-kernel PyPI (#2433) Co-authored-by: Zhangyi <1109276519@qq.com>	2024-12-11 06:06:19 +08:00
Yineng Zhang	47eb139f81	feat: use warp reduce as a simple example (#2304)	2024-12-01 22:43:50 +08:00
Yineng Zhang	5c91a315d7	feat: support sgl-kernel pypi (#2302)	2024-12-01 20:11:21 +08:00

32 Commits