24 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Yineng Zhang | c38b5fb4f4 | update 3rdparty and rms norm for sgl-kernel (#3213) | 2025-01-30 19:32:21 +08:00 |
| Yineng Zhang | 8a96f74988 | chore: bump 0.0.3 for sgl-kernel (#3178); Co-authored-by: ispobock <ispobaoke@hotmail.com>; Co-authored-by: BBuf <35585791+BBuf@users.noreply.github.com>; Co-authored-by: HandH1998 <007aabbcc411@gmail.com>; Co-authored-by: yizhang2077 <1109276519@qq.com>; Co-authored-by: ByronHsu <byronhsu1230@gmail.com> | 2025-01-27 20:29:28 +08:00 |
| Byron Hsu | 514f37c32b | [kernel] Fix position ids in rope (#3173) | 2025-01-27 17:09:51 +08:00 |
| Byron Hsu | 741fccd7bf | Bump sgl kernel to 0.0.2.post19 (#3167) | 2025-01-27 15:36:07 +08:00 |
| Yineng Zhang | 318260c0fa | chore: bump 0.0.2.post18 for sgl-kernel (#3149) | 2025-01-26 19:00:34 +08:00 |
| Yineng Zhang | 896c07441e | update installation doc for sgl-kernel (#3129) | 2025-01-26 00:00:13 +08:00 |
| Yineng Zhang | 14e754a868 | chore: bump v0.0.2.post17 for sgl-kernel (#3125) | 2025-01-25 20:43:02 +08:00 |
| Yineng Zhang | 54bac8af0b | chore: bump sgl-kernel 0.0.2.post16 (#3087) | 2025-01-24 01:57:48 +08:00 |
| Yineng Zhang | 1f6cf0d4b9 | fix build error for sgl-kernel (#3078) | 2025-01-23 19:16:35 +08:00 |
| Lianmin Zheng | 553f5a3ffe | Remove torch dependency in sgl-kernel (#3074) | 2025-01-23 17:23:37 +08:00 |
| Byron Hsu | b5caa22dfb | [kernel] port rope cuda kernel to sgl-kernel (#2993); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2025-01-20 20:58:51 +08:00 |
| Yineng Zhang | a53454c55e | fix: sgl-kernel link cuda (#2906) | 2025-01-16 04:53:23 +08:00 |
| yizhang2077 | 6cb3974e77 | optimize custom allreduce kernel (#2904) | 2025-01-16 03:04:25 +08:00 |
| Xiaoyu Zhang | e2b16c4716 | add sampling_scaling_penalties kernel (#2846) | 2025-01-12 19:38:17 -08:00 |
| yizhang2077 | 3900a94afe | Support twoshot kernel (#2688) | 2025-01-06 00:47:16 +08:00 |
| HandH1998 | 77d1210b36 | fix moe_align_block_size (#2615) | 2024-12-27 23:32:53 +08:00 |
| Yineng Zhang | 2dccecf432 | fix: only enable moe_align_block_size for now (#2590) | 2024-12-26 16:56:59 +08:00 |
| Yineng Zhang | d7c0e872b0 | chore: bump 0.0.2.post8 for sgl-kernel (#2580) | 2024-12-26 06:11:39 +08:00 |
| Yineng Zhang | 31548116a8 | fix moe_align_block_size_kernel for shared memory issue (#2579); Co-authored-by: ispobock <ispobaoke@163.com> | 2024-12-26 05:31:04 +08:00 |
| yizhang2077 | e04d3f2897 | adapt tensorrt llm custom all reduce to sgl-kernel (#2481); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2024-12-15 13:15:59 +08:00 |
| Yineng Zhang | 2673fa29d4 | fix: set runtime path (#2466) | 2024-12-12 18:05:48 +08:00 |
| Yineng Zhang | 56fcd8e8a5 | feat: support sgl-kernel PyPI (#2433); Co-authored-by: Zhangyi <1109276519@qq.com> | 2024-12-11 06:06:19 +08:00 |
| Yineng Zhang | 47eb139f81 | feat: use warp reduce as a simple example (#2304) | 2024-12-01 22:43:50 +08:00 |
| Yineng Zhang | 5c91a315d7 | feat: support sgl-kernel pypi (#2302) | 2024-12-01 20:11:21 +08:00 |