549 Commits

Author SHA1 Message Date
Yineng Zhang
9f8f2c7f74 update norm cu (#3048) 2025-01-22 18:58:44 +08:00
Ke Bao
6fc37bd8ee Fix sgl-kernel compile for sm80 (#3046) 2025-01-22 16:49:08 +08:00
Ke Bao
0ac019f171 Support sm90 Int8 gemm (#3035) 2025-01-21 22:21:54 +08:00
Yineng Zhang
5a0d680a14 feat: add flashinfer as 3rdparty and use rmsnorm as example (#3033) 2025-01-21 20:44:49 +08:00
Yineng Zhang
ec1c21cdc4 upgrade torch version for sgl-kernel (#3026) 2025-01-21 14:32:08 +08:00
Yineng Zhang
6c856b4f3a minor: update Makefile for sgl-kernel (#3025) 2025-01-21 13:08:15 +08:00
Ke Bao
5dfcacfcb1 Add compile flags for cutlass 3.x (#3013) 2025-01-21 00:04:12 +08:00
Co-authored-by: HandH1998 <1335248067@qq.com>
Byron Hsu
b5caa22dfb [kernel] port rope cuda kernel to sgl-kernel (#2993) 2025-01-20 20:58:51 +08:00
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Yineng Zhang
a69cb5cff7 cleanup unused header in sgl_kernel (#2986) 2025-01-20 00:44:49 +08:00
Yineng Zhang
d33cbb7e58 remove cub and add cccl (#2976) 2025-01-19 15:51:27 +08:00
Yineng Zhang
e2cdc8a5b5 upgrade cutlass v3.7.0 (#2967) 2025-01-18 23:37:42 +08:00
lukec
6f98c586bd fix sgl-kernel setup.py (#2963) 2025-01-18 18:50:37 +08:00
Yineng Zhang
7596417732 minor: use bear for compilation database (#2919) 2025-01-16 18:39:11 +08:00
Yineng Zhang
2dc957d421 fix setup for sgl kernel (#2917) 2025-01-16 18:17:34 +08:00
Yineng Zhang
b7f3fec13c minor: rename bench for sgl kernel (#2909) 2025-01-16 05:55:43 +08:00
Yineng Zhang
a53454c55e fix: sgl-kernel link cuda (#2906) 2025-01-16 04:53:23 +08:00
yizhang2077
6cb3974e77 optimize custom allreduce kernel (#2904) 2025-01-16 03:04:25 +08:00
Xiaoyu Zhang
f005758f2b introduce CUB in sgl-kernel (#2887) 2025-01-14 19:48:59 +08:00
Xiaoyu Zhang
d08c77c434 Sampling penalties memory interface (#2870) 2025-01-13 23:09:00 +08:00
Xiaoyu Zhang
e2b16c4716 add sampling_scaling_penalties kernel (#2846) 2025-01-12 19:38:17 -08:00
Ke Bao
58f9060efe Update int8 gemm config (#2774) 2025-01-07 19:47:37 +08:00
Ke Bao
0f3eb1d294 Support cutlass Int8 gemm (#2752) 2025-01-06 22:51:22 +08:00
Ke Bao
06dd2eab84 Remove unused var in moe_align_kernel (#2751) 2025-01-06 22:13:28 +08:00
Ke Bao
439f65809f Fix sgl-kernel cu118 compile issue (#2750) 2025-01-06 21:59:31 +08:00
yizhang2077
3900a94afe Support twoshot kernel (#2688) 2025-01-06 00:47:16 +08:00
Xiaoyu Zhang
ded9fcd09a improve moe_align_kernel for deepseek v3 (#2735) 2025-01-06 00:28:22 +08:00
Yineng Zhang
b6b57fc200 minor: cleanup sgl-kernel (#2679) 2024-12-31 14:52:00 +08:00
Ke Bao
b4403985d0 Add cutlass submodule for sgl-kernel (#2676) 2024-12-31 14:28:29 +08:00
Ke Bao
b02da24a5b Refactor sgl-kernel build (#2642) 2024-12-30 18:07:01 +08:00
HandH1998
77d1210b36 fix moe_align_block_size (#2615) 2024-12-27 23:32:53 +08:00
Lianmin Zheng
dc3bee4815 Fix test and benchmark scripts (#2598) 2024-12-26 07:56:26 -08:00
Yineng Zhang
2dccecf432 fix: only enable moe_align_block_size for now (#2590) 2024-12-26 16:56:59 +08:00
Yineng Zhang
d7c0e872b0 chore: bump 0.0.2.post8 for sgl-kernel (#2580) 2024-12-26 06:11:39 +08:00
Yineng Zhang
31548116a8 fix moe_align_block_size_kernel for shared memory issue (#2579) 2024-12-26 05:31:04 +08:00
Co-authored-by: ispobock <ispobaoke@163.com>
Yineng Zhang
e8dbdf75bc fix typo (#2487) 2024-12-15 13:44:55 +08:00
yizhang2077
e04d3f2897 adapt tensorrt llm custom all reduce to sgl-kernel (#2481) 2024-12-15 13:15:59 +08:00
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Yineng Zhang
fccbfa3752 format: add clang-format for sgl-kernel (#2483) 2024-12-14 22:36:04 +08:00
Yineng Zhang
2673fa29d4 fix: set runtime path (#2466) 2024-12-12 18:05:48 +08:00
Yineng Zhang
dedaf8cd48 minor: update pypi tag (#2463) 2024-12-12 15:21:45 +08:00
Yineng Zhang
32ed016041 chore: bump v0.0.2 for sgl-kernel (#2462) 2024-12-12 14:58:05 +08:00
Yineng Zhang
7310aede97 fix: compatible with PEP 440 (#2435) 2024-12-11 06:48:45 +08:00
Yineng Zhang
5de9a58eca fix: use manylinux2014_x86_64 tag (#2434) 2024-12-11 06:17:41 +08:00
Yineng Zhang
56fcd8e8a5 feat: support sgl-kernel PyPI (#2433) 2024-12-11 06:06:19 +08:00
Co-authored-by: Zhangyi <1109276519@qq.com>
Yineng Zhang
28bc60dcab misc: update build setup (#2306) 2024-12-02 02:03:49 +08:00
Yineng Zhang
7301a39b13 fix: resolve CodeQL cpp issue (#2305) 2024-12-01 23:55:19 +08:00
Yineng Zhang
47eb139f81 feat: use warp reduce as a simple example (#2304) 2024-12-01 22:43:50 +08:00
Yineng Zhang
5c91a315d7 feat: support sgl-kernel pypi (#2302) 2024-12-01 20:11:21 +08:00
Lianmin Zheng
b53d6cbda3 Add new contributors so they can trigger CI automatically (#2269) 2024-11-29 16:37:52 -08:00
Co-authored-by: Qun Yang <qun.yang@intel.com>
Co-authored-by: zhengy001 <zhengy.gator@gmail.com>
Co-authored-by: HandH1998 <1335248067@qq.com>
Co-authored-by: xiaobo <xiaob.chen@outlook.com>
Yineng Zhang
419a57e771 minor: add sgl-kernel dir (#2261) 2024-11-30 02:27:35 +08:00