21 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Ke Bao | 5dfcacfcb1 | Add compile flags for cutlass 3.x (#3013) (Co-authored-by: HandH1998 <1335248067@qq.com>) | 2025-01-21 00:04:12 +08:00 |
| Byron Hsu | b5caa22dfb | [kernel] port rope cuda kernel to sgl-kernel (#2993) (Co-authored-by: Yineng Zhang <me@zhyncs.com>) | 2025-01-20 20:58:51 +08:00 |
| lukec | 6f98c586bd | fix sgl-kernel setup.py (#2963) | 2025-01-18 18:50:37 +08:00 |
| Yineng Zhang | 2dc957d421 | fix setup for sgl kernel (#2917) | 2025-01-16 18:17:34 +08:00 |
| Yineng Zhang | a53454c55e | fix: sgl-kernel link cuda (#2906) | 2025-01-16 04:53:23 +08:00 |
| yizhang2077 | 6cb3974e77 | optimize custom allreduce kernel (#2904) | 2025-01-16 03:04:25 +08:00 |
| Xiaoyu Zhang | e2b16c4716 | add sampling_scaling_penalties kernel (#2846) | 2025-01-12 19:38:17 -08:00 |
| Ke Bao | 0f3eb1d294 | Support cutlass Int8 gemm (#2752) | 2025-01-06 22:51:22 +08:00 |
| Yineng Zhang | b6b57fc200 | minor: cleanup sgl-kernel (#2679) | 2024-12-31 14:52:00 +08:00 |
| Ke Bao | b4403985d0 | Add cutlass submodule for sgl-kernel (#2676) | 2024-12-31 14:28:29 +08:00 |
| Ke Bao | b02da24a5b | Refactor sgl-kernel build (#2642) | 2024-12-30 18:07:01 +08:00 |
| Yineng Zhang | 31548116a8 | fix moe_align_block_size_kernel for shared memory issue (#2579) (Co-authored-by: ispobock <ispobaoke@163.com>) | 2024-12-26 05:31:04 +08:00 |
| yizhang2077 | e04d3f2897 | adapt tensorrt llm custom all reduce to sgl-kernel (#2481) (Co-authored-by: Yineng Zhang <me@zhyncs.com>) | 2024-12-15 13:15:59 +08:00 |
| Yineng Zhang | 2673fa29d4 | fix: set runtime path (#2466) | 2024-12-12 18:05:48 +08:00 |
| Yineng Zhang | dedaf8cd48 | minor: update pypi tag (#2463) | 2024-12-12 15:21:45 +08:00 |
| Yineng Zhang | 32ed016041 | chore: bump v0.0.2 for sgl-kernel (#2462) | 2024-12-12 14:58:05 +08:00 |
| Yineng Zhang | 7310aede97 | fix: compatible with PEP 440 (#2435) | 2024-12-11 06:48:45 +08:00 |
| Yineng Zhang | 5de9a58eca | fix: use manylinux2014_x86_64 tag (#2434) | 2024-12-11 06:17:41 +08:00 |
| Yineng Zhang | 56fcd8e8a5 | feat: support sgl-kernel PyPI (#2433) (Co-authored-by: Zhangyi <1109276519@qq.com>) | 2024-12-11 06:06:19 +08:00 |
| Yineng Zhang | 28bc60dcab | misc: update build setup (#2306) | 2024-12-02 02:03:49 +08:00 |
| Yineng Zhang | 47eb139f81 | feat: use warp reduce as a simple example (#2304) | 2024-12-01 22:43:50 +08:00 |