24 Commits

Author SHA1 Message Date
Yineng Zhang
4814ecaff9 cleanup sgl-kernel (#4933) 2025-03-30 14:12:30 -07:00
Yineng Zhang
8bf6d7f406 support cmake for sgl-kernel (#4706)
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
Co-authored-by: yinfan98 <1106310035@qq.com>
2025-03-27 01:42:28 -07:00
strgrb
f9c53cbb42 Create col-major and tma-aligned x_scale for deep_gemm.gemm_fp8_fp8_bf16_nt (#4515)
Co-authored-by: Zhang Kaihong <zhangkaihong.zkh@alibaba-inc.com>
2025-03-19 00:02:43 -07:00
Yineng Zhang
bde24ab31f update deepgemm (#4284) 2025-03-10 23:39:57 -07:00
Elfie Guo
bf2eefc0c7 Uupdate cutalss dependency for its bug fix (#4277) 2025-03-10 17:00:05 -07:00
laixin
c553e1604c DeepGemm integrate to sgl-kernel (#4165)
Co-authored-by: sleepcoo <sleepcoo@gmail.com>
Co-authored-by: HandH1998 <1335248067@qq.com>
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
Co-authored-by: yinfan98 <1106310035@qq.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-03-10 00:35:07 -07:00
Yineng Zhang
df84ab2a5b update sgl-kernel 3rdparty (#4228) 2025-03-09 01:16:05 -08:00
Elfie Guo
9e74ee91da Update cutlass dependency (#3966) 2025-02-28 16:16:31 -08:00
Yineng Zhang
7876279ea7 update cutlass dependency (#3240) 2025-02-01 03:13:44 +08:00
Yineng Zhang
3ee62235c6 revert the MoE dependence (#3230) 2025-01-31 16:51:41 +08:00
Yineng Zhang
9602c2aac7 keep the parts needed for moe_kernels (#3218) 2025-01-31 00:39:47 +08:00
Yineng Zhang
e81d7f11de add tensorrt_llm moe_gemm as 3rdparty (#3217) 2025-01-30 23:49:14 +08:00
Yineng Zhang
222ce6f1da add tensorrt_llm common and cutlass_extensions as 3rdparty (#3216)
Co-authored-by: BBuf <35585791+BBuf@users.noreply.github.com>
2025-01-30 23:04:41 +08:00
Yineng Zhang
c38b5fb4f4 update 3rdparty and rms norm for sgl-kernel (#3213) 2025-01-30 19:32:21 +08:00
Byron Hsu
fb11a43981 [kernel] Integrate flashinfer's rope with higher precision and better perf (#3134) 2025-01-27 15:28:00 +08:00
Yineng Zhang
14e754a868 chore: bump v0.0.2.post17 for sgl-kernel (#3125) 2025-01-25 20:43:02 +08:00
Yineng Zhang
153b414e83 minor: sync flashinfer and add turbomind as 3rdparty (#3105) 2025-01-24 19:22:39 +08:00
Yineng Zhang
0da0989ad4 sync flashinfer and update sgl-kernel tests (#3081) 2025-01-23 21:13:55 +08:00
Yineng Zhang
bcda0c9ee6 sync the upstream updates of flashinfer (#3051) 2025-01-22 20:33:13 +08:00
Yineng Zhang
5a0d680a14 feat: add flashinfer as 3rdparty and use rmsnorm as example (#3033) 2025-01-21 20:44:49 +08:00
Yineng Zhang
d33cbb7e58 remove cub and add cccl (#2976) 2025-01-19 15:51:27 +08:00
Yineng Zhang
e2cdc8a5b5 upgrade cutlass v3.7.0 (#2967) 2025-01-18 23:37:42 +08:00
Xiaoyu Zhang
f005758f2b introduce CUB in sgl-kernel (#2887) 2025-01-14 19:48:59 +08:00
Ke Bao
b4403985d0 Add cutlass submodule for sgl-kernel (#2676) 2024-12-31 14:28:29 +08:00