Yineng Zhang
|
4814ecaff9
|
cleanup sgl-kernel (#4933)
|
2025-03-30 14:12:30 -07:00 |
|
Yineng Zhang
|
8bf6d7f406
|
support cmake for sgl-kernel (#4706)
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
Co-authored-by: yinfan98 <1106310035@qq.com>
|
2025-03-27 01:42:28 -07:00 |
|
strgrb
|
f9c53cbb42
|
Create col-major and tma-aligned x_scale for deep_gemm.gemm_fp8_fp8_bf16_nt (#4515)
Co-authored-by: Zhang Kaihong <zhangkaihong.zkh@alibaba-inc.com>
|
2025-03-19 00:02:43 -07:00 |
|
Yineng Zhang
|
bde24ab31f
|
update deepgemm (#4284)
|
2025-03-10 23:39:57 -07:00 |
|
Elfie Guo
|
bf2eefc0c7
|
Update cutlass dependency for its bug fix (#4277)
|
2025-03-10 17:00:05 -07:00 |
|
laixin
|
c553e1604c
|
DeepGemm integrate to sgl-kernel (#4165)
Co-authored-by: sleepcoo <sleepcoo@gmail.com>
Co-authored-by: HandH1998 <1335248067@qq.com>
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
Co-authored-by: yinfan98 <1106310035@qq.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2025-03-10 00:35:07 -07:00 |
|
Yineng Zhang
|
df84ab2a5b
|
update sgl-kernel 3rdparty (#4228)
|
2025-03-09 01:16:05 -08:00 |
|
Elfie Guo
|
9e74ee91da
|
Update cutlass dependency (#3966)
|
2025-02-28 16:16:31 -08:00 |
|
Yineng Zhang
|
7876279ea7
|
update cutlass dependency (#3240)
|
2025-02-01 03:13:44 +08:00 |
|
Yineng Zhang
|
3ee62235c6
|
revert the MoE dependence (#3230)
|
2025-01-31 16:51:41 +08:00 |
|
Yineng Zhang
|
9602c2aac7
|
keep the parts needed for moe_kernels (#3218)
|
2025-01-31 00:39:47 +08:00 |
|
Yineng Zhang
|
e81d7f11de
|
add tensorrt_llm moe_gemm as 3rdparty (#3217)
|
2025-01-30 23:49:14 +08:00 |
|
Yineng Zhang
|
222ce6f1da
|
add tensorrt_llm common and cutlass_extensions as 3rdparty (#3216)
Co-authored-by: BBuf <35585791+BBuf@users.noreply.github.com>
|
2025-01-30 23:04:41 +08:00 |
|
Yineng Zhang
|
c38b5fb4f4
|
update 3rdparty and rms norm for sgl-kernel (#3213)
|
2025-01-30 19:32:21 +08:00 |
|
Byron Hsu
|
fb11a43981
|
[kernel] Integrate flashinfer's rope with higher precision and better perf (#3134)
|
2025-01-27 15:28:00 +08:00 |
|
Yineng Zhang
|
14e754a868
|
chore: bump v0.0.2.post17 for sgl-kernel (#3125)
|
2025-01-25 20:43:02 +08:00 |
|
Yineng Zhang
|
153b414e83
|
minor: sync flashinfer and add turbomind as 3rdparty (#3105)
|
2025-01-24 19:22:39 +08:00 |
|
Yineng Zhang
|
0da0989ad4
|
sync flashinfer and update sgl-kernel tests (#3081)
|
2025-01-23 21:13:55 +08:00 |
|
Yineng Zhang
|
bcda0c9ee6
|
sync the upstream updates of flashinfer (#3051)
|
2025-01-22 20:33:13 +08:00 |
|
Yineng Zhang
|
5a0d680a14
|
feat: add flashinfer as 3rdparty and use rmsnorm as example (#3033)
|
2025-01-21 20:44:49 +08:00 |
|
Yineng Zhang
|
d33cbb7e58
|
remove cub and add cccl (#2976)
|
2025-01-19 15:51:27 +08:00 |
|
Yineng Zhang
|
e2cdc8a5b5
|
upgrade cutlass v3.7.0 (#2967)
|
2025-01-18 23:37:42 +08:00 |
|
Xiaoyu Zhang
|
f005758f2b
|
introduce CUB in sgl-kernel (#2887)
|
2025-01-14 19:48:59 +08:00 |
|
Ke Bao
|
b4403985d0
|
Add cutlass submodule for sgl-kernel (#2676)
|
2024-12-31 14:28:29 +08:00 |
|