Commit Graph

10 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Trevor Morris | 0ab3f437ab | Cutlass MLA: Disable split kv due to https://github.com/NVIDIA/cutlass/issues/2274 (#6101) | 2025-05-08 18:44:30 -07:00 |
| PGFLMG | f6f96b0521 | [sgl-kernel] fix: fix cu118 compile error (#6123); Co-authored-by: zhyncs <me@zhyncs.com> | 2025-05-08 14:26:51 -07:00 |
| PGFLMG | ee71ed8a41 | [Feat] QWen-1M context support[1/2]: Update block sparse attention backend utils kernel (#5847); Co-authored-by: sighingnow <sighingnow@gmail.com> | 2025-04-28 11:03:17 -07:00 |
| DefTruth | 12ef7e3bc3 | bugfix: fix merge_state_v2 cuda graph (#5419) | 2025-04-15 10:18:47 -07:00 |
| DefTruth | 388e15c0db | kernel: support slightly faster merge_state_v2 cuda kernel (#5381) | 2025-04-14 21:28:23 -07:00 |
| Yineng Zhang | b62e7e99b8 | feat: adapt merge_state (#5337) | 2025-04-12 21:14:04 -07:00 |
| Yineng Zhang | 812e82f35e | fix: solve cu118 issue for cutlass mla (#5331) | 2025-04-12 12:51:09 -07:00 |
| Trevor Morris | f65b8d5c89 | Blackwell Cutlass MLA kernel (#5142) | 2025-04-11 22:16:51 -07:00 |
| Yineng Zhang | 8bf6d7f406 | support cmake for sgl-kernel (#4706); Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>; Co-authored-by: yinfan98 <1106310035@qq.com> | 2025-03-27 01:42:28 -07:00 |
| Lianmin Zheng | 8abf74e3c9 | Rename files in sgl kernel to avoid nested folder structure (#4213); Co-authored-by: zhyncs <me@zhyncs.com> | 2025-03-08 22:54:51 -08:00 |