Commit Graph

5 Commits

Author | SHA1 | Message | Date
Trevor Morris | f65b8d5c89 | Blackwell Cutlass MLA kernel (#5142) | 2025-04-11 22:16:51 -07:00
Yineng Zhang | 136b8e6afb | fix: remove cublas_grouped_gemm (#5307) | 2025-04-11 16:22:37 -07:00
Richard Zou | 76f44c2a8d | Fix deepseek-v3 with torch.compile in PyTorch 2.6. (#5213) | 2025-04-10 09:14:38 -07:00
Yi Zhang | bcbbf519f9 | sgl-kernel transfer custom allreduce from trt kernel to vllm kernel (#5079) | 2025-04-05 14:23:20 -07:00
yinfan98 | b8b6008f47 | [Fix] fix fa3 build at cu118 (#5036) | 2025-04-03 11:52:35 -07:00