Trevor Morris
|
f65b8d5c89
|
Blackwell Cutlass MLA kernel (#5142)
|
2025-04-11 22:16:51 -07:00 |
|
Yineng Zhang
|
136b8e6afb
|
fix: remove cublas_grouped_gemm (#5307)
|
2025-04-11 16:22:37 -07:00 |
|
Qingquan Song
|
45dcfc2e76
|
Add deepseek style fused moe group gate selection kernel (#4530)
|
2025-03-29 11:51:45 -07:00 |
|
Trevor Morris
|
e9f8e42318
|
Support FP4 gemm (1/2) (#3899)
|
2025-03-24 19:50:23 -07:00 |
|
Chunan Zeng
|
65c24c28f9
|
[Quant Kernel] refactored per token group quant fp8 to support int8 up-to 2x faster (#4396)
|
2025-03-23 23:44:17 -07:00 |
|
Ying Sheng
|
52a34d7448
|
Add greedy verification kernel (#4383)
|
2025-03-16 00:58:26 -07:00 |
|
Qingquan Song
|
61e4433caf
|
Add moe topk softmax templated from vllm (#4302)
|
2025-03-14 12:03:33 -07:00 |
|
Rex
|
07f944631e
|
Add awq dequantize kernel to sgl with 1x to 3x speedup (#4104)
|
2025-03-12 00:10:02 -07:00 |
|
Lianmin Zheng
|
8abf74e3c9
|
Rename files in sgl kernel to avoid nested folder structure (#4213)
Co-authored-by: zhyncs <me@zhyncs.com>
|
2025-03-08 22:54:51 -08:00 |
|