yinfan98 | b8b6008f47 | [Fix] fix fa3 build at cu118 (#5036) | 2025-04-03 11:52:35 -07:00 |
yinfan98 | 37c66ec856 | [feat] add fa3 in sgl-kernel (#4902); Co-authored-by: Sleepcoo <Sleepcoo@gmail.com> | 2025-03-30 12:57:10 -07:00 |
Qingquan Song | 45dcfc2e76 | Add deepseek style fused moe group gate selection kernel (#4530) | 2025-03-29 11:51:45 -07:00 |
Yineng Zhang | 8bf6d7f406 | support cmake for sgl-kernel (#4706); Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>; Co-authored-by: yinfan98 <1106310035@qq.com> | 2025-03-27 01:42:28 -07:00 |
Trevor Morris | e9f8e42318 | Support FP4 gemm (1/2) (#3899) | 2025-03-24 19:50:23 -07:00 |
Chunan Zeng | 65c24c28f9 | [Quant Kernel] refactored per token group quant fp8 to support int8 up-to 2x faster (#4396) | 2025-03-23 23:44:17 -07:00 |
Ying Sheng | 52a34d7448 | Add greedy verification kernel (#4383) | 2025-03-16 00:58:26 -07:00 |
Qingquan Song | 61e4433caf | Add moe topk softmax templated from vllm (#4302) | 2025-03-14 12:03:33 -07:00 |
Rex | 07f944631e | Add awq dequantize kernel to sgl with 1x to 3x speedup (#4104) | 2025-03-12 00:10:02 -07:00 |
Lianmin Zheng | 730d084f2a | Minor style fix for sgl-kernel (#4243) | 2025-03-09 20:15:13 -07:00 |
Lianmin Zheng | eb06dbcbf8 | Move rope and bmm into sgl-kernel (#4241) | 2025-03-09 18:38:15 -07:00 |
Lianmin Zheng | 8abf74e3c9 | Rename files in sgl kernel to avoid nested folder structure (#4213); Co-authored-by: zhyncs <me@zhyncs.com> | 2025-03-08 22:54:51 -08:00 |