fzyzcjy | e85cb1ce9d | Fix quant kernel test errors and benchmark wrong output speeds (#7604) | 2025-08-21 03:48:41 -07:00
fzyzcjy | e34cf6ad75 | Fix bench script making input data on L2 cache (#7739) | 2025-07-27 00:30:24 -07:00
applesaucethebun | 2ce8793519 | Add typo checker in pre-commit (#6179). Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-11 12:55:00 +08:00
Zhaoyi Li | 3c9740d200 | update variable naming and comments for rocm (#5299) | 2025-04-11 23:15:05 -07:00
Xiaoyu Zhang | 2c8fd99363 | [sgl-kernel] per token group quant support COLUMN MAJOR (#4817) | 2025-04-02 18:29:59 -07:00
Chunan Zeng | 65c24c28f9 | [Quant Kernel] refactored per token group quant fp8 to support int8 up-to 2x faster (#4396) | 2025-03-23 23:44:17 -07:00