7 Commits

Author               SHA1        Message                                                           Date
Stefan He            db7343c992  fix per token cuda kernel hidden dim cannot divide by 16 (#8543)  2025-08-01 09:27:18 -07:00
Zhaoyi Li            3c9740d200  update variable naming and comments for rocm (#5299)              2025-04-11 23:15:05 -07:00
yinfan98             d2e507df3c  [Misc] clean up vllm in sgl-kernel test (#5189)                   2025-04-09 01:22:13 -07:00
Adarsh Shirawalmath  9fccda3111  [Feature] use pytest for sgl-kernel (#4896)                       2025-03-30 10:36:52 -07:00
Yineng Zhang         2937387a50  fix accuracy issue (#4376)                                        2025-03-13 02:06:22 -07:00
Qingquan Song        4068e01292  Fix per token fp8 quant precision (#4362)                         2025-03-12 21:19:05 -07:00
Stefan He            63ee26d162  Add sgl_per_token_quant_fp8 (#4089)                               2025-03-06 20:53:05 -08:00