Chunan Zeng | 6a384d5c01 | Speed up per token and per tensor quant by 15% (#4639) | 2025-03-22 00:37:57 -07:00 |
Yineng Zhang | 2937387a50 | fix accuracy issue (#4376) | 2025-03-13 02:06:22 -07:00 |
Qingquan Song | 4068e01292 | Fix per token fp8 quant precision (#4362) | 2025-03-12 21:19:05 -07:00 |
Stefan He | e0917e6bd0 | Remove vllm ops scaled fp8 quant and accelerate per token quant by 20-28% (#4215); Co-authored-by: Stefan He <bhe@linkedin.com> | 2025-03-12 00:08:03 -07:00 |
Lianmin Zheng | 8abf74e3c9 | Rename files in sgl kernel to avoid nested folder structure (#4213); Co-authored-by: zhyncs <me@zhyncs.com> | 2025-03-08 22:54:51 -08:00 |