Qi Yuhang | 0f04a5f428 | Optimize cutlass int8 gemm kernel for large M on SM89 Ada GPU (#10714) | 2025-09-21 17:04:27 -07:00 |
triple-mu | 444013585d | Fix typos and unify size(s)/stride(s) API calls (#8799) | 2025-08-08 00:18:08 -07:00 |
Yi Pan | 45fdf1f7f3 | Fix shared memory OOM on sm86 GPUs. (#4797) | 2025-03-26 10:41:53 -07:00 |
Wenbo Yang | 75b656488a | Support serving DeepSeek-R1-Channel-INT8 with 32 L40S. (#4418) | 2025-03-17 00:03:43 -07:00 |
Lianmin Zheng | 8abf74e3c9 | Rename files in sgl kernel to avoid nested folder structure (#4213) Co-authored-by: zhyncs <me@zhyncs.com> | 2025-03-08 22:54:51 -08:00 |