Commit Graph

5 Commits

Author SHA1 Message Date
laixin
b0df5d240b Tuning Script for Feature DeepSeek V3/R1 INT8 Quantization (block-wise) (#3922)
Co-authored-by: sleepcoo <sleepcoo@gmail.com>
2025-02-27 10:59:46 +00:00
Xiaoyu Zhang
c38f3aed24 support multi-gpu block-gemm tuning (#3639) 2025-02-18 00:00:35 +08:00
yigex
fdf04a1426 [ROCm] Add ROCm tuning config to block gemm and Re-tune for AMD Radeon Graphics (#3418)
Co-authored-by: Bruce Xue <yigex@xilinx.com>
Co-authored-by: HAI <hixiao@gmail.com>
2025-02-10 23:55:04 -08:00
Yineng Zhang
7b020cca2d add tuning block wise fp8 (#3242)
Co-authored-by: HandH1998 <007aabbcc411@gmail.com>
2025-02-01 03:58:18 +08:00
Ke Bao
85b2e05770 Add int8 quant kernel (#2848) 2025-01-13 13:16:58 +08:00