5 Commits

Author SHA1 Message Date
hlu1
1e85589dc5 Make fp4_quantize kernels work on sm103 (#9807)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
2025-08-29 21:15:08 -07:00
Kaixi Hou
5c34b4f1c7 [NVIDIA] [2/N] Optimize silu_and_mul_scaled_fp4_grouped_quant perf (#9556)
2025-08-29 17:17:03 -07:00
Kaixi Hou
e5638573c1 [NVIDA] [1/N] Nvfp4 Masked Gemm: Add quant op for the flashinfer grouped gemm (#9200)
2025-08-22 12:19:45 -07:00
jy-song-hub
4fc09e0df0 Fp4 MOE quant kernel optimization (#8777)
Co-authored-by: Rain Jiang <96632942+rainj-me@users.noreply.github.com>
2025-08-15 01:46:16 -07:00
Pavani Majety
eb38c7d1ca [1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093)
Signed-off-by: Pavani Majety <pmajety@nvidia.com>
2025-06-02 13:48:03 -07:00