hlu1
|
1e85589dc5
|
Make fp4_quantize kernels work on sm103 (#9807)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
|
2025-08-29 21:15:08 -07:00 |
|
Kaixi Hou
|
5c34b4f1c7
|
[NVIDIA] [2/N] Optimize silu_and_mul_scaled_fp4_grouped_quant perf (#9556)
|
2025-08-29 17:17:03 -07:00 |
|
Kaixi Hou
|
e5638573c1
|
[NVIDA] [1/N] Nvfp4 Masked Gemm: Add quant op for the flashinfer grouped gemm (#9200)
|
2025-08-22 12:19:45 -07:00 |
|
jy-song-hub
|
4fc09e0df0
|
Fp4 MOE quant kernel optimization (#8777)
Co-authored-by: Rain Jiang <96632942+rainj-me@users.noreply.github.com>
|
2025-08-15 01:46:16 -07:00 |
|
Pavani Majety
|
eb38c7d1ca
|
[1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093)
Signed-off-by: Pavani Majety <pmajety@nvidia.com>
|
2025-06-02 13:48:03 -07:00 |
|