Commit Graph

4 Commits

Author SHA1 Message Date
Mick
c797322280 fix: fix apply_shuffle_mul_sum (#7444) 2025-07-04 23:23:30 -07:00
Elfie Guo
3e56f557fd Add a CUDA kernel for fusing mapping and weighted sum for MoE. (#6916)
Co-authored-by: Elfie Guo <elfiegxf@gmail.com>
2025-06-07 15:24:39 -07:00
Pavani Majety
eb38c7d1ca [1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093)
Signed-off-by: Pavani Majety <pmajety@nvidia.com>
2025-06-02 13:48:03 -07:00
Elfie Guo
6fc9357503 [2/2] Add python wrapper for CUTLASS FP8 Blockscale MoE Kernel. (#5694) 2025-05-16 13:14:07 -07:00