sglang/moe at 03680f33be3e533eba9fe45daafb76d394e19dec

EngineX-Hygon/sglang

Qi Yuhang fda4792620 Update CUTLASS 4.2 & Enable K-Major Scale Factor for SM90 FP8 Blockwise Group GEMM (#9559)

2025-08-24 23:24:43 -07:00

cutlass_moe/w4a8

[1/n]: add cutlass W4A8 moe kernel for hopper architecture (#7772)

2025-07-04 20:50:12 -07:00

marlin_moe_wna16

[2/n] decouple quantization implementation from vLLM dependency (#8112)

2025-08-14 03:19:03 -07:00

cutlass_moe_helper.cu

[Fix] Fix index oob in get_group_gemm_starts kernel. (#8564)

2025-07-30 19:49:35 -07:00

ep_moe_reorder_kernel.cu

[EP] Add cuda kernel for moe_ep_post_reorder (#6837)

2025-06-05 00:33:47 -07:00

ep_moe_silu_and_mul_kernel.cu

[sgl-kernel] Add cuda kernel for moe_ep_silu_and_mul (#6919)

2025-06-11 20:43:08 -07:00

fp8_blockwise_moe_kernel.cu

Update CUTLASS 4.2 & Enable K-Major Scale Factor for SM90 FP8 Blockwise Group GEMM (#9559)

2025-08-24 23:24:43 -07:00

moe_align_kernel.cu

[AMD] Reorganize hip-related header files in sgl-kernel (#9320)

2025-08-18 16:53:44 -07:00

moe_fused_gate.cu

[1/2][resubmit again] sgl-kernel: Fuse routed scaling factor into moe_fused_gate (#9088)

2025-08-12 20:12:38 -07:00

moe_topk_softmax_kernels.cu

[optimize] fuse renormalize into moe_topk_softmax (#7744)

2025-07-03 12:42:44 -07:00

nvfp4_blockwise_moe.cu

[1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093)

2025-06-02 13:48:03 -07:00

prepare_moe_input.cu

fix: fix apply_shuffle_mul_sum (#7444)

2025-07-04 23:23:30 -07:00