EngineX-Hygon/sglang
Commit: 87dab548243bf319d2a24fdf251738d0dcc7700a
Path: sglang/sgl-kernel/csrc/moe
Latest commit: 5aa1ebd242 by Peng Zhang, 2025-08-14 03:19:03 -07:00
[2/n]decouple quantization implementation from vLLM dependency (#8112)
Co-authored-by: walker-ai <yiyun.wyt@antgroup.com>
Co-authored-by: leoneo <1320612015@qq.com>
Name | Last commit | Date
cutlass_moe/w4a8 | [1/n]: add cutlass W4A8 moe kernel for hopper architecture (#7772) | 2025-07-04 20:50:12 -07:00
marlin_moe_wna16 | [2/n]decouple quantization implementation from vLLM dependency (#8112) | 2025-08-14 03:19:03 -07:00
cutlass_moe_helper.cu | [Fix]Fix index oob in get_group_gemm_starts kernel. (#8564) | 2025-07-30 19:49:35 -07:00
ep_moe_reorder_kernel.cu | [EP] Add cuda kernel for moe_ep_post_reorder (#6837) | 2025-06-05 00:33:47 -07:00
ep_moe_silu_and_mul_kernel.cu | [sgl-kernel] Add cuda kernel for moe_ep_silu_and_mul (#6919) | 2025-06-11 20:43:08 -07:00
fp8_blockwise_moe_kernel.cu | [Perf]Use Cooperative Schedule for H100 & H200 & H800 in fp8_blockwise_scaled_grouped_mm (#8722) | 2025-08-02 21:13:47 -07:00
moe_align_kernel.cu | update sgl-kernel for EP: kernel part (#8514) | 2025-07-30 22:19:55 -07:00
moe_fused_gate.cu | [1/2][resubmit again] sgl-kernel: Fuse routed scaling factor into moe_fused_gate (#9088) | 2025-08-12 20:12:38 -07:00
moe_topk_softmax_kernels.cu | [optimize] fuse renormalize into moe_topk_softmax (#7744) | 2025-07-03 12:42:44 -07:00
nvfp4_blockwise_moe.cu | [1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093) | 2025-06-02 13:48:03 -07:00
prepare_moe_input.cu | fix: fix apply_shuffle_mul_sum (#7444) | 2025-07-04 23:23:30 -07:00