sglang

Author	SHA1	Message	Date
Ke Bao	57ab776910	Fuse sorted_token_ids padding to moe_align_block_size kernel (#7437)	2025-06-24 17:44:27 -07:00
Yuan Luo	84727a5139	[sgl-kernel] Add cuda kernel for moe_ep_silu_and_mul (#6919) Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>	2025-06-11 20:43:08 -07:00
Elfie Guo	3e56f557fd	Add a CUDA kernel for fusing mapping and weighted sum for MoE. (#6916) Co-authored-by: Elfie Guo <elfiegxf@gmail.com>	2025-06-07 15:24:39 -07:00
Xiaoyu Zhang	8b5f83ed3b	reduce torch.zeros overhead in moe align block size kernel (#6369)	2025-06-07 02:47:36 -07:00
Yuan Luo	43baba649e	[EP] Add cuda kernel for moe_ep_post_reorder (#6837) Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>	2025-06-05 00:33:47 -07:00
Cheng Wan	81964328b7	Set `num_fused_shared_experts` as `num_shared_experts` when shared_experts fusion is not disabled (#6736)	2025-06-04 15:53:22 -07:00
Xiaoyu Zhang	bd75690f4e	fix ep_moe_reorder kernel bugs (#6858) Co-authored-by: JieXin Liang <Alcanderian@users.noreply.github.com>	2025-06-04 19:13:59 +08:00
Cheng Wan	8a5480528d	[Refactor] Rename `n_share_experts_fusion` as `num_fused_shared_experts` (#6735)	2025-06-03 17:48:24 -07:00
Pavani Majety	eb38c7d1ca	[1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093) Signed-off-by: Pavani Majety <pmajety@nvidia.com>	2025-06-02 13:48:03 -07:00
Yuan Luo	55444ed667	[EP] Add cuda kernel for moe_ep_pre_reorder (#6699) Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>	2025-06-01 20:49:01 -07:00
Elfie Guo	6fc9357503	[2/2] Add python wrapper for CUTLASS FP8 Blockscale MoE Kernel. (#5694)	2025-05-16 13:14:07 -07:00
Elfie Guo	e62c49557d	[1/2] Add FP8 Blockscale MoE CUTLASS kernel for Blackwell (#5281)	2025-04-22 22:28:20 -07:00
Xiaoyu Zhang	8e09b37077	Sgl kernel fused_moe_gate support n_shared_experts (#5440)	2025-04-17 23:05:15 -07:00
Xiaoyu Zhang	f730362ee2	reduce moe_align_block_size_kernel small batch mode overhead (#5086)	2025-04-09 17:59:35 -07:00
Qingquan Song	45dcfc2e76	Add deepseek style fused moe group gate selection kernel (#4530)	2025-03-29 11:51:45 -07:00
Yineng Zhang	8bf6d7f406	support cmake for sgl-kernel (#4706) Co-authored-by: hebiao064 <hebiaobuaa@gmail.com> Co-authored-by: yinfan98 <1106310035@qq.com>	2025-03-27 01:42:28 -07:00
Qingquan Song	61e4433caf	Add moe topk softmax templated from vllm (#4302)	2025-03-14 12:03:33 -07:00
Shi Shuai	817d43705c	feat: support ep size < 32 for sgl kernel (#4348)	2025-03-12 20:50:46 -07:00
Lianmin Zheng	8abf74e3c9	Rename files in sgl kernel to avoid nested folder structure (#4213) Co-authored-by: zhyncs <me@zhyncs.com>	2025-03-08 22:54:51 -08:00

19 Commits