Commit Graph

12 Commits

Author SHA1 Message Date
Cheng Wan
a5f5ab4030 update sgl-kernel for EP: kernel part (#8514) 2025-07-30 22:19:55 -07:00
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: Ke Bao <ispobaoke@gmail.com>
Ke Bao
5973675bc3 Fix moe align kernel test (#8531) 2025-07-29 11:03:02 -07:00
Ke Bao
57ab776910 Fuse sorted_token_ids padding to moe_align_block_size kernel (#7437) 2025-06-24 17:44:27 -07:00
Xiaoyu Zhang
f730362ee2 reduce moe_align_block_size_kernel small batch mode overhead (#5086) 2025-04-09 17:59:35 -07:00
Xiaoyu Zhang
924ca7c92c Add DeepSeek V3/R1 shared experts fusion (#4918) 2025-04-04 01:59:29 -07:00
lukec
b93ef5e56d Remove the vllm dependency from the moe_align function (#4164) 2025-03-07 22:42:16 -08:00
Co-authored-by: Hongbosherlock <hongbosherlock@gmail.com>
Chayenne
18bb216c28 Revert "[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)" (#3982) 2025-02-28 23:57:17 -08:00
yiakwy-xpu-ml-framework-team
1c96fa86cf [MOE] enable efficient moe_alignment multi-blocks execution (3x~6x) (#3613) 2025-02-27 19:42:48 -08:00
Xiaoyu Zhang
ad3499858e clean moe align block kernel code and add acc test (#3332) 2025-02-06 16:42:36 +08:00
Xiaoyu Zhang
d08c77c434 Sampling penalties memory interface (#2870) 2025-01-13 23:09:00 +08:00
HandH1998
77d1210b36 fix moe_align_block_size (#2615) 2024-12-27 23:32:53 +08:00
Yineng Zhang
31548116a8 fix moe_align_block_size_kernel for shared memory issue (#2579) 2024-12-26 05:31:04 +08:00
Co-authored-by: ispobock <ispobaoke@163.com>