11 Commits

Author SHA1 Message Date
Xiaoyu Zhang
fa3592cfeb rebase h20 fused_moe config (#6966) 2025-06-08 05:01:34 -07:00
Cheng Wan
8a5480528d [Refactor] Rename n_share_experts_fusion as num_fused_shared_experts (#6735) 2025-06-03 17:48:24 -07:00
Xiaoyu Zhang
076103535c fix log_info_on_rank0 error when run benchmark (#6260) 2025-05-28 00:20:01 -07:00
Lianmin Zheng
e8e18dcdcc Revert "fix some typos" (#6244) 2025-05-12 12:53:26 -07:00
applesaucethebun
d738ab52f8 fix some typos (#6209) 2025-05-13 01:42:38 +08:00
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Xiaoyu Zhang
1cc326032d simplify fused_moe config logging (#5801) 2025-04-28 17:04:54 -07:00
Xiaoyu Zhang
e132cba2a8 fused moe triton tuning script support qwen3 (#5842) 2025-04-28 09:13:04 -07:00
lambert0312
61e7c4dd21 Add A800 shared experts fused MoE kernel tuning configs for DeepSeek V3/R1 (#5368) 2025-04-14 18:39:44 -07:00
Xiaoyu Zhang
3e4794aad8 refine fused_moe tuning docs (#5294) 2025-04-12 10:01:13 -07:00
Xiaoyu Zhang
3844feb9bb Add a unittest for fused_moe (#2416) 2024-12-08 22:46:10 -08:00
Xiaoyu Zhang
262e370f78 [benchmark] Add fused_moe_triton benchmark and tuning tools (#2225) 2024-11-29 13:36:45 -08:00
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: HAI <hixiao@gmail.com>