Commit Graph

10 Commits

Author SHA1 Message Date
HAI
30828e7192 AMD: set weights and scaling numbers properly for block FP8 (#2637) 2024-12-29 03:23:39 -08:00
Xiaoyu Zhang
9254a33ad4 avoid fused_moe_triton padding circular import (#2624) 2024-12-28 14:01:35 +08:00
HandH1998
53aed988cb Refactor MoE (#2575) 2024-12-26 00:02:14 +08:00
Co-authored-by: zhyncs <me@zhyncs.com>
Ke Bao
e835a50021 Reorg moe code (#2563) 2024-12-24 01:10:22 +08:00
HAI
95f93f493a Fp8 MoE optimizations on AMD (#2388) 2024-12-07 21:18:26 +08:00
Yineng Zhang
d332aa3b0c fix: resolve fp8 moe issue (#2387) 2024-12-07 19:28:53 +08:00
Yineng Zhang
84d96b3ae5 Move FP8 to SGLang (#2370) 2024-12-06 15:42:10 +08:00
Co-authored-by: HaiShaw <hixiao@gmail.com>
Lianmin Zheng
fb1f28cbbb Clean up the comments and names under python/sglang/srt/layers (#1047) 2024-08-12 05:54:37 +00:00
Yineng Zhang
dd7e8b9421 chore: add copyright for srt (#790) 2024-07-28 23:07:12 +10:00
Ying Sheng
2d96da813e refactor model loader [unreachable code]: initial refactor (#655) 2024-07-19 09:27:06 -07:00