Author | Commit | Message | Date
--- | --- | --- | ---
hlu1 | 7a16db9bd9 | Make sm100 fp8 kernels available on sm103 (#9789) (Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>) | 2025-08-28 23:47:29 -07:00
Qi Yuhang | fda4792620 | Update CUTLASS 4.2 & Enable K-Major Scale Factor for SM90 FP8 Blockwise Group GEMM (#9559) | 2025-08-24 23:24:43 -07:00
kousakawang | 5fd311d33e | [code clean] add H20 cutlass groupGemm default config (#9333) (Co-authored-by: wanghanpei <wanghanpei@bytedance.com>) | 2025-08-21 19:23:29 -07:00
kousakawang | 0fc54b971e | [fix]: fix cutlass moe ut and Opt H20 cutlass groupGemm performance (#9272) (Co-authored-by: wanghanpei <wanghanpei@bytedance.com>) | 2025-08-17 13:09:49 -07:00
Qi Yuhang | d9def43dcd | [Perf] Use Cooperative Schedule for H100 & H200 & H800 in fp8_blockwise_scaled_grouped_mm (#8722) | 2025-08-02 21:13:47 -07:00
Qi Yuhang | 8e9fb43d82 | Optimize Hopper CUTLASS FP8 Blockwise Grouped GEMM Kernel in Small K Scenario (#7782) | 2025-07-04 22:25:49 -07:00
ayrnb | 2c4feaf308 | Add CUTLASS FP8 Blockscale MoE kernel for Hopper architecture (#7278) (Co-authored-by: HydraQYH <QYH820@Outlook.com>, TianQiLin666666 <1834987979@qq.com>) | 2025-07-02 23:27:03 -07:00
Elfie Guo | 3e56f557fd | Add a CUDA kernel for fusing mapping and weighted sum for MoE. (#6916) (Co-authored-by: Elfie Guo <elfiegxf@gmail.com>) | 2025-06-07 15:24:39 -07:00
Elfie Guo | 6fc9357503 | [2/2] Add python wrapper for CUTLASS FP8 Blockscale MoE Kernel. (#5694) | 2025-05-16 13:14:07 -07:00
Elfie Guo | e62c49557d | [1/2] Add FP8 Blockscale MoE CUTLASS kernel for Blackwell (#5281) | 2025-04-22 22:28:20 -07:00