sglang

Author	SHA1	Message	Date
Qi Yuhang	d9def43dcd	[Perf]Use Cooperative Schedule for H100 & H200 & H800 in fp8_blockwise_scaled_grouped_mm (#8722)	2025-08-02 21:13:47 -07:00
Qi Yuhang	8e9fb43d82	Optimize Hopper CUTLASS FP8 Blockwise Grouped GEMM Kernel in Small K Scenario (#7782)	2025-07-04 22:25:49 -07:00
ayrnb	2c4feaf308	Add CUTLASS FP8 Blockscale MoE kernel for Hopper architecture (#7278) Co-authored-by: HydraQYH <QYH820@Outlook.com> Co-authored-by: TianQiLin666666 <1834987979@qq.com>	2025-07-02 23:27:03 -07:00
Elfie Guo	3e56f557fd	Add a CUDA kernel for fusing mapping and weighted sum for MoE. (#6916) Co-authored-by: Elfie Guo <elfiegxf@gmail.com>	2025-06-07 15:24:39 -07:00
Elfie Guo	6fc9357503	[2/2] Add python wrapper for CUTLASS FP8 Blockscale MoE Kernel. (#5694)	2025-05-16 13:14:07 -07:00
Elfie Guo	e62c49557d	[1/2] Add FP8 Blockscale MoE CUTLASS kernel for Blackwell (#5281)	2025-04-22 22:28:20 -07:00