EngineX-Hygon/sglang
ac49dac009463ec4d88110ee74cb0ca551b997fb
sglang/sgl-kernel/csrc
ayrnb 2c4feaf308 Add CUTLASS FP8 Blockscale MoE kernel for Hopper architecture (#7278)
Co-authored-by: HydraQYH <QYH820@Outlook.com>
Co-authored-by: TianQiLin666666 <1834987979@qq.com>
2025-07-02 23:27:03 -07:00
allreduce                support 1 shot allreduce in 1-node and 2-node using mscclpp (#6277)  2025-06-04 22:11:24 -07:00
attention                [fix] fix cutlass_mla_backend with cuda_graph and add sm_scale for sgl-kernel cutlass_mla (#7184)  2025-06-14 12:45:41 -07:00
cpu                      [CPU] [sgl-kernel] set dispatch key of initialize to CatchAll (#7734)  2025-07-02 22:39:24 -07:00
cutlass_extensions       sgl-kernel use cutlass latest version for fp8 blockwise gemm (#5207)  2025-04-09 11:47:04 -07:00
elementwise              [Feat] Update sgl-kernel flashinfer to latest main version (#5500)  2025-04-17 12:43:23 -07:00
gemm                     Add dsv3 router gemm kernel (#7627)  2025-06-29 23:31:55 -07:00
grammar                  [sgl-kernel] fix: fix cu118 compile error (#6123)  2025-05-08 14:26:51 -07:00
kvcacheio                kvcache io kernels and test case (#7382)  2025-06-23 11:58:59 -07:00
moe                      Add CUTLASS FP8 Blockscale MoE kernel for Hopper architecture (#7278)  2025-07-02 23:27:03 -07:00
speculative              Fix sampling for speculative decoding & simplify kernels (#7207)  2025-06-16 03:28:30 -07:00
common_extension.cc      [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
flash_extension.cc       [Fix] fix fa3 build at cu118 (#5036)  2025-04-03 11:52:35 -07:00
torch_extension_rocm.cc  Fuse sorted_token_ids padding to moe_align_block_size kernel (#7437)  2025-06-24 17:44:27 -07:00