sglang

Author	SHA1	Message	Date
Qiaolin Yu	0b9557fcd7	Disable compiling arch below sm_90 in aarch64 by default (#6380)	2025-05-27 15:50:02 -07:00
HandH1998	4d643f6c7a	[1/2] Support Qserve (#6457) Co-authored-by: yych0745 <1398089567@qq.com> Co-authored-by: sleepcoo <sleepcoo@gmail.com>	2025-05-21 19:48:59 -07:00
Elfie Guo	6fc9357503	[2/2] Add python wrapper for CUTLASS FP8 Blockscale MoE Kernel. (#5694)	2025-05-16 13:14:07 -07:00
Elfie Guo	c23a7072b6	Upgrade CUTLASS 4.0 (#6336) Co-authored-by: zhyncs <me@zhyncs.com>	2025-05-15 17:42:23 -07:00
Yineng Zhang	213e8c7dd5	chore: upgrade deepgemm (#6073)	2025-05-11 02:17:24 -07:00
applesaucethebun	2ce8793519	Add typo checker in pre-commit (#6179) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-11 12:55:00 +08:00
Yineng Zhang	6f56614734	chore: upgrade cutlass 3.9.2 (#6004) Co-authored-by: yizhang2077 <1109276519@qq.com>	2025-05-06 13:34:08 -07:00
Johnny	9f21e75453	add Thor & Spark (#5915)	2025-04-30 19:43:40 -07:00
PGFLMG	08acdb5c3d	[Feat] Scale up fa3 kernel to sm8x arch (#5912) Co-authored-by: zhyncs <me@zhyncs.com>	2025-04-30 13:59:36 -07:00
zhjunqin	403b855a22	Add sm_120 for blackwell (#5903)	2025-04-29 20:45:24 -07:00
Xiaoyu Zhang	5bb0accbcf	cutlass 3.9 supported to improve fp8_blockwise_gemm (#5820)	2025-04-28 21:52:36 -07:00
PGFLMG	ee71ed8a41	[Feat] QWen-1M context support[1/2]: Update block sparse attention backend utils kernel (#5847) Co-authored-by: sighingnow <sighingnow@gmail.com>	2025-04-28 11:03:17 -07:00
Yineng Zhang	15fabcc07f	fix sgl-kernel unit tests (#5666)	2025-04-23 01:18:30 -07:00
Elfie Guo	e62c49557d	[1/2] Add FP8 Blockscale MoE CUTLASS kernel for Blackwell (#5281)	2025-04-22 22:28:20 -07:00
PGFLMG	c08a717c77	[Feat] Update sgl-kernel flashinfer to latest main version (#5500) Co-authored-by: zhyncs <me@zhyncs.com>	2025-04-17 12:43:23 -07:00
Elfie Guo	85ec0440a5	Update cutlass dependency. (#5447)	2025-04-15 23:28:04 -07:00
Lianmin Zheng	838fa0f218	[minor] cleanup cmakelists.txt (#5420)	2025-04-15 07:07:07 -07:00
DefTruth	388e15c0db	kernel: support slightly faster merge_state_v2 cuda kernel (#5381)	2025-04-14 21:28:23 -07:00
Yineng Zhang	6c41fcf0e4	chore: upgrade DeepGEMM (#5395)	2025-04-14 20:32:46 -07:00
Lianmin Zheng	dae7944440	minor clean up of sgl-kernel/CMakeLists.txt (#5393)	2025-04-14 18:38:44 -07:00
Yineng Zhang	b62e7e99b8	feat: adapt merge_state (#5337)	2025-04-12 21:14:04 -07:00
PGFLMG	4879e50c6d	[Feat] Add sparse attn to sgl-kernel (#5327)	2025-04-12 11:36:36 -07:00
Trevor Morris	f65b8d5c89	Blackwell Cutlass MLA kernel (#5142)	2025-04-11 22:16:51 -07:00
Yineng Zhang	136b8e6afb	fix: remove cublas_grouped_gemm (#5307)	2025-04-11 16:22:37 -07:00
Yineng Zhang	7074e9ca20	fix: enable fp4 compilation on cu128 (#5286)	2025-04-11 01:43:44 -07:00
Yi Zhang	bcbbf519f9	sgl-kernel transfer custom allreduce from trt kernel to vllm kernel (#5079)	2025-04-05 14:23:20 -07:00
yinfan98	b8b6008f47	[Fix] fix fa3 build at cu118 (#5036)	2025-04-03 11:52:35 -07:00
Zhiqiang Xie	9d0b36c47a	fix deepgemm as well (#5030)	2025-04-03 02:41:37 -07:00
Yuhong Guo	7d8c0ce7ce	[Build] Support build sgl-kernel with ccache (#5020)	2025-04-03 00:22:37 -07:00
Zhiqiang Xie	a2aea59b6e	update cutlass tag (#5011)	2025-04-02 18:30:30 -07:00
yinfan98	37c66ec856	[feat] add fa3 in sgl-kernel (#4902) Co-authored-by: Sleepcoo <Sleepcoo@gmail.com>	2025-03-30 12:57:10 -07:00
Qingquan Song	45dcfc2e76	Add deepseek style fused moe group gate selection kernel (#4530)	2025-03-29 11:51:45 -07:00
Yineng Zhang	92941ce7b5	bump sgl-kernel 0.0.5.post4 (#4768)	2025-03-28 14:40:53 -07:00
Yineng Zhang	2bb0e7cf43	fix sampling issue (#4871)	2025-03-28 14:07:21 -07:00
yinfan98	4db29e82ec	[Feat] support deepgemm for cmake (#4864)	2025-03-28 10:51:44 -07:00
Yineng Zhang	8bf6d7f406	support cmake for sgl-kernel (#4706) Co-authored-by: hebiao064 <hebiaobuaa@gmail.com> Co-authored-by: yinfan98 <1106310035@qq.com>	2025-03-27 01:42:28 -07:00
Yineng Zhang	7596417732	minor: use bear for compilation database (#2919)	2025-01-16 18:39:11 +08:00
Xiaoyu Zhang	f005758f2b	introduce CUB in sgl-kernel (#2887)	2025-01-14 19:48:59 +08:00
Xiaoyu Zhang	e2b16c4716	add sampling_scaling_penalties kernel (#2846)	2025-01-12 19:38:17 -08:00
Ke Bao	0f3eb1d294	Support cutlass Int8 gemm (#2752)	2025-01-06 22:51:22 +08:00
Yineng Zhang	b6b57fc200	minor: cleanup sgl-kernel (#2679)	2024-12-31 14:52:00 +08:00
Ke Bao	b4403985d0	Add cutlass submodule for sgl-kernel (#2676)	2024-12-31 14:28:29 +08:00
Ke Bao	b02da24a5b	Refactor sgl-kernel build (#2642)	2024-12-30 18:07:01 +08:00
yizhang2077	e04d3f2897	adapt tensorrt llm custom all reduce to sgl-kernel (#2481) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-12-15 13:15:59 +08:00
Yineng Zhang	7301a39b13	fix: resolve CodeQL cpp issue (#2305)	2024-12-01 23:55:19 +08:00

45 Commits