sglang

Author	SHA1	Message	Date
Xiaoyu Zhang	8e09b37077	Sgl kernel fused_moe_gate support n_shared_experts (#5440)	2025-04-17 23:05:15 -07:00
PGFLMG	c08a717c77	[Feat] Update sgl-kernel flashinfer to latest main version (#5500) Co-authored-by: zhyncs <me@zhyncs.com>	2025-04-17 12:43:23 -07:00
DefTruth	388e15c0db	kernel: support slightly faster merge_state_v2 cuda kernel (#5381)	2025-04-14 21:28:23 -07:00
Yineng Zhang	b62e7e99b8	feat: adapt merge_state (#5337)	2025-04-12 21:14:04 -07:00
PGFLMG	4879e50c6d	[Feat] Add sparse attn to sgl-kernel (#5327)	2025-04-12 11:36:36 -07:00
Trevor Morris	f65b8d5c89	Blackwell Cutlass MLA kernel (#5142)	2025-04-11 22:16:51 -07:00
Yineng Zhang	136b8e6afb	fix: remove cublas_grouped_gemm (#5307)	2025-04-11 16:22:37 -07:00
Richard Zou	76f44c2a8d	Fix deepseek-v3 with torch.compile in PyTorch 2.6. (#5213)	2025-04-10 09:14:38 -07:00
Yi Zhang	bcbbf519f9	sgl-kernel transfer custom allreduce from trt kernel to vllm kernel (#5079)	2025-04-05 14:23:20 -07:00
yinfan98	b8b6008f47	[Fix] fix fa3 build at cu118 (#5036)	2025-04-03 11:52:35 -07:00