| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Lianmin Zheng | fba8eccd7e | Log if cuda graph is used & extend cuda graph capture to cuda-graph-max-bs (#6201) (Co-authored-by: SangBin Cho <rkooo567@gmail.com>) | 2025-05-12 00:17:33 -07:00 |
| Ke Bao | ebaba85655 | Update ci test and doc for MTP api change (#5952) | 2025-05-01 09:30:27 -07:00 |
| Baizhou Zhang | efbae697b3 | [Revision] Replace enable_flashinfer_mla argument with attention_backend (#5052) | 2025-04-05 01:23:02 -07:00 |
| Lianmin Zheng | 74885a848b | Revert "Replace enable_flashinfer_mla argument with attention_backend" (#5048) | 2025-04-03 13:30:56 -07:00 |
| Baizhou Zhang | e8999b13b7 | Replace enable_flashinfer_mla argument with attention_backend (#5005) | 2025-04-03 02:53:58 -07:00 |
| Lianmin Zheng | b26bc86b36 | Support page size > 1 + eagle (#4908) | 2025-03-30 00:46:23 -07:00 |
| fzyzcjy | 15ddd84322 | Add retry for flaky tests in CI (#4755) | 2025-03-25 16:53:12 -07:00 |
| Ying Sheng | 1b859295f4 | [Eagle] Remove the greedy branch and some redundant code (#4363) (Co-authored-by: Sehoon Kim <sehoon@x.ai>) | 2025-03-16 02:48:55 -07:00 |
| Lianmin Zheng | 730d084f2a | Minor style fix for sgl-kernel (#4243) | 2025-03-09 20:15:13 -07:00 |
| Baizhou Zhang | 9dfafa743c | Fix test of flashinfer mla with nextn (#4237) | 2025-03-09 12:45:39 -07:00 |
| Baizhou Zhang | 9fb48f951f | Support nextn for flashinfer mla attention backend (#4218) | 2025-03-09 00:01:54 -08:00 |
| Ke Bao | 03b0364f76 | Update nextn ci test (#4071) | 2025-03-04 13:01:24 -08:00 |
| Baizhou Zhang | 90a4b7d98a | [Feature]Support ragged prefill in flashinfer mla backend (#3967) (Co-authored-by: Yineng Zhang <me@zhyncs.com>, pankajroark <pankajroark@users.noreply.github.com>) | 2025-02-28 18:13:56 -08:00 |