sglang

Author	SHA1	Message	Date
Lzhang-hub	4efe2c57c9	support vlm model spec bench (#10173)	2025-09-10 13:37:04 +08:00
Kay Yan	975a5ec69c	[fix] update bench_speculative.py for compatibility (#7764) Signed-off-by: Kay Yan <kay.yan@daocloud.io>	2025-07-04 16:32:54 +08:00
Yineng Zhang	7282ab741a	fix: update bench_speculative (#5649)	2025-04-22 16:08:15 -07:00
Baizhou Zhang	6fb29ffd9e	Deprecate enable-flashinfer-mla and enable-flashmla (#5480)	2025-04-17 01:43:33 -07:00
lukec	a53fe428f9	Support FlashMLA backend (#4472) Co-authored-by: yinfan98 <1106310035@qq.com>	2025-03-16 09:07:06 -07:00
Ke Bao	f1d09a6541	Update bench speculative script (#4235)	2025-03-09 12:19:01 -07:00
Lianmin Zheng	935cda944b	Misc clean up; Remove the support of jump forward (#4032)	2025-03-03 07:02:14 -08:00
Lianmin Zheng	ac2387279e	Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988) Co-authored-by: SangBin Cho <rkooo567@gmail.com> Co-authored-by: dhou-xai <dhou@x.ai> Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>	2025-03-03 00:12:04 -08:00