sglang

Author	SHA1	Message	Date
Lianmin Zheng	d4fc1a70e3	Crash the server correctly during error (#2231)	2024-11-28 00:22:39 -08:00
Lianmin Zheng	7d671e4ad2	Enable overlap by default (#2067)	2024-11-19 22:07:58 -08:00
Lianmin Zheng	a7164b620f	Tune the threshold for accuracy tests in CI (#2071)	2024-11-17 21:51:00 -08:00
Lianmin Zheng	86fc0d79d0	Add a watch dog thread (#1816)	2024-10-27 02:00:50 -07:00
Lianmin Zheng	05b3bf5e8e	Crash the server on warnings in CI (#1772)	2024-10-23 16:27:13 -07:00
Lianmin Zheng	9463bc1385	Enable torch.compile for triton backend (#1422)	2024-09-14 15:38:37 -07:00
Jerry Zhang	a7c47e0f02	Add torchao quant (int4/int8/fp8) to llama models (#1341) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2024-09-09 05:32:41 -07:00
Lianmin Zheng	e4d68afcf0	[Minor] Many cleanup (#1357)	2024-09-09 04:14:11 -07:00
Lianmin Zheng	843e63d809	Fix the flaky test test_moe_eval_accuracy_large.py (#1326)	2024-09-04 04:15:11 -07:00
Lianmin Zheng	58fa607622	Fix the flaky tests in test_moe_eval_accuracy_large.py (#1293)	2024-09-01 12:20:46 -07:00
Lianmin Zheng	1b5d56f7f8	[CI] Add more multi-gpu tests (#1280)	2024-09-01 00:27:25 -07:00