sglang

Author	SHA1	Message	Date
Lifu Huang	3cf1473a09	Use monotonic clock for interval measurement (#6211) Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>	2025-05-17 16:49:18 -07:00
applesaucethebun	2ce8793519	Add typo checker in pre-commit (#6179) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-11 12:55:00 +08:00
SEPLOS	032f8faaab	Fix sglang frontend's incorrect dependency on torch (#4931)	2025-03-30 13:00:24 -07:00
mlmz	f6ab4ca6bc	fix: fix ipython running error for Engine due to outlines nest_asyncio (#4582) Co-authored-by: shuaills <shishuaiuoe@gmail.com>	2025-03-21 19:11:15 -07:00
Zhiqiang Xie	9376ac361d	Memory pool fix for upstream change about eagle (#4170)	2025-03-07 00:58:20 -08:00
Yueyang Pan	25482edb5c	Online serving benchmarks of real datasets for hierarchical KV caching (#3211) Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>	2025-03-05 16:16:43 -08:00
Shi Shuai	55de40f782	[Docs]: Fix Multi-User Port Allocation Conflicts (#3601) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> Co-authored-by: simveit <simp.veitner@gmail.com>	2025-02-19 11:15:44 -08:00
Jiada Li	39416e394a	fix lockfile and port_registry file permission error (#3598) Co-authored-by: jiada li <jiada@lmsys.us-northcentral1-a.compute.internal> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-02-15 19:14:45 -08:00
Shi Shuai	7443197a63	[CI] Improve Docs CI Efficiency (#3587) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-02-14 19:57:00 -08:00
Jhin	7b9b4f4426	Docs fix about EAGLE and streaming output (#3166) Co-authored-by: Chayenne <zhaochenyang@ucla.edu> Co-authored-by: Chayenne <zhaochen20@outlook.com> Co-authored-by: Jhin <jhinpan@umich.edu>	2025-01-27 18:10:45 -08:00
Lianmin Zheng	73401fd016	Sync distributed package from vllm 0.6.4.post1 (#3010)	2025-01-20 04:57:14 -08:00
fzyzcjy	81d27c8e31	Refactor to add TypeBasedDispatcher to simplify dispatching (#2958)	2025-01-18 20:13:27 -08:00
SangBin Cho	9208618b3e	[Core] in batch prefix caching by delay scheduling (#2442)	2024-12-11 12:51:50 -08:00
Yineng Zhang	75ae968959	minor: update killall script (#2391)	2024-12-08 04:21:00 +08:00
Lianmin Zheng	d4fc1a70e3	Crash the server correctly during error (#2231)	2024-11-28 00:22:39 -08:00
Chayenne	c77c1e05ba	fix black in pre-commit (#1940)	2024-11-08 07:42:47 +08:00
Iñaki Arango	1363b51983	Escape backwards slash (#1902)	2024-11-03 12:27:11 -08:00
geeker-smallwhite	8ce202a493	delete unused character (#1855)	2024-10-31 19:33:55 +08:00
Lianmin Zheng	b548801ddb	Update docs (#1839)	2024-10-30 02:49:08 -07:00
Chayenne	539df95d2c	Imporve openai api documents (#1827) Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>	2024-10-30 00:39:41 -07:00
Chayenne	ced362f7c6	Simplify our docs with complicated functions into utils (#1807) Co-authored-by: Chayenne <zhaochenyang@ucla.edu>	2024-10-26 17:44:11 +00:00
Lianmin Zheng	e4d68afcf0	[Minor] Many cleanup (#1357)	2024-09-09 04:14:11 -07:00
Lianmin Zheng	1e495e0847	[Fix] Fix select by ensuring each request has at least one token (#1318)	2024-09-03 06:31:45 -07:00
Ying Sheng	9f662501a3	Move torch.compile configs into cuda_graph_runner.py (#993)	2024-08-08 13:20:30 -07:00
Ying Sheng	0d4f3a9fcd	Make API Key OpenAI-compatible (#917)	2024-08-04 13:35:44 -07:00
Ying Sheng	995af5a54b	Improve the structure of CI (#911)	2024-08-03 23:09:21 -07:00
Ying Sheng	79f816292e	Fix lazy import location (#795)	2024-07-28 22:09:50 -07:00
Ying Sheng	fb9296f0ed	Higher priority for user input of max_prefill_tokens & format (#540)	2024-06-12 21:48:40 -07:00
Lianmin Zheng	2cea6146d8	Improve logging & add logit cap (#471)	2024-05-24 03:48:53 -07:00
Lianmin Zheng	19d2135cb8	Use model loader from vllm (#459)	2024-05-21 09:13:37 -07:00
Lianmin Zheng	8210ec60f4	Improve error handling & abort disconnected requests (#449)	2024-05-17 05:49:31 -07:00
Liangsheng Yin	690d162d97	Format code (#441)	2024-05-14 22:40:46 +08:00
Yuanhan Zhang	0992d85f92	support llava video (#426)	2024-05-13 16:57:00 -07:00
Lianmin Zheng	562b8857d8	Improve error handling (#433)	2024-05-12 20:49:04 -07:00
Lianmin Zheng	13662fd533	Fix RuntimeEndpoint (#279)	2024-03-11 05:24:24 -07:00
Alessio Dalla Piazza	d5ae2ebaa2	Add Support for API Key Authentication (#230)	2024-03-11 05:16:10 -07:00
Lianmin Zheng	faba293a0d	Improve gemma and documentations (#278)	2024-03-11 04:43:39 -07:00
Srinivas Billa	01b07ea3ac	Add SSL Cert Functionality (#224)	2024-03-03 17:41:41 +08:00
Lianmin Zheng	c51020cf0c	Fix the chat template for llava-v1.6-34b & format code (#177)	2024-02-11 05:50:13 -08:00
Ying Sheng	a6aa46dd3f	minor	2024-02-08 04:35:25 +00:00
Srinivas Billa	405f26b00b	Add Auth Token to RuntimeEndpoint (#162)	2024-02-07 20:07:31 -08:00
Haotian Liu	d3fc86a43e	Improve Chinese character streaming when the last char is half Chinese word. (#95)	2024-01-24 12:23:27 -08:00
Liangsheng Yin	08ab2a1655	Json Decode && Mutl-Turns (#4)	2024-01-15 00:49:29 -08:00
Lianmin Zheng	22085081bb	release initial code Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com> Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu> Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com> Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>	2024-01-08 04:37:50 +00:00

44 Commits