sglang

Author	SHA1	Message	Date
Enrique Shockwave	cf9d8efdd3	llama3 instruct template (#372)	2024-04-21 09:40:12 -07:00
Liangsheng Yin	1bf1cf1953	Reduce overhead when `fork(1)` (#375)	2024-04-21 17:25:14 +08:00
SimoneRaponi	ff99c38a07	Add timeout to get_meta_info (#346) Co-authored-by: simone <simone.raponi@equixely.com>	2024-04-03 22:22:06 +08:00
Junlong Li	cb389c91bc	Fix llava parallelism/fork bug (#315)	2024-03-28 19:24:54 -07:00
Liangsheng Yin	2af565b3bb	[model] DBRX-instruct support (#337)	2024-03-28 10:05:19 -07:00
Liangsheng Yin	3842eba5fa	Logprobs Refractor (#331)	2024-03-28 14:34:49 +08:00
Jani Monoses	e57f079275	Use Anthropic messages API (#304)	2024-03-22 13:23:31 -07:00
Liangsheng Yin	89885b31ef	Gemma Support (#256)	2024-03-11 12:14:27 +08:00
Lin Tianchuan	30d67b2bca	Add `set_var` to interpreter.py (#263)	2024-03-07 23:20:11 +08:00
Xinwei Xiong	b0b722ee8e	Refactor ChatTemplate for Enhanced Clarity and Efficiency (#201)	2024-03-03 17:52:36 +08:00
Enrique Shockwave	9759d927cf	fix chatml template (#195)	2024-02-24 16:34:22 +08:00
Zhang Wenbin	8d0a7fae3b	Fix interpreter.py `get_var(var_name)` in text iter when `stream` is not enabled (#198)	2024-02-24 16:27:34 +08:00
Liangsheng Yin	c4e9ebe3a4	Fix stop str merging (#225) Co-authored-by: Enrique Shockwave <33002121+qeternity@users.noreply.github.com>	2024-02-24 16:05:21 +08:00
Lianmin Zheng	c51020cf0c	Fix the chat template for llava-v1.6-34b & format code (#177)	2024-02-11 05:50:13 -08:00
Lianmin Zheng	23f05005fd	Format code & move functions (#155)	2024-02-06 13:27:46 -08:00
Ying Sheng	67be11c790	fix bug of race condition in copy()	2024-02-03 01:38:00 -08:00
Christopher Chou	864425300f	Yi-VL Model (#112)	2024-02-01 08:33:22 -08:00
Lianmin Zheng	0617528632	Update quick start examples (#120)	2024-01-30 04:29:32 -08:00
parasol-aser	23950056f0	support speculative execution for openai API (#48) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-01-25 01:57:06 -08:00
Liangsheng Yin	01ee0fbc05	fast regex decode Auto-detect constant str path in regex FSM, then extend instead.	2024-01-25 01:16:25 +08:00
Lianmin Zheng	7358fa64f7	Fix a bug in runtime backend	2024-01-23 22:10:17 +00:00
Lianmin Zheng	9a16fea012	Return logprob for choices (#87)	2024-01-23 05:07:30 -08:00
Lianmin Zheng	959c4174b2	Fix the chat template for QWen (#83)	2024-01-22 21:46:47 -08:00
Lianmin Zheng	94e05770db	Fix after QWen support (#82)	2024-01-22 21:17:05 -08:00
Lianmin Zheng	007eeb4eb9	Fix the error message and dependency of openai backend (#71)	2024-01-21 14:56:25 -08:00
Lianmin Zheng	723f042163	release v0.1.7 & fix bugs	2024-01-21 10:31:02 +00:00
Liangsheng Yin	40ab1f0129	Fix the possible bug of decode out of memory (#36)	2024-01-19 11:01:15 -08:00
Lianmin Zheng	2b079f8931	Increase interpreter parallelism (#46)	2024-01-18 13:30:10 -08:00
Lianmin Zheng	22ec7bc2a1	Expose more arguments to control the scheduling policy (#32)	2024-01-17 18:37:02 -08:00
Christopher Chou	c0454b323c	Add option to return metadata in async streaming (#18)	2024-01-17 18:15:02 -08:00
Lianmin Zheng	bf51ddc6e5	Improve docs & Rename Gemini -> VertexAI (#19)	2024-01-17 02:54:41 -08:00
shiyi.c_98	fd7c479239	Gemini Backend (#9) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-01-16 22:29:37 -08:00
Lianmin Zheng	c4707f1bb5	Improve docs (#17)	2024-01-16 19:53:55 -08:00
Lianmin Zheng	4bd8233f2c	Fix test cases (#6)	2024-01-15 01:15:53 -08:00
Liangsheng Yin	08ab2a1655	Json Decode && Mutl-Turns (#4)	2024-01-15 00:49:29 -08:00
Liangsheng Yin	331848de9d	Add SRT json decode example (#2)	2024-01-09 12:35:44 -08:00
Lianmin Zheng	22085081bb	release initial code Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com> Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu> Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com> Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>	2024-01-08 04:37:50 +00:00

137 Commits