sglang

Author	SHA1	Message	Date
Lianmin Zheng	9a16fea012	Return logprob for choices (#87)	2024-01-23 05:07:30 -08:00
Lianmin Zheng	959c4174b2	Fix the chat template for QWen (#83)	2024-01-22 21:46:47 -08:00
Lianmin Zheng	94e05770db	Fix after QWen support (#82)	2024-01-22 21:17:05 -08:00
Arcmoon	63e97e5e4c	Suppport qwen model and solve some problems (#75)	2024-01-22 20:14:51 -08:00
isaac-vidas	e08bca2840	Support load fine-tuned LLaVA model (#80)	2024-01-22 18:15:48 -08:00
Ying Sheng	3f5c2f4c4a	Add an async example (#37)	2024-01-21 15:17:30 -08:00
Lianmin Zheng	007eeb4eb9	Fix the error message and dependency of openai backend (#71)	2024-01-21 14:56:25 -08:00
Lianmin Zheng	723f042163	release v0.1.7 & fix bugs	2024-01-21 10:31:02 +00:00
Lianmin Zheng	585eababa1	Improve error message of openai	2024-01-21 10:13:45 +00:00
Lianmin Zheng	cc3ada983f	Bump version to 0.1.6 (#68)	2024-01-21 01:45:02 -08:00
Lianmin Zheng	a837166e6f	Fix select and normalized logprobs (#67)	2024-01-21 01:39:23 -08:00
Lianmin Zheng	11f3cca64f	Fix select (#64)	2024-01-20 23:20:35 -08:00
Liangsheng Yin	ca13f3b8c5	Disk FSM cache and adjust code. (#63)	2024-01-20 21:26:11 -08:00
Lianmin Zheng	f30abd090a	Improve error message & Add vicuna template (#57)	2024-01-19 17:03:33 -08:00
Liangsheng Yin	40ab1f0129	Fix the possible bug of decode out of memory (#36)	2024-01-19 11:01:15 -08:00
Lianmin Zheng	199e82a15d	Format code & Improve readme (#52)	2024-01-18 23:51:19 -08:00
Cody Yu	23471f9aa3	Support v1/chat/completions (#50)	2024-01-18 23:43:09 -08:00
Cody Yu	61d4c93962	Support stream=True in v1/completions (#49)	2024-01-18 17:00:56 -08:00
Lianmin Zheng	2b079f8931	Increase interpreter parallelism (#46)	2024-01-18 13:30:10 -08:00
Lianmin Zheng	b240f75100	Add a parallel sampling case (#34)	2024-01-18 06:29:43 +00:00
Lianmin Zheng	501f944445	Bump version to 0.1.5 (#33)	2024-01-17 21:14:31 -08:00
Lianmin Zheng	22ec7bc2a1	Expose more arguments to control the scheduling policy (#32)	2024-01-17 18:37:02 -08:00
Christopher Chou	c0454b323c	Add option to return metadata in async streaming (#18)	2024-01-17 18:15:02 -08:00
Lianmin Zheng	8024fc5eec	Fix streaming (#30)	2024-01-17 16:38:20 -08:00
Lianmin Zheng	f9d723816a	Teak mem fraction (#20)	2024-01-17 04:43:17 -08:00
Lianmin Zheng	bf51ddc6e5	Improve docs & Rename Gemini -> VertexAI (#19)	2024-01-17 02:54:41 -08:00
shiyi.c_98	fd7c479239	Gemini Backend (#9) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-01-16 22:29:37 -08:00
Lianmin Zheng	c4707f1bb5	Improve docs (#17)	2024-01-16 19:53:55 -08:00
Ying Sheng	ffe4aaee1d	Fix for T4 GPUs (#16) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2024-01-16 15:49:03 -08:00
Christopher Chou	5b27a1dce4	Rename image_url to image_file (#15)	2024-01-16 15:41:30 -08:00
Lianmin Zheng	2ccd9fd8c5	update version to 0.1.3	2024-01-16 05:55:25 +00:00
Lianmin Zheng	70359bf31a	Update benchmark scripts (#8)	2024-01-15 16:12:57 -08:00
Liangsheng Yin	01ca82d765	fix radix cache match (#7)	2024-01-15 09:42:46 -08:00
Lianmin Zheng	4bd8233f2c	Fix test cases (#6)	2024-01-15 01:15:53 -08:00
Liangsheng Yin	08ab2a1655	Json Decode && Mutl-Turns (#4)	2024-01-15 00:49:29 -08:00
hnyls2002	f652494df1	fix typo	2024-01-10 04:21:17 +00:00
Lianmin Zheng	30720e732c	Add install with pip (#3)	2024-01-09 12:43:40 -08:00
Liangsheng Yin	331848de9d	Add SRT json decode example (#2)	2024-01-09 12:35:44 -08:00
Lianmin Zheng	22085081bb	release initial code Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com> Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu> Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com> Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com> Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>	2024-01-08 04:37:50 +00:00

2789 Commits