sglang

Author	SHA1	Message	Date
Lianmin Zheng	c17c578108	Simplify tokenizer manager (#1904 )	2024-11-03 08:38:26 -08:00
Gleb Drozdov	a95d5589c3	Add matched_stop token or str to distinguish between eos or stop str finish_reason generation (#1684 )	2024-10-17 18:06:52 +00:00
Michael Feil	b0facb3316	add orjson for jsonresponse (#1688 )	2024-10-16 18:14:30 -07:00
havetc	ecb8bad276	Returning a per request metric for number of cached_tokens read (#1599 )	2024-10-16 11:49:22 -07:00
Ying Sheng	2725f8da61	[Minor] Rename no_eos_trim to no_stop_trim (#1661 )	2024-10-13 20:30:03 -07:00
Ying Sheng	4876117171	[Fix] fix eos trim inconsistency (#1650 )	2024-10-13 01:07:09 -07:00
Lianmin Zheng	e37cdab0c6	Fix ignore_eos (#1645 )	2024-10-12 00:36:28 -07:00
LI MOU	1d9deeacdb	fix missing ignore_eos in v1/chat/completions (#1642 )	2024-10-11 21:37:20 -07:00
Lianmin Zheng	f13d86f920	Add image_token in conversation.py (#1632 ) Co-authored-by: yizhang2077 <1109276519@qq.com>	2024-10-11 05:07:51 -07:00
Yiding-Lu	b503881bd2	[Bug] Fix the Image Input of Batch Generation (#1579 )	2024-10-11 02:25:04 -07:00
Ying Sheng	e4780cf839	[API, Feature] Support response prefill for openai API (#1490 )	2024-09-22 06:46:17 -07:00
Lianmin Zheng	9ba1f09760	[Fix] Fix logprob and normalized_logprob (#1428 )	2024-09-15 06:36:06 -07:00
Lianmin Zheng	b912de11b0	Make stop reason a dict instead of str (#1407 )	2024-09-12 20:47:31 -07:00
zifeitong	9144ed1067	Support OpenAI API json_schema response format (#1363 )	2024-09-09 19:08:25 -07:00
Kaichen Zhang - NTU	662ecd9368	[Feat] Add modalities for vision server when handling pixel values for llava (#1346 )	2024-09-09 02:07:34 -07:00
Christopher Chou	51c554d812	Allow more flexible assistant and system response (#1256 )	2024-08-30 11:51:44 -07:00
caiyueliang	2f1d92834f	[FEAT] Support batches cancel (#1222 ) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-08-26 23:28:26 +00:00
havetc	9935f97b3e	[FEAT] JSON constrained support (#1125 ) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-08-26 09:37:26 -07:00
Lianmin Zheng	902278008a	[Minor] Improve the function organization in TokenizerManager & improve loggers (#1208 )	2024-08-25 14:46:34 -07:00
Lianmin Zheng	a8ae640328	Improve docs and warnings (#1164 )	2024-08-20 08:31:29 -07:00
Juwan Yoo	d8476818ef	feat: allow streaming for multi-prompt and/or parallel sampling (#1134 )	2024-08-20 08:06:55 -07:00
yichuan~	b997a18d74	[Feat]Add support for optional start len of logprobs (#1035 ) Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: Yineng Zhang <me@zhyncs.com> Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>	2024-08-18 23:45:41 -07:00
Ying Sheng	6767e2229f	Support jinja as chat template file (#1104 )	2024-08-14 17:43:14 -07:00
Lianmin Zheng	d84c5e70f7	Test the case when max_new_tokens is very large (#1038 )	2024-08-11 16:41:03 -07:00
Ying Sheng	b16e856f11	Add openai embedding API (#997 )	2024-08-09 11:19:18 -07:00
Juwan Yoo	ab7875941b	feat: frequency, min_new_tokens, presence, and repetition penalties (#973 )	2024-08-08 04:21:08 -07:00
yichuan~	3a79613c28	support more options about usage in stream mode (#985 ) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-08-08 09:41:57 +00:00
Ying Sheng	20a4f927dc	Add io struct for embedding models [unreachable code] - step 2/3 (#987 )	2024-08-08 07:52:31 +00:00
yichuan~	795eab6dda	Add support for Batch API test (#936 )	2024-08-06 23:52:10 -07:00
yichuan~	fd7926e46e	Fix prompt len in parallel sampling (#928 )	2024-08-05 00:56:08 -07:00
yichuan~	d53dcf9c98	Support more OpenAI API test (#916 )	2024-08-04 16:43:09 -07:00
Ying Sheng	fbd6b94d69	Fix the double BOS problem in the HF chat template (#888 )	2024-08-02 00:30:50 -07:00
yichuan~	ca600e8cd6	Add support for logprobs in OpenAI chat API (#852 )	2024-08-01 00:08:21 -07:00
yichuan~	bb0501c0d9	Fix List input bug (#838 )	2024-07-30 13:40:51 -07:00
yichuan~	084fa54d37	Add support for OpenAI API : offline batch(file) processing (#699 ) Co-authored-by: hnyls2002 <hnyls2002@gmail.com>	2024-07-29 13:07:18 -07:00
Ying Sheng	8d908a937c	Fix echo + logprob for OpenAI API when the prompt is a list (#791 )	2024-07-28 17:09:16 -07:00
Yineng Zhang	dd7e8b9421	chore: add copyright for srt (#790 )	2024-07-28 23:07:12 +10:00
Lianmin Zheng	30db99b3d9	Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776 )	2024-07-27 19:50:34 -07:00
Lianmin Zheng	f95e661757	Fix max_tokens for OpenAI chat completion API (#766 )	2024-07-27 15:44:27 -07:00
Toshiki Kataoka	40facad5f1	feat: support token ids in /v1/completions (#736 )	2024-07-26 02:53:17 -07:00
Mingyi	e3046ea3a8	Update OpenAI API (#667 )	2024-07-19 23:20:54 -07:00

41 Commits