Author | Commit | Message | Date
Lianmin Zheng | c17c578108 | Simplify tokenizer manager (#1904) | 2024-11-03 08:38:26 -08:00
Gleb Drozdov | a95d5589c3 | Add matched_stop token or str to distinguish between eos or stop str finish_reason generation (#1684) | 2024-10-17 18:06:52 +00:00
Michael Feil | b0facb3316 | add orjson for jsonresponse (#1688) | 2024-10-16 18:14:30 -07:00
havetc | ecb8bad276 | Returning a per request metric for number of cached_tokens read (#1599) | 2024-10-16 11:49:22 -07:00
Ying Sheng | 2725f8da61 | [Minor] Rename no_eos_trim to no_stop_trim (#1661) | 2024-10-13 20:30:03 -07:00
Ying Sheng | 4876117171 | [Fix] fix eos trim inconsistency (#1650) | 2024-10-13 01:07:09 -07:00
Lianmin Zheng | e37cdab0c6 | Fix ignore_eos (#1645) | 2024-10-12 00:36:28 -07:00
LI MOU | 1d9deeacdb | fix missing ignore_eos in v1/chat/completions (#1642) | 2024-10-11 21:37:20 -07:00
Lianmin Zheng | f13d86f920 | Add image_token in conversation.py (#1632) (Co-authored-by: yizhang2077 <1109276519@qq.com>) | 2024-10-11 05:07:51 -07:00
Yiding-Lu | b503881bd2 | [Bug] Fix the Image Input of Batch Generation (#1579) | 2024-10-11 02:25:04 -07:00
Ying Sheng | e4780cf839 | [API, Feature] Support response prefill for openai API (#1490) | 2024-09-22 06:46:17 -07:00
Lianmin Zheng | 9ba1f09760 | [Fix] Fix logprob and normalized_logprob (#1428) | 2024-09-15 06:36:06 -07:00
Lianmin Zheng | b912de11b0 | Make stop reason a dict instead of str (#1407) | 2024-09-12 20:47:31 -07:00
zifeitong | 9144ed1067 | Support OpenAI API json_schema response format (#1363) | 2024-09-09 19:08:25 -07:00
Kaichen Zhang - NTU | 662ecd9368 | [Feat] Add modalities for vision server when handling pixel values for llava (#1346) | 2024-09-09 02:07:34 -07:00
Christopher Chou | 51c554d812 | Allow more flexible assistant and system response (#1256) | 2024-08-30 11:51:44 -07:00
caiyueliang | 2f1d92834f | [FEAT] Support batches cancel (#1222) (Co-authored-by: Yineng Zhang <me@zhyncs.com>) | 2024-08-26 23:28:26 +00:00
havetc | 9935f97b3e | [FEAT] JSON constrained support (#1125) (Co-authored-by: Yineng Zhang <me@zhyncs.com>) | 2024-08-26 09:37:26 -07:00
Lianmin Zheng | 902278008a | [Minor] Improve the function organization in TokenizerManager & improve loggers (#1208) | 2024-08-25 14:46:34 -07:00
Lianmin Zheng | a8ae640328 | Improve docs and warnings (#1164) | 2024-08-20 08:31:29 -07:00
Juwan Yoo | d8476818ef | feat: allow streaming for multi-prompt and/or parallel sampling (#1134) | 2024-08-20 08:06:55 -07:00
yichuan~ | b997a18d74 | [Feat]Add support for optional start len of logprobs (#1035) (Co-authored-by: Ying Sheng <sqy1415@gmail.com>, Yineng Zhang <me@zhyncs.com>, Lianmin Zheng <lianminzheng@gmail.com>, Liangsheng Yin <hnyls2002@gmail.com>) | 2024-08-18 23:45:41 -07:00
Ying Sheng | 6767e2229f | Support jinja as chat template file (#1104) | 2024-08-14 17:43:14 -07:00
Lianmin Zheng | d84c5e70f7 | Test the case when max_new_tokens is very large (#1038) | 2024-08-11 16:41:03 -07:00
Ying Sheng | b16e856f11 | Add openai embedding API (#997) | 2024-08-09 11:19:18 -07:00
Juwan Yoo | ab7875941b | feat: frequency, min_new_tokens, presence, and repetition penalties (#973) | 2024-08-08 04:21:08 -07:00
yichuan~ | 3a79613c28 | support more optioin about usage in stream mode (#985) (Co-authored-by: Ying Sheng <sqy1415@gmail.com>) | 2024-08-08 09:41:57 +00:00
Ying Sheng | 20a4f927dc | Add io struct for embedding models [unreachable code] - step 2/3 (#987) | 2024-08-08 07:52:31 +00:00
yichuan~ | 795eab6dda | Add support for Batch API test (#936) | 2024-08-06 23:52:10 -07:00
yichuan~ | fd7926e46e | Fix prompt len in parallel sampling (#928) | 2024-08-05 00:56:08 -07:00
yichuan~ | d53dcf9c98 | Support more OpenAI API test (#916) | 2024-08-04 16:43:09 -07:00
Ying Sheng | fbd6b94d69 | Fix the double BOS problem in the HF chat template (#888) | 2024-08-02 00:30:50 -07:00
yichuan~ | ca600e8cd6 | Add support for logprobs in OpenAI chat API (#852) | 2024-08-01 00:08:21 -07:00
yichuan~ | bb0501c0d9 | Fix List input bug (#838) | 2024-07-30 13:40:51 -07:00
yichuan~ | 084fa54d37 | Add support for OpenAI API : offline batch(file) processing (#699) (Co-authored-by: hnyls2002 <hnyls2002@gmail.com>) | 2024-07-29 13:07:18 -07:00
Ying Sheng | 8d908a937c | Fix echo + lobprob for OpenAI API when the prompt is a list (#791) | 2024-07-28 17:09:16 -07:00
Yineng Zhang | dd7e8b9421 | chore: add copyright for srt (#790) | 2024-07-28 23:07:12 +10:00
Lianmin Zheng | 30db99b3d9 | Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776) | 2024-07-27 19:50:34 -07:00
Lianmin Zheng | f95e661757 | Fix max_tokens for OpenAI chat completion API (#766) | 2024-07-27 15:44:27 -07:00
Toshiki Kataoka | 40facad5f1 | feat: support token ids in /v1/completions (#736) | 2024-07-26 02:53:17 -07:00
Mingyi | e3046ea3a8 | Update OpenAI API (#667) | 2024-07-19 23:20:54 -07:00