39 Commits

Author SHA1 Message Date
yichuan~
49c5e0eca9 Add support for OpenAI API parallel sampling (#640) 2024-07-19 23:10:01 -07:00
shrirajh
1b7adbb5a0 TokenizerManager.context_len should inherit from `server_args.conte… (#654) 2024-07-18 21:55:29 -07:00
Mingyi
d774acad5c Remove the dependency of rpyc (#646) 2024-07-18 02:13:54 -07:00
zhyncs
2e341cd493 misc: add pre-commit config (#637) 2024-07-17 11:55:39 -07:00
胡译文
02b7258658 [Feat] Expose logprob options to sgl.gen API (#503)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
2024-07-09 00:35:39 -07:00
Liangsheng Yin
f25b76c02a add LogitsMetadata (#604) 2024-07-08 17:46:55 -07:00
Liangsheng Yin
0877f1e75b Fix streaming (#600) 2024-07-07 01:55:58 -07:00
Pan Lyu
26908d9568 * fix(detokenizer_manager.py): fix truncated decoded output (#586)
Co-authored-by: hnyls2002 <hnyls2002@gmail.com>
2024-07-06 14:53:22 -07:00
Ying Sheng
dc1b8bcfaa Format (#593) 2024-07-05 10:06:17 -07:00
sglang
11616fc6bd Minor fix in compiler & format (#545) 2024-06-29 23:42:14 -07:00
Ying Sheng
fb9296f0ed Higher priority for user input of max_prefill_tokens & format (#540) 2024-06-12 21:48:40 -07:00
Lianmin Zheng
f6dbd24043 Improve doc strings (#518) 2024-06-08 02:39:32 -07:00
Qubitium
f70f72586a Fix rid state map leak + Refractor .finished (#505)
Co-authored-by: ZX <zx@lbx.dev>
2024-06-07 13:20:40 -07:00
Lianmin Zheng
8dbdc018a3 Abort disconnected requests (#457) 2024-05-20 18:41:21 -07:00
LiviaSun
ec380dfd30 openai chat speculative execution (#250)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-05-18 22:23:53 -07:00
Lianmin Zheng
8210ec60f4 Improve error handling & abort disconnected requests (#449) 2024-05-17 05:49:31 -07:00
Lianmin Zheng
c05956e534 Simplify port allocation (#447) 2024-05-16 18:07:30 -07:00
Liangsheng Yin
690d162d97 Format code (#441) 2024-05-14 22:40:46 +08:00
Yuanhan Zhang
0992d85f92 support llava video (#426) 2024-05-13 16:57:00 -07:00
Lianmin Zheng
5dc55a5f02 Handle truncation errors (#436) 2024-05-13 15:56:00 -07:00
Shannon Shen
04c0b21488 Allow input_ids in the input of the /generate endpoint (#363) 2024-05-12 15:29:00 -07:00
Lianmin Zheng
3fc97f6709 Move openai api server into a separate file (#429) 2024-05-12 06:41:32 -07:00
Lianmin Zheng
aee4f523cf Fix logit processor bugs (#427) 2024-05-12 04:54:07 -07:00
Liangsheng Yin
9acc6e3504 add .isort.cfg (#378) 2024-04-22 22:38:09 +08:00
Liangsheng Yin
3842eba5fa Logprobs Refractor (#331) 2024-03-28 14:34:49 +08:00
Qubitium
ce216c80dc Cleanup codebase: removed unnecessary code/logic (#298) 2024-03-23 10:15:16 -07:00
Liangsheng Yin
1b35547927 Organize server_args (#277) 2024-03-11 20:06:52 +08:00
Cody Yu
a7334aeea1 Support decode token logprobs (#130) 2024-02-06 12:24:55 -08:00
Keith Stevens
1d0fbe8e43 [Feature] Adds basic support for image content in OpenAI chat routes (#113) 2024-01-30 06:12:33 -08:00
Lianmin Zheng
6f560c761b Improve the control of streaming and improve the first token latency in streaming (#117) 2024-01-29 17:05:42 -08:00
Liangsheng Yin
81561f8e2d Flush Cache API (#103) 2024-01-25 21:32:59 -08:00
parasol-aser
23950056f0 support speculative execution for openai API (#48)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-01-25 01:57:06 -08:00
Liangsheng Yin
01ee0fbc05 fast regex decode
Auto-detect constant str path in regex FSM, then extend instead.
2024-01-25 01:16:25 +08:00
Lianmin Zheng
c70b3cfa9e Bump the version to v0.1.8 (#93) 2024-01-24 03:33:34 -08:00
Ying Sheng
489796c7ea minor performance fix 2024-01-24 10:45:44 +00:00
Lianmin Zheng
bef0b35902 Fix llava & Fix multiprocessing 2024-01-24 10:35:31 +00:00
shiyi.c_98
c6576e820c Llava-hd Support (#92)
Co-authored-by: Haotian Liu <liuhaotian.cn@gmail.com>
2024-01-24 01:51:21 -08:00
Lianmin Zheng
9a16fea012 Return logprob for choices (#87) 2024-01-23 05:07:30 -08:00
Lianmin Zheng
22085081bb release initial code
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>
Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
2024-01-08 04:37:50 +00:00