Ying Sheng
|
51fda1439f
|
Update Readme (#660)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
|
2024-07-19 09:54:01 -07:00 |
|
Ying Sheng
|
dc1b8bcfaa
|
Format (#593)
|
2024-07-05 10:06:17 -07:00 |
|
Lianmin Zheng
|
eb1ae6ae0c
|
Add sglang.bench_latency for offline benchmark (#564)
|
2024-06-25 03:38:04 -07:00 |
|
Liangsheng Yin
|
05471f2103
|
Update test_flashinfer (#560)
|
2024-06-24 15:23:57 +08:00 |
|
Lianmin Zheng
|
1fa15099d8
|
Add LlamaForClassification (#559)
|
2024-06-22 00:49:31 -07:00 |
|
Ying Sheng
|
fb9296f0ed
|
Higher priority for user input of max_prefill_tokens & format (#540)
|
2024-06-12 21:48:40 -07:00 |
|
胡译文
|
87260b7bfd
|
Litellm Backend (#502)
|
2024-06-07 12:24:28 -07:00 |
|
Ying Sheng
|
0463f7fb52
|
Support data parallelism (static) (#480)
Co-authored-by: Ying Sheng <ying.sheng@databricks.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2024-05-27 21:24:10 -07:00 |
|
Ying Sheng
|
3e684be7a3
|
Fix openai speculative execution (#456)
|
2024-05-20 17:01:13 -07:00 |
|
Liangsheng Yin
|
690d162d97
|
Format code (#441)
|
2024-05-14 22:40:46 +08:00 |
|
Lianmin Zheng
|
5dc55a5f02
|
Handle truncation errors (#436)
|
2024-05-13 15:56:00 -07:00 |
|
Lianmin Zheng
|
6e09cf6a15
|
Misc fixes (#432)
|
2024-05-12 15:05:40 -07:00 |
|
Lianmin Zheng
|
aee4f523cf
|
Fix logit processor bugs (#427)
|
2024-05-12 04:54:07 -07:00 |
|
Lianmin Zheng
|
7023f413c6
|
Clean up (#422)
|
2024-05-11 20:55:00 -07:00 |
|
Liangsheng Yin
|
19818b9c2f
|
Minor: style improvement of radix_cache and memory_pool (#395)
|
2024-04-26 01:01:36 +08:00 |
|
Liangsheng Yin
|
150d7020ed
|
Revert removing the unused imports (#385)
|
2024-04-23 22:36:33 +08:00 |
|
Liangsheng Yin
|
9acc6e3504
|
add .isort.cfg (#378)
|
2024-04-22 22:38:09 +08:00 |
|
Fronx
|
2b6d999191
|
Fix issue #367 – System message not supported for Anthropic (anthropic.BadRequestError) (#368)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
|
2024-04-16 11:18:24 -07:00 |
|
Lianmin Zheng
|
65501a9cf1
|
Fix commandr import; format code
|
2024-04-16 18:10:12 +00:00 |
|
Liangsheng Yin
|
62b3812b69
|
Time cost utils (#355)
|
2024-04-09 23:27:31 +08:00 |
|
Liangsheng Yin
|
3842eba5fa
|
Logprobs Refractor (#331)
|
2024-03-28 14:34:49 +08:00 |
|
Jani Monoses
|
e57f079275
|
Use Anthropic messages API (#304)
|
2024-03-22 13:23:31 -07:00 |
|
Lianmin Zheng
|
4aa5dd2c5f
|
Update version to v0.1.13 (#280)
|
2024-03-11 05:49:27 -07:00 |
|
Liangsheng Yin
|
1b35547927
|
Organize server_args (#277)
|
2024-03-11 20:06:52 +08:00 |
|
Lianmin Zheng
|
faba293a0d
|
Improve gemma and documentations (#278)
|
2024-03-11 04:43:39 -07:00 |
|
Lianmin Zheng
|
c51020cf0c
|
Fix the chat template for llava-v1.6-34b & format code (#177)
|
2024-02-11 05:50:13 -08:00 |
|
Cody Yu
|
50afed4eaa
|
Support extra field regex in OpenAI API (#172)
|
2024-02-10 17:21:33 -08:00 |
|
Liangsheng Yin
|
37b42297f8
|
import outlines (#168)
|
2024-02-09 10:13:02 +08:00 |
|
Lianmin Zheng
|
23f05005fd
|
Format code & move functions (#155)
|
2024-02-06 13:27:46 -08:00 |
|
Cody Yu
|
a7334aeea1
|
Support decode token logprobs (#130)
|
2024-02-06 12:24:55 -08:00 |
|
Liangsheng Yin
|
26f0bedc8f
|
jump-forward rename (#144)
|
2024-02-05 16:50:37 +08:00 |
|
Keith Stevens
|
1d0fbe8e43
|
[Feature] Adds basic support for image content in OpenAI chat routes (#113)
|
2024-01-30 06:12:33 -08:00 |
|
Lianmin Zheng
|
97aa9b3284
|
Improve docs & Add JSON decode example (#121)
|
2024-01-30 05:45:27 -08:00 |
|
Lianmin Zheng
|
6f560c761b
|
Improve the control of streaming and improve the first token latency in streaming (#117)
|
2024-01-29 17:05:42 -08:00 |
|
parasol-aser
|
23950056f0
|
support speculative execution for openai API (#48)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
|
2024-01-25 01:57:06 -08:00 |
|
Liangsheng Yin
|
01ee0fbc05
|
fast regex decode
Auto-detect constant str path in regex FSM, then extend instead.
|
2024-01-25 01:16:25 +08:00 |
|
Lianmin Zheng
|
9a16fea012
|
Return logprob for choices (#87)
|
2024-01-23 05:07:30 -08:00 |
|
Liangsheng Yin
|
40ab1f0129
|
Fix the possible bug of decode out of memory (#36)
|
2024-01-19 11:01:15 -08:00 |
|
Cody Yu
|
23471f9aa3
|
Support v1/chat/completions (#50)
|
2024-01-18 23:43:09 -08:00 |
|
Cody Yu
|
61d4c93962
|
Support stream=True in v1/completions (#49)
|
2024-01-18 17:00:56 -08:00 |
|
Lianmin Zheng
|
bf51ddc6e5
|
Improve docs & Rename Gemini -> VertexAI (#19)
|
2024-01-17 02:54:41 -08:00 |
|
shiyi.c_98
|
fd7c479239
|
Gemini Backend (#9)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
|
2024-01-16 22:29:37 -08:00 |
|
Lianmin Zheng
|
4bd8233f2c
|
Fix test cases (#6)
|
2024-01-15 01:15:53 -08:00 |
|
Liangsheng Yin
|
331848de9d
|
Add SRT json decode example (#2)
|
2024-01-09 12:35:44 -08:00 |
|
Lianmin Zheng
|
22085081bb
|
release initial code
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>
Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
|
2024-01-08 04:37:50 +00:00 |
|