61 Commits

Author SHA1 Message Date
Aidan Cooper
94e0115186 Feat: add alternative choices selection methods (#835) 2024-08-05 03:27:49 -07:00
yichuan~
fd7926e46e Fix prompt len in parallel sampling (#928) 2024-08-05 00:56:08 -07:00
Ying Sheng
0a4f5f9bea Test regex in vision api (#926) 2024-08-04 22:52:41 -07:00
Ying Sheng
3bc99e6fe4 Test openai vision api (#925) 2024-08-05 13:51:55 +10:00
yichuan~
d53dcf9c98 Support more OpenAI API test (#916) 2024-08-04 16:43:09 -07:00
Liangsheng Yin
bb66cc4c52 Fix CI && python3.8 compatible (#920) 2024-08-04 16:02:05 -07:00
Ying Sheng
0d4f3a9fcd Make API Key OpenAI-compatible (#917) 2024-08-04 13:35:44 -07:00
Ying Sheng
995af5a54b Improve the structure of CI (#911) 2024-08-03 23:09:21 -07:00
Ying Sheng
70cc0749ce Add model accuracy test - step 1 (#866) 2024-08-03 18:20:50 -07:00
Yineng Zhang
2e218b9e04 fix: set env in runner (#891) 2024-08-02 20:48:56 +10:00
Ying Sheng
ae7ee01a8e Add accuracy test to CI: MMLU (#882) 2024-08-01 21:20:17 -07:00
Ying Sheng
60340a3643 Improve the coverage of the openai api server test (#878) 2024-08-01 16:01:30 -07:00
Ying Sheng
72b6ea88b4 Make scripts under /test/srt as unit tests (#875) 2024-08-01 14:34:55 -07:00
Ying Sheng
6f221d4ca0 Fix unit tests for the frontend language part (#872) 2024-08-01 12:39:12 -07:00
Ying Sheng
4075677621 Add OpenAI backend to the CI test (#869) 2024-08-01 09:25:24 -07:00
Lianmin Zheng
30db99b3d9 Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776) 2024-07-27 19:50:34 -07:00
Ying Sheng
51fda1439f Update Readme (#660)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
2024-07-19 09:54:01 -07:00
Ying Sheng
dc1b8bcfaa Format (#593) 2024-07-05 10:06:17 -07:00
Lianmin Zheng
eb1ae6ae0c Add sglang.bench_latency for offline benchmark (#564) 2024-06-25 03:38:04 -07:00
Liangsheng Yin
05471f2103 Update test_flashinfer (#560) 2024-06-24 15:23:57 +08:00
Lianmin Zheng
1fa15099d8 Add LlamaForClassification (#559) 2024-06-22 00:49:31 -07:00
Ying Sheng
fb9296f0ed Higher priority for user input of max_prefill_tokens & format (#540) 2024-06-12 21:48:40 -07:00
胡译文
87260b7bfd Litellm Backend (#502) 2024-06-07 12:24:28 -07:00
Ying Sheng
0463f7fb52 Support data parallelism (static) (#480)
Co-authored-by: Ying Sheng <ying.sheng@databricks.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2024-05-27 21:24:10 -07:00
Ying Sheng
3e684be7a3 Fix openai speculative execution (#456) 2024-05-20 17:01:13 -07:00
Liangsheng Yin
690d162d97 Format code (#441) 2024-05-14 22:40:46 +08:00
Lianmin Zheng
5dc55a5f02 Handle truncation errors (#436) 2024-05-13 15:56:00 -07:00
Lianmin Zheng
6e09cf6a15 Misc fixes (#432) 2024-05-12 15:05:40 -07:00
Lianmin Zheng
aee4f523cf Fix logit processor bugs (#427) 2024-05-12 04:54:07 -07:00
Lianmin Zheng
7023f413c6 Clean up (#422) 2024-05-11 20:55:00 -07:00
Liangsheng Yin
19818b9c2f Minor: style improvement of radix_cache and memory_pool (#395) 2024-04-26 01:01:36 +08:00
Liangsheng Yin
150d7020ed Revert removing the unused imports (#385) 2024-04-23 22:36:33 +08:00
Liangsheng Yin
9acc6e3504 add .isort.cfg (#378) 2024-04-22 22:38:09 +08:00
Fronx
2b6d999191 Fix issue #367 – System message not supported for Anthropic (anthropic.BadRequestError) (#368)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-04-16 11:18:24 -07:00
Lianmin Zheng
65501a9cf1 Fix commandr import; format code 2024-04-16 18:10:12 +00:00
Liangsheng Yin
62b3812b69 Time cost utils (#355) 2024-04-09 23:27:31 +08:00
Liangsheng Yin
3842eba5fa Logprobs Refractor (#331) 2024-03-28 14:34:49 +08:00
Jani Monoses
e57f079275 Use Anthropic messages API (#304) 2024-03-22 13:23:31 -07:00
Lianmin Zheng
4aa5dd2c5f Update version to v0.1.13 (#280) 2024-03-11 05:49:27 -07:00
Liangsheng Yin
1b35547927 Organize server_args (#277) 2024-03-11 20:06:52 +08:00
Lianmin Zheng
faba293a0d Improve gemma and documentations (#278) 2024-03-11 04:43:39 -07:00
Lianmin Zheng
c51020cf0c Fix the chat template for llava-v1.6-34b & format code (#177) 2024-02-11 05:50:13 -08:00
Cody Yu
50afed4eaa Support extra field regex in OpenAI API (#172) 2024-02-10 17:21:33 -08:00
Liangsheng Yin
37b42297f8 import outlines (#168) 2024-02-09 10:13:02 +08:00
Lianmin Zheng
23f05005fd Format code & move functions (#155) 2024-02-06 13:27:46 -08:00
Cody Yu
a7334aeea1 Support decode token logprobs (#130) 2024-02-06 12:24:55 -08:00
Liangsheng Yin
26f0bedc8f jump-forward rename (#144) 2024-02-05 16:50:37 +08:00
Keith Stevens
1d0fbe8e43 [Feature] Adds basic support for image content in OpenAI chat routes (#113) 2024-01-30 06:12:33 -08:00
Lianmin Zheng
97aa9b3284 Improve docs & Add JSON decode example (#121) 2024-01-30 05:45:27 -08:00
Lianmin Zheng
6f560c761b Improve the control of streaming and improve the first token latency in streaming (#117) 2024-01-29 17:05:42 -08:00