Commit Graph

541 Commits

Author SHA1 Message Date
Ying Sheng
30a9b2ef20 Bump version to v0.2.9 (#890) 2024-08-02 01:45:48 -07:00
Ying Sheng
3cadecf0c4 Increase openai client limit (#886) 2024-08-02 00:47:23 -07:00
Ying Sheng
e90e3a50d4 Add benchmark: HumanEval (#889) 2024-08-02 00:46:41 -07:00
Ying Sheng
fbd6b94d69 Fix the double BOS problem in the HF chat template (#888) 2024-08-02 00:30:50 -07:00
Ying Sheng
4c8093c8db Update workflow name (#883) 2024-08-01 21:29:46 -07:00
Ying Sheng
ae7ee01a8e Add accuracy test to CI: MMLU (#882) 2024-08-01 21:20:17 -07:00
Ying Sheng
76e59088d8 Add more unit tests to CI (#880) 2024-08-01 18:14:33 -07:00
Liangsheng Yin
12ce3befb6 Update runner docs (#879) 2024-08-01 17:37:47 -07:00
任嘉
4013a4e1b0 Implement served_model_name to customize model id when use local mode… (#749) 2024-08-01 17:13:51 -07:00
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Ying Sheng
60340a3643 Improve the coverage of the openai api server test (#878) 2024-08-01 16:01:30 -07:00
Liangsheng Yin
70c78cfb03 Update runner docs (#876) 2024-08-01 15:32:33 -07:00
Ying Sheng
72b6ea88b4 Make scripts under /test/srt as unit tests (#875) 2024-08-01 14:34:55 -07:00
Ying Sheng
e4d3333c6c bump to 0.2.8 (#877) 2024-08-01 14:18:26 -07:00
Ying Sheng
6f221d4ca0 Fix unit tests for the frontend language part (#872) 2024-08-01 12:39:12 -07:00
Yineng Zhang
aba6f51f88 misc: update unit test config (#873) 2024-08-02 05:27:05 +10:00
Yineng Zhang
7f6c690b67 misc: use pip cache purge and add unit test ci (#871) 2024-08-02 05:12:20 +10:00
Ying Sheng
40e6f5131a Fix openai CI tests (#870) 2024-08-01 09:39:09 -07:00
Ying Sheng
4075677621 Add OpenAI backend to the CI test (#869) 2024-08-01 09:25:24 -07:00
Yineng Zhang
9e8d2c7f74 misc: add cancel previous at e2e (#864) 2024-08-01 18:26:54 +10:00
Yineng Zhang
c9bff5fcc8 misc: disable auto release (#862) 2024-08-01 17:46:51 +10:00
Ying Sheng
b04444ac01 Rename github workflows (#861) 2024-08-01 00:39:55 -07:00
Yineng Zhang
3d617a21ba misc: update e2e test paths config (#860) 2024-08-01 17:38:24 +10:00
Liangsheng Yin
c020f9ceda Support chunked prefill when radix cache is disabled (#811) 2024-08-01 00:29:01 -07:00
yichuan~
ca600e8cd6 Add support for logprobs in OpenAI chat API (#852) 2024-08-01 00:08:21 -07:00
Kai Fronsdal
0c0c81372e Fix #857 (#858) 2024-08-01 00:05:39 -07:00
Ying Sheng
90286d8576 Add troubleshooting doc (#856) 2024-08-01 00:05:26 -07:00
Ying Sheng
5e7dd984fe Fix llama for classification (#855) 2024-07-31 15:48:31 -07:00
Yineng Zhang
bc3eaac2b8 chore: update flashinfer to v0.1.3 (#850) 2024-08-01 04:37:05 +10:00
Yineng Zhang
a78d98de19 misc: update e2e test paths config (#848) 2024-07-31 18:37:29 +10:00
Ikko Eltociear Ashimine
7d5ed7c6ee docs: update README.md (#843) 2024-07-31 12:48:18 +10:00
Liangsheng Yin
a6c7ebbbcb Add req slots leaking check (#842) 2024-07-30 18:29:01 -07:00
yichuan~
bb0501c0d9 Fix List input bug (#838) 2024-07-30 13:40:51 -07:00
Liangsheng Yin
6b0f2e9088 Add --max-total-tokens (#840) 2024-07-30 13:33:55 -07:00
Yineng Zhang
1edd4e07d6 chore: bump v0.2.7 (#830) 2024-07-30 20:41:10 +10:00
Yineng Zhang
62c673c46f docs: add set up runner (#829) 2024-07-30 19:43:40 +10:00
Yineng Zhang
377c5dc9a9 misc: enable e2e test when push (#828) 2024-07-30 19:26:23 +10:00
Yineng Zhang
f52eda35ea misc: update e2e test benchmark config (#825) 2024-07-30 19:19:23 +10:00
Ying Sheng
b579ecf028 Add awq_marlin (#826) 2024-07-30 02:04:51 -07:00
Ying Sheng
e7487b08bc Adjust default mem fraction to avoid OOM (#823) 2024-07-30 01:58:31 -07:00
Ying Sheng
ae5c0fc442 Support disable_ignore_eos in bench_serving.py (#824) 2024-07-30 01:42:07 -07:00
Yineng Zhang
a30d5d75bf feat: add pr e2e test (#822) 2024-07-30 18:31:26 +10:00
Yineng Zhang
17af39c5dc feat: add runner (#821) 2024-07-30 17:32:13 +10:00
ObjectNotFound
daf593a385 Fix streaming bug (#820) 2024-07-30 00:32:07 -07:00
Yineng Zhang
bece265f5a docs: update README (#819) 2024-07-30 16:17:50 +10:00
Liangsheng Yin
cdcbde5fc3 Code structure refactor (#807) 2024-07-29 23:04:48 -07:00
Enrique Shockwave
21e22b9e96 Fix LiteLLM kwargs (#817) 2024-07-29 22:38:02 -07:00
Yineng Zhang
a50c8a14b3 fix: use v0.2.5 for benchmark (#814) 2024-07-30 12:40:35 +10:00
Ying Sheng
db6089e6f3 Revert "Organize public APIs" (#815) 2024-07-29 19:40:28 -07:00
Liangsheng Yin
3520f75fb1 Remove inf value for chunked prefill size (#812) 2024-07-29 18:34:25 -07:00
Liangsheng Yin
c8e9fed87a Organize public APIs (#809) 2024-07-29 15:34:16 -07:00