Commit Graph

518 Commits

Author SHA1 Message Date
yichuan~
ca600e8cd6 Add support for logprobs in OpenAI chat API (#852) 2024-08-01 00:08:21 -07:00
Kai Fronsdal
0c0c81372e Fix #857 (#858) 2024-08-01 00:05:39 -07:00
Ying Sheng
90286d8576 Add troubleshooting doc (#856) 2024-08-01 00:05:26 -07:00
Ying Sheng
5e7dd984fe Fix llama for classification (#855) 2024-07-31 15:48:31 -07:00
Yineng Zhang
bc3eaac2b8 chore: update flashinfer to v0.1.3 (#850) 2024-08-01 04:37:05 +10:00
Yineng Zhang
a78d98de19 misc: update e2e test paths config (#848) 2024-07-31 18:37:29 +10:00
Ikko Eltociear Ashimine
7d5ed7c6ee docs: update README.md (#843) 2024-07-31 12:48:18 +10:00
Liangsheng Yin
a6c7ebbbcb Add req slots leaking check (#842) 2024-07-30 18:29:01 -07:00
yichuan~
bb0501c0d9 Fix List input bug (#838) 2024-07-30 13:40:51 -07:00
Liangsheng Yin
6b0f2e9088 Add --max-total-tokens (#840) 2024-07-30 13:33:55 -07:00
Yineng Zhang
1edd4e07d6 chore: bump v0.2.7 (#830) 2024-07-30 20:41:10 +10:00
Yineng Zhang
62c673c46f docs: add set up runner (#829) 2024-07-30 19:43:40 +10:00
Yineng Zhang
377c5dc9a9 misc: enable e2e test when push (#828) 2024-07-30 19:26:23 +10:00
Yineng Zhang
f52eda35ea misc: update e2e test benchmark config (#825) 2024-07-30 19:19:23 +10:00
Ying Sheng
b579ecf028 Add awq_marlin (#826) 2024-07-30 02:04:51 -07:00
Ying Sheng
e7487b08bc Adjust default mem fraction to avoid OOM (#823) 2024-07-30 01:58:31 -07:00
Ying Sheng
ae5c0fc442 Support disable_ignore_eos in bench_serving.py (#824) 2024-07-30 01:42:07 -07:00
Yineng Zhang
a30d5d75bf feat: add pr e2e test (#822) 2024-07-30 18:31:26 +10:00
Yineng Zhang
17af39c5dc feat: add runner (#821) 2024-07-30 17:32:13 +10:00
ObjectNotFound
daf593a385 Fix streaming bug (#820) 2024-07-30 00:32:07 -07:00
Yineng Zhang
bece265f5a docs: update README (#819) 2024-07-30 16:17:50 +10:00
Liangsheng Yin
cdcbde5fc3 Code structure refactor (#807) 2024-07-29 23:04:48 -07:00
Enrique Shockwave
21e22b9e96 Fix LiteLLM kwargs (#817) 2024-07-29 22:38:02 -07:00
Yineng Zhang
a50c8a14b3 fix: use v0.2.5 for benchmark (#814) 2024-07-30 12:40:35 +10:00
Ying Sheng
db6089e6f3 Revert "Organize public APIs" (#815) 2024-07-29 19:40:28 -07:00
Liangsheng Yin
3520f75fb1 Remove inf value for chunked prefill size (#812) 2024-07-29 18:34:25 -07:00
Liangsheng Yin
c8e9fed87a Organize public APIs (#809) 2024-07-29 15:34:16 -07:00
yichuan~
084fa54d37 Add support for OpenAI API : offline batch(file) processing (#699) Co-authored-by: hnyls2002 <hnyls2002@gmail.com> 2024-07-29 13:07:18 -07:00
Ying Sheng
eba458bd19 Revert "Revert "fix: update flashinfer to 0.1.2 to fix sampling for cu118"" (#806) 2024-07-29 12:20:42 -07:00
Yineng Zhang
3d1cb0af83 feat: add chat template for internlm2-chat (#802) 2024-07-30 03:18:03 +08:00
Ying Sheng
7d352b4fdd Revert "fix: update flashinfer to 0.1.2 to fix sampling for cu118" (#805) 2024-07-29 11:39:12 -07:00
Yineng Zhang
87064015d9 fix: update flashinfer to 0.1.2 to fix sampling for cu118 (#803) 2024-07-29 11:00:52 -07:00
Liangsheng Yin
7cd4f244a4 Chunked prefill (#800) 2024-07-29 03:32:58 -07:00
Ying Sheng
98111fbe3e Revert "Chunked prefill support" (#799) 2024-07-29 02:38:31 -07:00
Liangsheng Yin
2ec39ab712 Chunked prefill support (#797) 2024-07-29 02:21:50 -07:00
ObjectNotFound
8f6274c82b Add role documentation, add system begin & end tokens (#793) 2024-07-28 23:02:49 -07:00
Ying Sheng
325a06c2de Fix logging (#796) 2024-07-28 23:01:45 -07:00
Ying Sheng
79f816292e Fix lazy import location (#795) 2024-07-28 22:09:50 -07:00
Eric Yoon
b688fd858d Lazy-import third-party backends (#794) 2024-07-28 21:57:41 -07:00
Ying Sheng
5bd899243b Update README.md (#792) 2024-07-28 21:57:23 -07:00
Ying Sheng
8d908a937c Fix echo + lobprob for OpenAI API when the prompt is a list (#791) 2024-07-28 17:09:16 -07:00
Yineng Zhang
dd7e8b9421 chore: add copyright for srt (#790) 2024-07-28 23:07:12 +10:00
Yineng Zhang
1f013d64eb docs: make badges center (#789) 2024-07-28 22:27:52 +10:00
Yineng Zhang
628e1fa760 docs: update README (#788) 2024-07-28 22:24:27 +10:00
Ying Sheng
c71880f896 Vectorize logprobs computation (#787) 2024-07-28 05:22:14 -07:00
Ying Sheng
bcb6611a46 Update README.md 2024-07-28 01:00:06 -07:00
Yineng Zhang
fa2aa0db0a docs: update index (#786) 2024-07-28 17:22:00 +10:00
Yineng Zhang
6a387a69cc fix: exclude logo png in gitignore (#785) 2024-07-28 17:08:16 +10:00
Yineng Zhang
27f5ce0a6c fix: init readthedocs support (#784) 2024-07-28 16:55:54 +10:00
Yineng Zhang
948625799e docs: init readthedocs support (#783) 2024-07-28 16:50:31 +10:00