yichuan~ | ca600e8cd6 | Add support for logprobs in OpenAI chat API (#852) | 2024-08-01 00:08:21 -07:00 |
| Kai Fronsdal | 0c0c81372e | Fix #857 (#858) | 2024-08-01 00:05:39 -07:00 |
| Ying Sheng | 90286d8576 | Add troubleshooting doc (#856) | 2024-08-01 00:05:26 -07:00 |
| Ying Sheng | 5e7dd984fe | Fix llama for classification (#855) | 2024-07-31 15:48:31 -07:00 |
| Yineng Zhang | bc3eaac2b8 | chore: update flashinfer to v0.1.3 (#850) | 2024-08-01 04:37:05 +10:00 |
| Yineng Zhang | a78d98de19 | misc: update e2e test paths config (#848) | 2024-07-31 18:37:29 +10:00 |
| Ikko Eltociear Ashimine | 7d5ed7c6ee | docs: update README.md (#843) | 2024-07-31 12:48:18 +10:00 |
| Liangsheng Yin | a6c7ebbbcb | Add req slots leaking check (#842) | 2024-07-30 18:29:01 -07:00 |
| yichuan~ | bb0501c0d9 | Fix List input bug (#838) | 2024-07-30 13:40:51 -07:00 |
| Liangsheng Yin | 6b0f2e9088 | Add --max-total-tokens (#840) | 2024-07-30 13:33:55 -07:00 |
| Yineng Zhang | 1edd4e07d6 | chore: bump v0.2.7 (#830) | 2024-07-30 20:41:10 +10:00 |
| Yineng Zhang | 62c673c46f | docs: add set up runner (#829) | 2024-07-30 19:43:40 +10:00 |
| Yineng Zhang | 377c5dc9a9 | misc: enable e2e test when push (#828) | 2024-07-30 19:26:23 +10:00 |
| Yineng Zhang | f52eda35ea | misc: update e2e test benchmark config (#825) | 2024-07-30 19:19:23 +10:00 |
| Ying Sheng | b579ecf028 | Add awq_marlin (#826) | 2024-07-30 02:04:51 -07:00 |
| Ying Sheng | e7487b08bc | Adjust default mem fraction to avoid OOM (#823) | 2024-07-30 01:58:31 -07:00 |
| Ying Sheng | ae5c0fc442 | Support disable_ignore_eos in bench_serving.py (#824) | 2024-07-30 01:42:07 -07:00 |
| Yineng Zhang | a30d5d75bf | feat: add pr e2e test (#822) | 2024-07-30 18:31:26 +10:00 |
| Yineng Zhang | 17af39c5dc | feat: add runner (#821) | 2024-07-30 17:32:13 +10:00 |
| ObjectNotFound | daf593a385 | Fix streaming bug (#820) | 2024-07-30 00:32:07 -07:00 |
| Yineng Zhang | bece265f5a | docs: update README (#819) | 2024-07-30 16:17:50 +10:00 |
| Liangsheng Yin | cdcbde5fc3 | Code structure refactor (#807) | 2024-07-29 23:04:48 -07:00 |
| Enrique Shockwave | 21e22b9e96 | Fix LiteLLM kwargs (#817) | 2024-07-29 22:38:02 -07:00 |
| Yineng Zhang | a50c8a14b3 | fix: use v0.2.5 for benchmark (#814) | 2024-07-30 12:40:35 +10:00 |
| Ying Sheng | db6089e6f3 | Revert "Organize public APIs" (#815) | 2024-07-29 19:40:28 -07:00 |
| Liangsheng Yin | 3520f75fb1 | Remove inf value for chunked prefill size (#812) | 2024-07-29 18:34:25 -07:00 |
| Liangsheng Yin | c8e9fed87a | Organize public APIs (#809) | 2024-07-29 15:34:16 -07:00 |
| yichuan~ | 084fa54d37 | Add support for OpenAI API : offline batch(file) processing (#699); Co-authored-by: hnyls2002 <hnyls2002@gmail.com> | 2024-07-29 13:07:18 -07:00 |
| Ying Sheng | eba458bd19 | Revert "Revert "fix: update flashinfer to 0.1.2 to fix sampling for cu118"" (#806) | 2024-07-29 12:20:42 -07:00 |
| Yineng Zhang | 3d1cb0af83 | feat: add chat template for internlm2-chat (#802) | 2024-07-30 03:18:03 +08:00 |
| Ying Sheng | 7d352b4fdd | Revert "fix: update flashinfer to 0.1.2 to fix sampling for cu118" (#805) | 2024-07-29 11:39:12 -07:00 |
| Yineng Zhang | 87064015d9 | fix: update flashinfer to 0.1.2 to fix sampling for cu118 (#803) | 2024-07-29 11:00:52 -07:00 |
| Liangsheng Yin | 7cd4f244a4 | Chunked prefill (#800) | 2024-07-29 03:32:58 -07:00 |
| Ying Sheng | 98111fbe3e | Revert "Chunked prefill support" (#799) | 2024-07-29 02:38:31 -07:00 |
| Liangsheng Yin | 2ec39ab712 | Chunked prefill support (#797) | 2024-07-29 02:21:50 -07:00 |
| ObjectNotFound | 8f6274c82b | Add role documentation, add system begin & end tokens (#793) | 2024-07-28 23:02:49 -07:00 |
| Ying Sheng | 325a06c2de | Fix logging (#796) | 2024-07-28 23:01:45 -07:00 |
| Ying Sheng | 79f816292e | Fix lazy import location (#795) | 2024-07-28 22:09:50 -07:00 |
| Eric Yoon | b688fd858d | Lazy-import third-party backends (#794) | 2024-07-28 21:57:41 -07:00 |
| Ying Sheng | 5bd899243b | Update README.md (#792) | 2024-07-28 21:57:23 -07:00 |
| Ying Sheng | 8d908a937c | Fix echo + lobprob for OpenAI API when the prompt is a list (#791) | 2024-07-28 17:09:16 -07:00 |
| Yineng Zhang | dd7e8b9421 | chore: add copyright for srt (#790) | 2024-07-28 23:07:12 +10:00 |
| Yineng Zhang | 1f013d64eb | docs: make badges center (#789) | 2024-07-28 22:27:52 +10:00 |
| Yineng Zhang | 628e1fa760 | docs: update README (#788) | 2024-07-28 22:24:27 +10:00 |
| Ying Sheng | c71880f896 | Vectorize logprobs computation (#787) | 2024-07-28 05:22:14 -07:00 |
| Ying Sheng | bcb6611a46 | Update README.md | 2024-07-28 01:00:06 -07:00 |
| Yineng Zhang | fa2aa0db0a | docs: update index (#786) | 2024-07-28 17:22:00 +10:00 |
| Yineng Zhang | 6a387a69cc | fix: exclude logo png in gitignore (#785) | 2024-07-28 17:08:16 +10:00 |
| Yineng Zhang | 27f5ce0a6c | fix: init readthedocs support (#784) | 2024-07-28 16:55:54 +10:00 |
| Yineng Zhang | 948625799e | docs: init readthedocs support (#783) | 2024-07-28 16:50:31 +10:00 |
|