27 Commits

Author SHA1 Message Date
Lianmin Zheng
3c1f5a9220 Fix duplicated imports in hf_transformers_utils.py (#1141) 2024-08-17 18:03:00 -07:00
Lianmin Zheng
57d0bd91ec Improve benchmark (#1140) 2024-08-17 17:43:23 -07:00
Lianmin Zheng
41598e0d8e Add longer accuracy test on CI (#1049) 2024-08-12 09:21:38 +00:00
Lianmin Zheng
8207637029 Improve end-to-end throughput test and its coverage (#1039) 2024-08-11 18:27:33 -07:00
Roger Wang
05c50a82b8 Minor bugfix on benchmark serving (#1005) 2024-08-10 02:53:50 +10:00
Juwan Yoo
ab7875941b feat: frequency, min_new_tokens, presence, and repetition penalties (#973) 2024-08-08 04:21:08 -07:00
Ying Sheng
ae7ee01a8e Add accuracy test to CI: MMLU (#882) 2024-08-01 21:20:17 -07:00
Yineng Zhang
f52eda35ea misc: update e2e test benchmark config (#825) 2024-07-30 19:19:23 +10:00
Ying Sheng
ae5c0fc442 Support disable_ignore_eos in bench_serving.py (#824) 2024-07-30 01:42:07 -07:00
Ying Sheng
db6089e6f3 Revert "Organize public APIs" (#815) 2024-07-29 19:40:28 -07:00
Liangsheng Yin
c8e9fed87a Organize public APIs (#809) 2024-07-29 15:34:16 -07:00
Yineng Zhang
768e05d08f fix benchmark (#743) 2024-07-26 21:26:13 +10:00
    Co-authored-by: hnyls2002 <hnyls2002@gmail.com>
    Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Ying Sheng
30d8e130e7 Improve benchmark scripts (#717) 2024-07-24 14:44:14 -07:00
zhyncs
fa7ccb3316 feat: add e2e latency (#704) 2024-07-24 05:51:10 +10:00
zhyncs
9fdea29d05 misc: fix typo (#698) 2024-07-23 02:00:27 +10:00
Ying Sheng
df7c4c19b4 Fix trt benchmark (#697) 2024-07-22 23:32:41 +10:00
zhyncs
d198791fe8 misc: update output token logic (#695) 2024-07-22 19:34:05 +10:00
zhyncs
c07526e46c fix: update bench serving (#694) 2024-07-22 18:23:33 +10:00
zhyncs
65bd13386b misc: recommend to use chat model for benchmark (#690) 2024-07-22 00:13:33 +10:00
zhyncs
6a846bb1fd misc: update output file logic (#686) 2024-07-21 18:07:30 +10:00
zhyncs
0fdb3127a1 feat: update bench serving (#685) 2024-07-21 16:46:58 +10:00
Lianmin Zheng
77e592e8e0 support non-streaming benchmark (#682) 2024-07-20 18:36:42 -07:00
zhyncs
4b4a67f814 feat: support TRT LLM benchmark and multiple benchmarks (#670) 2024-07-20 11:05:35 -07:00
Lianmin Zheng
9592a1f3bd Fix random dataset (#671) 2024-07-20 01:57:43 -07:00
Lianmin Zheng
35759efa91 Support random dataset in bench_serving.py (#669) 2024-07-20 01:06:43 -07:00
Ying Sheng
11c8efff73 Add benchmark instructions (#663) 2024-07-19 11:12:23 -07:00
Ying Sheng
51fda1439f Update Readme (#660) 2024-07-19 09:54:01 -07:00
    Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>