Lianmin Zheng | 3c1f5a9220 | Fix duplicated imports in hf_transformers_utils.py (#1141) | 2024-08-17 18:03:00 -07:00
Lianmin Zheng | 57d0bd91ec | Improve benchmark (#1140) | 2024-08-17 17:43:23 -07:00
Lianmin Zheng | 41598e0d8e | Add longer accuracy test on CI (#1049) | 2024-08-12 09:21:38 +00:00
Lianmin Zheng | 8207637029 | Improve end-to-end throughput test and its coverage (#1039) | 2024-08-11 18:27:33 -07:00
Roger Wang | 05c50a82b8 | Minor bugfix on benchmark serving (#1005) | 2024-08-10 02:53:50 +10:00
Juwan Yoo | ab7875941b | feat: frequency, min_new_tokens, presence, and repetition penalties (#973) | 2024-08-08 04:21:08 -07:00
Ying Sheng | ae7ee01a8e | Add accuracy test to CI: MMLU (#882) | 2024-08-01 21:20:17 -07:00
Yineng Zhang | f52eda35ea | misc: update e2e test benchmark config (#825) | 2024-07-30 19:19:23 +10:00
Ying Sheng | ae5c0fc442 | Support disable_ignore_eos in bench_serving.py (#824) | 2024-07-30 01:42:07 -07:00
Ying Sheng | db6089e6f3 | Revert "Organize public APIs" (#815) | 2024-07-29 19:40:28 -07:00
Liangsheng Yin | c8e9fed87a | Organize public APIs (#809) | 2024-07-29 15:34:16 -07:00
Yineng Zhang | 768e05d08f | fix benchmark (#743); Co-authored-by: hnyls2002 <hnyls2002@gmail.com>; Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-07-26 21:26:13 +10:00
Ying Sheng | 30d8e130e7 | Improve benchmark scripts (#717) | 2024-07-24 14:44:14 -07:00
zhyncs | fa7ccb3316 | feat: add e2e latency (#704) | 2024-07-24 05:51:10 +10:00
zhyncs | 9fdea29d05 | misc: fix typo (#698) | 2024-07-23 02:00:27 +10:00
Ying Sheng | df7c4c19b4 | Fix trt benchmark (#697) | 2024-07-22 23:32:41 +10:00
zhyncs | d198791fe8 | misc: update output token logic (#695) | 2024-07-22 19:34:05 +10:00
zhyncs | c07526e46c | fix: update bench serving (#694) | 2024-07-22 18:23:33 +10:00
zhyncs | 65bd13386b | misc: recommend to use chat model for benchmark (#690) | 2024-07-22 00:13:33 +10:00
zhyncs | 6a846bb1fd | misc: update output file logic (#686) | 2024-07-21 18:07:30 +10:00
zhyncs | 0fdb3127a1 | feat: update bench serving (#685) | 2024-07-21 16:46:58 +10:00
Lianmin Zheng | 77e592e8e0 | support non-streaming benchmark (#682) | 2024-07-20 18:36:42 -07:00
zhyncs | 4b4a67f814 | feat: support TRT LLM benchmark and multiple benchmarks (#670) | 2024-07-20 11:05:35 -07:00
Lianmin Zheng | 9592a1f3bd | Fix random dataset (#671) | 2024-07-20 01:57:43 -07:00
Lianmin Zheng | 35759efa91 | Support random dataset in bench_serving.py (#669) | 2024-07-20 01:06:43 -07:00
Ying Sheng | 11c8efff73 | Add benchmark instructions (#663) | 2024-07-19 11:12:23 -07:00
Ying Sheng | 51fda1439f | Update Readme (#660); Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2024-07-19 09:54:01 -07:00