| Author | Commit | Message | Date |
|---|---|---|---|
| Lianmin Zheng | fb2d0680e0 | [Fix] Fix clean_up_tokenization_spaces in tokenizer (#1510) | 2024-09-24 21:37:33 -07:00 |
| Lianmin Zheng | 2854a5ea9f | Fix the overhead due to penalizer in bench_latency (#1496) | 2024-09-23 07:38:14 -07:00 |
| Lianmin Zheng | 167591e864 | Better unit tests for adding a new model (#1488) | 2024-09-22 01:50:37 -07:00 |
| Lianmin Zheng | 2cd7e181dd | Fix env vars in bench_latency (#1472) | 2024-09-19 03:19:26 -07:00 |
| Ying Sheng | 37963394aa | [Feature] Support LoRA path renaming and add LoRA serving benchmarks (#1433) | 2024-09-15 12:46:04 -07:00 |
| Ying Sheng | 712216928f | [Feature] Initial support for multi-LoRA serving (#1307) | 2024-09-12 16:46:14 -07:00 |
| Lianmin Zheng | 3a6e8b6d78 | [Minor] move triton attention kernels into a separate folder (#1379) | 2024-09-10 15:15:08 -07:00 |
| Lianmin Zheng | f64eae3a29 | [Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308) | 2024-09-02 21:44:45 -07:00 |
| Lianmin Zheng | f6af3a6561 | Cleanup readme, llava examples, usage examples and nccl init (#1194) | 2024-08-24 08:02:23 -07:00 |
| Ying Sheng | 0909bb0d2f | [Feat] Add window attention for gemma-2 (#1056) | 2024-08-13 17:01:26 -07:00 |
| foszto | c62d560c03 | #590 Increase default , track changes in examples and documentation (#971) Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-08-08 00:54:46 +00:00 |
| Ying Sheng | 3bc99e6fe4 | Test openai vision api (#925) | 2024-08-05 13:51:55 +10:00 |
| Ying Sheng | 995af5a54b | Improve the structure of CI (#911) | 2024-08-03 23:09:21 -07:00 |
| Ying Sheng | 4075677621 | Add OpenAI backend to the CI test (#869) | 2024-08-01 09:25:24 -07:00 |
| zhyncs | 2e341cd493 | misc: add pre-commit config (#637) | 2024-07-17 11:55:39 -07:00 |
| Liangsheng Yin | 95c4e0dfac | Format Benchmark Code (#399) | 2024-04-28 21:06:22 +08:00 |
| Christopher Chou | 864425300f | Yi-VL Model (#112) | 2024-02-01 08:33:22 -08:00 |
| Lianmin Zheng | 70359bf31a | Update benchmark scripts (#8) | 2024-01-15 16:12:57 -08:00 |
| Lianmin Zheng | 30720e732c | Add install with pip (#3) | 2024-01-09 12:43:40 -08:00 |