Ying Sheng
|
fb9296f0ed
|
Higher priority for user input of max_prefill_tokens & format (#540)
|
2024-06-12 21:48:40 -07:00 |
|
Lianmin Zheng
|
f6dbd24043
|
Improve doc strings (#518)
|
2024-06-08 02:39:32 -07:00 |
|
Ying Sheng
|
0463f7fb52
|
Support data parallelism (static) (#480)
Co-authored-by: Ying Sheng <ying.sheng@databricks.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2024-05-27 21:24:10 -07:00 |
|
Lianmin Zheng
|
2cea6146d8
|
Improve logging & add logit cap (#471)
|
2024-05-24 03:48:53 -07:00 |
|
Liangsheng Yin
|
9acc6e3504
|
add .isort.cfg (#378)
|
2024-04-22 22:38:09 +08:00 |
|
Lianmin Zheng
|
4aa5dd2c5f
|
Update version to v0.1.13 (#280)
|
2024-03-11 05:49:27 -07:00 |
|
Liangsheng Yin
|
1b35547927
|
Organize server_args (#277)
|
2024-03-11 20:06:52 +08:00 |
|
Liangsheng Yin
|
89885b31ef
|
Gemma Support (#256)
|
2024-03-11 12:14:27 +08:00 |
|
Cody Yu
|
26c3494152
|
[Submodule] Change FlashInfer to import (#156)
|
2024-02-06 19:28:29 -08:00 |
|
Lianmin Zheng
|
23f05005fd
|
Format code & move functions (#155)
|
2024-02-06 13:27:46 -08:00 |
|
Arcmoon
|
3ae78a09b3
|
Add gptq quantization model support (#141)
|
2024-02-06 11:35:04 -08:00 |
|
Arcmoon
|
63e97e5e4c
|
Suppport qwen model and solve some problems (#75)
|
2024-01-22 20:14:51 -08:00 |
|
Lianmin Zheng
|
22085081bb
|
release initial code
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>
Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
|
2024-01-08 04:37:50 +00:00 |
|