Lianmin Zheng | f6af3a6561 | Cleanup readme, llava examples, usage examples and nccl init (#1194) | 2024-08-24 08:02:23 -07:00
intervitens | 068e9eae55 | Support min-p sampling (#1167) | 2024-08-21 22:49:32 +00:00
Lianmin Zheng | 57d0bd91ec | Improve benchmark (#1140) | 2024-08-17 17:43:23 -07:00
Lianmin Zheng | 87a0db82b8 | update hyperparameter guide (#1114) | 2024-08-15 10:54:24 -07:00
Lianmin Zheng | ad3e4f1619 | Update the mixtral to use the better FusedMoE layer (#1081) | 2024-08-13 15:44:25 -07:00
Yineng Zhang | 89f23a5178 | docs: update setup github runner (#1050) | 2024-08-12 18:11:38 +10:00
Lianmin Zheng | 54fb1c80c0 | Clean up unit tests (#1020) | 2024-08-10 15:09:03 -07:00
Juwan Yoo | ab7875941b | feat: frequency, min_new_tokens, presence, and repetition penalties (#973) | 2024-08-08 04:21:08 -07:00
Ying Sheng | 228cf47547 | Create contributor_guide.md (#992) | 2024-08-08 03:58:47 -07:00
foszto | c62d560c03 | #590 Increase default , track changes in examples and documentation (#971) Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-08-08 00:54:46 +00:00
Aidan Cooper | 94e0115186 | Feat: add alternative choices selection methods (#835) | 2024-08-05 03:27:49 -07:00
Ying Sheng | 975adb802b | Update hyperparameter_tuning.md (#918) | 2024-08-04 13:51:52 -07:00
Ying Sheng | 995af5a54b | Improve the structure of CI (#911) | 2024-08-03 23:09:21 -07:00
min-xu-et | 9319cd139c | [minor] fixed code formatting doc (#896) | 2024-08-03 02:39:28 +10:00
Yineng Zhang | 7937a886b2 | docs: update setup runner (#884) | 2024-08-02 21:03:53 +10:00
Liangsheng Yin | 12ce3befb6 | Update runner docs (#879) | 2024-08-01 17:37:47 -07:00
Liangsheng Yin | 70c78cfb03 | Update runner docs (#876) | 2024-08-01 15:32:33 -07:00
Ying Sheng | 4075677621 | Add OpenAI backend to the CI test (#869) | 2024-08-01 09:25:24 -07:00
Ying Sheng | 90286d8576 | Add troubleshooting doc (#856) | 2024-08-01 00:05:26 -07:00
Yineng Zhang | 62c673c46f | docs: add set up runner (#829) | 2024-07-30 19:43:40 +10:00
Liangsheng Yin | cdcbde5fc3 | Code structure refactor (#807) | 2024-07-29 23:04:48 -07:00
Ying Sheng | db6089e6f3 | Revert "Organize public APIs" (#815) | 2024-07-29 19:40:28 -07:00
Liangsheng Yin | c8e9fed87a | Organize public APIs (#809) | 2024-07-29 15:34:16 -07:00
Ying Sheng | 325a06c2de | Fix logging (#796) | 2024-07-28 23:01:45 -07:00
Yineng Zhang | fa2aa0db0a | docs: update index (#786) | 2024-07-28 17:22:00 +10:00
Yineng Zhang | 6a387a69cc | fix: exclude logo png in gitignore (#785) | 2024-07-28 17:08:16 +10:00
Yineng Zhang | 27f5ce0a6c | fix: init readthedocs support (#784) | 2024-07-28 16:55:54 +10:00
Yineng Zhang | 948625799e | docs: init readthedocs support (#783) | 2024-07-28 16:50:31 +10:00
Lianmin Zheng | 30db99b3d9 | Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776) | 2024-07-27 19:50:34 -07:00
Yineng Zhang | c3c74bf874 | docs: update model support (#760) | 2024-07-27 14:07:37 +10:00
Liangsheng Yin | f424e76d96 | Fix illegal tokens during sampling (#676) | 2024-07-20 03:11:15 -07:00
Ying Sheng | 06487f126e | refactor model loader: initial refactor (#664) | 2024-07-20 02:18:22 -07:00
Ying Sheng | 50a53887be | Update docs | 2024-07-19 11:40:06 -07:00
Ying Sheng | e87c7fd501 | Improve docs (#662) | 2024-07-19 10:58:03 -07:00
Ying Sheng | 51fda1439f | Update Readme (#660) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2024-07-19 09:54:01 -07:00
zhyncs | a8552cb18b | feat: support internlm2 (#636) | 2024-07-16 22:40:03 -07:00
Lianmin Zheng | ce62dc73f0 | Update model_support.md | 2024-07-09 01:32:46 -07:00
Ying Sheng | 5a57b8addd | Add Gemma2 (#592) | 2024-07-05 09:48:54 -07:00
Lianmin Zheng | 63fbef9876 | fix flashinfer & http log level | 2024-07-03 23:19:33 -07:00
Ying Sheng | 2a754e57b0 | 2x performance improvement for large prefill & Fix workspace conflicts (#579) | 2024-07-03 16:14:57 -07:00
Ying Sheng | 9380f50ff9 | Turn on flashinfer by default (#578) | 2024-07-02 02:25:07 -07:00
Ying Sheng | 75b31a2a88 | Update run_batch interface and max_prefill_tokens (#574) | 2024-06-30 18:26:04 -07:00
Lianmin Zheng | c0ae70c8ed | Improve logging & fix litellm dependency. (#512) | 2024-06-07 13:10:32 -07:00
Lianmin Zheng | 9f009261f2 | Improve docs | 2024-06-01 17:46:08 -05:00
Lianmin Zheng | 159cc741e4 | Make the server random by default (#493) | 2024-05-31 23:33:34 -07:00
Lianmin Zheng | adc974268a | Update docs (#486) | 2024-05-27 22:51:05 -07:00
Shannon Shen | 04c0b21488 | Allow input_ids in the input of the /generate endpoint (#363) | 2024-05-12 15:29:00 -07:00
Lianmin Zheng | 6e09cf6a15 | Misc fixes (#432) | 2024-05-12 15:05:40 -07:00
Liangsheng Yin | 3842eba5fa | Logprobs Refractor (#331) | 2024-03-28 14:34:49 +08:00
Lianmin Zheng | 4aa5dd2c5f | Update version to v0.1.13 (#280) | 2024-03-11 05:49:27 -07:00