79 Commits

Author SHA1 Message Date
Lianmin Zheng
048685430d Improve process creation (#1534) 2024-09-29 02:36:12 -07:00
Lianmin Zheng
4e4459b91f Multiple minor fixes (#1530) 2024-09-28 14:43:35 -07:00
Ying Sheng
6f3cf1297e [CI, AMD] Add AMD tests to CI (#1491) 2024-09-22 04:45:10 -07:00
Lianmin Zheng
167591e864 Better unit tests for adding a new model (#1488) 2024-09-22 01:50:37 -07:00
Yineng Zhang
441c22db8c doc: update backend (#1486) 2024-09-21 22:05:12 +08:00
Ran Chen
ce636ac441 fix incorrect links in documentation (#1481)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2024-09-21 20:36:23 +08:00
Lianmin Zheng
7f24ea95c3 Fuse top_k and top_k in the sampler (#1457) 2024-09-18 04:35:35 -07:00
Ke Bao
c6b6d2e71b Enable MLA by default (#1447) 2024-09-17 11:42:48 +00:00
Lianmin Zheng
e79f6cd73d Release v0.3.1 (#1430) 2024-09-15 23:03:16 +09:00
Lianmin Zheng
9ba1f09760 [Fix] Fix logprob and normalized_logprob (#1428) 2024-09-15 06:36:06 -07:00
Lianmin Zheng
282681b8a1 Update backend.md (#1429) 2024-09-15 02:55:34 -07:00
Lianmin Zheng
46094e0c1b Deprecate --disable-flashinfer and introduce --attention-backend (#1380) 2024-09-10 17:11:16 -07:00
Lianmin Zheng
8d1095dbf0 [Docs] Improve documentations (#1368) 2024-09-09 20:48:28 -07:00
Chayenne
743007e1ce Adding Documentation for installation (#1300)
Co-authored-by: zhaochen20 <zhaochenyang20@gmail.com>
2024-09-09 19:09:13 -07:00
Lianmin Zheng
f64eae3a29 [Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308) 2024-09-02 21:44:45 -07:00
havetc
9935f97b3e [FEAT] JSON constrained support (#1125)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2024-08-26 09:37:26 -07:00
Lianmin Zheng
d3efcb3930 Update workflow files (#1214) 2024-08-25 17:45:35 -07:00
Lianmin Zheng
61bb223e0f Update CI runner docs (#1213) 2024-08-25 17:31:52 -07:00
Lianmin Zheng
902278008a [Minor] Improve the function organization in TokenizerManager & improve loggers (#1208) 2024-08-25 14:46:34 -07:00
Lianmin Zheng
f6af3a6561 Cleanup readme, llava examples, usage examples and nccl init (#1194) 2024-08-24 08:02:23 -07:00
intervitens
068e9eae55 Support min-p sampling (#1167) 2024-08-21 22:49:32 +00:00
Lianmin Zheng
57d0bd91ec Improve benchmark (#1140) 2024-08-17 17:43:23 -07:00
Lianmin Zheng
87a0db82b8 update hyperparameter guide (#1114) 2024-08-15 10:54:24 -07:00
Lianmin Zheng
ad3e4f1619 Update the mixtral to use the better FusedMoE layer (#1081) 2024-08-13 15:44:25 -07:00
Yineng Zhang
89f23a5178 docs: update setup github runner (#1050) 2024-08-12 18:11:38 +10:00
Lianmin Zheng
54fb1c80c0 Clean up unit tests (#1020) 2024-08-10 15:09:03 -07:00
Juwan Yoo
ab7875941b feat: frequency, min_new_tokens, presence, and repetition penalties (#973) 2024-08-08 04:21:08 -07:00
Ying Sheng
228cf47547 Create contributor_guide.md (#992) 2024-08-08 03:58:47 -07:00
foszto
c62d560c03 #590 Increase default , track changes in examples and documentation (#971)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-08-08 00:54:46 +00:00
Aidan Cooper
94e0115186 Feat: add alternative choices selection methods (#835) 2024-08-05 03:27:49 -07:00
Ying Sheng
975adb802b Update hyperparameter_tuning.md (#918) 2024-08-04 13:51:52 -07:00
Ying Sheng
995af5a54b Improve the structure of CI (#911) 2024-08-03 23:09:21 -07:00
min-xu-et
9319cd139c [minor] fixed code formatting doc (#896) 2024-08-03 02:39:28 +10:00
Yineng Zhang
7937a886b2 docs: update setup runner (#884) 2024-08-02 21:03:53 +10:00
Liangsheng Yin
12ce3befb6 Update runner docs (#879) 2024-08-01 17:37:47 -07:00
Liangsheng Yin
70c78cfb03 Update runner docs (#876) 2024-08-01 15:32:33 -07:00
Ying Sheng
4075677621 Add OpenAI backend to the CI test (#869) 2024-08-01 09:25:24 -07:00
Ying Sheng
90286d8576 Add troubleshooting doc (#856) 2024-08-01 00:05:26 -07:00
Yineng Zhang
62c673c46f docs: add set up runner (#829) 2024-07-30 19:43:40 +10:00
Liangsheng Yin
cdcbde5fc3 Code structure refactor (#807) 2024-07-29 23:04:48 -07:00
Ying Sheng
db6089e6f3 Revert "Organize public APIs" (#815) 2024-07-29 19:40:28 -07:00
Liangsheng Yin
c8e9fed87a Organize public APIs (#809) 2024-07-29 15:34:16 -07:00
Ying Sheng
325a06c2de Fix logging (#796) 2024-07-28 23:01:45 -07:00
Yineng Zhang
fa2aa0db0a docs: update index (#786) 2024-07-28 17:22:00 +10:00
Yineng Zhang
6a387a69cc fix: exclude logo png in gitignore (#785) 2024-07-28 17:08:16 +10:00
Yineng Zhang
27f5ce0a6c fix: init readthedocs support (#784) 2024-07-28 16:55:54 +10:00
Yineng Zhang
948625799e docs: init readthedocs support (#783) 2024-07-28 16:50:31 +10:00
Lianmin Zheng
30db99b3d9 Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776) 2024-07-27 19:50:34 -07:00
Yineng Zhang
c3c74bf874 docs: update model support (#760) 2024-07-27 14:07:37 +10:00
Liangsheng Yin
f424e76d96 Fix illegal tokens during sampling (#676) 2024-07-20 03:11:15 -07:00