Commit Graph

21 Commits

Author SHA1 Message Date
Lianmin Zheng
3f0fe08d37 Let ModelRunner take InputMetadata as input, instead of ScheduleBatch (#1541) 2024-09-29 20:28:45 -07:00
Ying Sheng
9aa6553d2a [Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525) 2024-09-27 23:32:11 -07:00
TianyiQ
3c93187caf Add support for tie_word_embeddings when loading weights + support for SmolLM (#1508) 2024-09-24 21:50:20 -07:00
Lianmin Zheng
fb2d0680e0 [Fix] Fix clean_up_tokenization_spaces in tokenizer (#1510) 2024-09-24 21:37:33 -07:00
Lianmin Zheng
167591e864 Better unit tests for adding a new model (#1488) 2024-09-22 01:50:37 -07:00
Ying Sheng
712216928f [Feature] Initial support for multi-LoRA serving (#1307) 2024-09-12 16:46:14 -07:00
Ying Sheng
689ff588ec [CI] Return output logprobs in unit test (#1361) 2024-09-09 13:05:13 -07:00
Yineng Zhang
c411f32e1c feat: replace GeluAndMul (#1234) 2024-08-28 14:07:02 +00:00
Yineng Zhang
66975360e7 fix: increase max_new_tokens when testing generation models (#1244) 2024-08-28 22:12:36 +10:00
Mingyi
7514b9f8d3 [CI] Fix CI (#1217) 2024-08-26 02:56:42 +00:00
Ying Sheng
308d024092 [CI] Fix the issue of unit test hanging (#1211) 2024-08-25 16:21:37 -07:00
Chayenne
30b4f771b0 Support Alibaba-NLP/gte-Qwen2-7B-instruct embedding Model (#1186) 2024-08-25 10:29:12 -07:00
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Ying Sheng
1cb4da5c5f [Fix] the issue of random order when input is a list (#1199) 2024-08-24 21:43:03 -07:00
Lianmin Zheng
f6af3a6561 Cleanup readme, llava examples, usage examples and nccl init (#1194) 2024-08-24 08:02:23 -07:00
Ying Sheng
0909bb0d2f [Feat] Add window attention for gemma-2 (#1056) 2024-08-13 17:01:26 -07:00
Ying Sheng
32f6144323 fix: Fix returned prefill logits and add output str test (#1046) 2024-08-12 06:13:45 +00:00
Ying Sheng
b68c4c073b fix: force max new tokens to be 1 for embedding request (#1019) 2024-08-10 13:46:42 -07:00
Ying Sheng
e040a2450b Add e5-mistral embedding model - step 3/3 (#988) 2024-08-08 16:31:19 -07:00
Yineng Zhang
c31f084c71 chore: update vllm to 0.5.4 (#966) 2024-08-07 21:15:41 +10:00
Ying Sheng
995af5a54b Improve the structure of CI (#911) 2024-08-03 23:09:21 -07:00
Ying Sheng
70cc0749ce Add model accuracy test - step 1 (#866) 2024-08-03 18:20:50 -07:00