Lianmin Zheng | 3f0fe08d37 | Let ModelRunner take InputMetadata as input, instead of ScheduleBatch (#1541) | 2024-09-29 20:28:45 -07:00 |
Ying Sheng | 9aa6553d2a | [Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525) | 2024-09-27 23:32:11 -07:00 |
TianyiQ | 3c93187caf | Add support for tie_word_embeddings when loading weights + support for SmolLM (#1508) | 2024-09-24 21:50:20 -07:00 |
Lianmin Zheng | fb2d0680e0 | [Fix] Fix clean_up_tokenization_spaces in tokenizer (#1510) | 2024-09-24 21:37:33 -07:00 |
Lianmin Zheng | 167591e864 | Better unit tests for adding a new model (#1488) | 2024-09-22 01:50:37 -07:00 |
Ying Sheng | 712216928f | [Feature] Initial support for multi-LoRA serving (#1307) | 2024-09-12 16:46:14 -07:00 |
Ying Sheng | 689ff588ec | [CI] Return output logprobs in unit test (#1361) | 2024-09-09 13:05:13 -07:00 |
Yineng Zhang | c411f32e1c | feat: replace GeluAndMul (#1234) | 2024-08-28 14:07:02 +00:00 |
Yineng Zhang | 66975360e7 | fix: increase max_new_tokens when testing generation models (#1244) | 2024-08-28 22:12:36 +10:00 |
Mingyi | 7514b9f8d3 | [CI] Fix CI (#1217) | 2024-08-26 02:56:42 +00:00 |
Ying Sheng | 308d024092 | [CI] Fix the issue of unit test hanging (#1211) | 2024-08-25 16:21:37 -07:00 |
Chayenne | 30b4f771b0 | Support Alibaba-NLP/gte-Qwen2-7B-instruct embedding Model (#1186) Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-08-25 10:29:12 -07:00 |
Ying Sheng | 1cb4da5c5f | [Fix] the issue of random order when input is a list (#1199) | 2024-08-24 21:43:03 -07:00 |
Lianmin Zheng | f6af3a6561 | Cleanup readme, llava examples, usage examples and nccl init (#1194) | 2024-08-24 08:02:23 -07:00 |
Ying Sheng | 0909bb0d2f | [Feat] Add window attention for gemma-2 (#1056) | 2024-08-13 17:01:26 -07:00 |
Ying Sheng | 32f6144323 | fix: Fix returned prefill logits and add output str test (#1046) | 2024-08-12 06:13:45 +00:00 |
Ying Sheng | b68c4c073b | fix: force max new tokens to be 1 for embedding request (#1019) | 2024-08-10 13:46:42 -07:00 |
Ying Sheng | e040a2450b | Add e5-mistral embedding model - step 3/3 (#988) | 2024-08-08 16:31:19 -07:00 |
Yineng Zhang | c31f084c71 | chore: update vllm to 0.5.4 (#966) | 2024-08-07 21:15:41 +10:00 |
Ying Sheng | 995af5a54b | Improve the structure of CI (#911) | 2024-08-03 23:09:21 -07:00 |
Ying Sheng | 70cc0749ce | Add model accuracy test - step 1 (#866) | 2024-08-03 18:20:50 -07:00 |