Byron Hsu
|
6fcd6d7d6d
|
Support token ids in engine.generate (#1820)
|
2024-10-27 14:02:34 -07:00 |
|
Yineng Zhang
|
8bee20f80b
|
Update vllm to 0.6.3 (#1711) (#1720)
Co-authored-by: Ke Bao <ISPObaoke@163.com>
|
2024-10-19 20:45:41 -07:00 |
|
Byron Hsu
|
862cd265e5
|
[engine] support async and streaming (#1614)
|
2024-10-11 15:26:25 -07:00 |
|
Byron Hsu
|
551a3a9d38
|
Provide an offline engine API (#1567)
|
2024-10-06 20:27:03 -07:00 |
|
Byron Hsu
|
2422de5193
|
Support min_tokens in sgl.gen (#1573)
|
2024-10-05 21:51:12 -07:00 |
|
FredericOdermatt
|
2432ad40c6
|
[Minifix] Remove extra space in cot example (#1569)
|
2024-10-04 01:16:53 -07:00 |
|
Lianmin Zheng
|
114bbc8651
|
Use ipc instead of tcp in zmq (#1566)
|
2024-10-04 00:45:52 -07:00 |
|
Theresa Barton
|
2c7d0a5b8b
|
[Fix] Fix all the Huggingface paths (#1553)
|
2024-10-02 10:12:07 -07:00 |
|
Ying Sheng
|
0f4fb19bc8
|
[Fix, LoRA] fix LoRA with updates in main (#1545)
|
2024-09-30 10:06:08 -07:00 |
|
Lianmin Zheng
|
4e4459b91f
|
Multiple minor fixes (#1530)
|
2024-09-28 14:43:35 -07:00 |
|
Ying Sheng
|
9aa6553d2a
|
[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525)
|
2024-09-27 23:32:11 -07:00 |
|
Ying Sheng
|
e4780cf839
|
[API, Feature] Support response prefill for openai API (#1490)
|
2024-09-22 06:46:17 -07:00 |
|
Li Bo
|
446ea33277
|
fix: creat new dict everytime for putting new frame (#1464)
|
2024-09-19 01:31:48 -07:00 |
|
Ying Sheng
|
37963394aa
|
[Feature] Support LoRA path renaming and add LoRA serving benchmarks (#1433)
|
2024-09-15 12:46:04 -07:00 |
|
Lianmin Zheng
|
e4d68afcf0
|
[Minor] Many cleanup (#1357)
|
2024-09-09 04:14:11 -07:00 |
|
Kaichen Zhang - NTU
|
662ecd9368
|
[Feat] Add modalities for vision server when handling pixel values for llava (#1346)
|
2024-09-09 02:07:34 -07:00 |
|
Lianmin Zheng
|
f64eae3a29
|
[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308)
|
2024-09-02 21:44:45 -07:00 |
|
Kai-Hsun Chen
|
0836055324
|
[Chore] Rename model_overide_args to model_override_args (#1284)
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2024-09-01 03:14:56 -07:00 |
|
Lianmin Zheng
|
0a97d7962d
|
[Fix] Fix OOM in llava base class (#1249)
|
2024-08-28 08:45:49 -07:00 |
|
Lianmin Zheng
|
bf53bf5142
|
[Fix] Fix llava on multi images (#1247)
|
2024-08-28 06:33:05 -07:00 |
|
yichuan~
|
5ff25cdf5b
|
[Minor] add delete test and delete tmp file on ci server (#1227)
|
2024-08-26 22:04:52 -07:00 |
|
Kaichen Zhang - NTU
|
66e7dcaf70
|
[Fix] Fixing the multi-images error for llava-onevision (#1205)
|
2024-08-25 10:28:23 -07:00 |
|
Lianmin Zheng
|
f6af3a6561
|
Cleanup readme, llava examples, usage examples and nccl init (#1194)
|
2024-08-24 08:02:23 -07:00 |
|
Kaichen Zhang - NTU
|
a5b14ad043
|
[Feat/WIP] add llava-onevision, with support for (1) siglip encoder, (2) qwen2 decoder (3) openai api compatible server. (#1123)
Co-authored-by: Bo Li <drluodian@gmail.com>
|
2024-08-23 14:11:16 -07:00 |
|
Liangsheng Yin
|
83e23c69b3
|
Improve code style of sampler (#1168)
|
2024-08-21 16:48:24 -07:00 |
|
Liangsheng Yin
|
bb66cc4c52
|
Fix CI && python3.8 compatible (#920)
|
2024-08-04 16:02:05 -07:00 |
|
yichuan~
|
ca600e8cd6
|
Add support for logprobs in OpenAI chat API (#852)
|
2024-08-01 00:08:21 -07:00 |
|
yichuan~
|
bb0501c0d9
|
Fix List input bug (#838)
|
2024-07-30 13:40:51 -07:00 |
|
yichuan~
|
084fa54d37
|
Add support for OpenAI API : offline batch(file) processing (#699)
Co-authored-by: hnyls2002 <hnyls2002@gmail.com>
|
2024-07-29 13:07:18 -07:00 |
|
Lianmin Zheng
|
30db99b3d9
|
Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776)
|
2024-07-27 19:50:34 -07:00 |
|
Ying Sheng
|
2b4c646277
|
Update version to 0.1.22 (#677)
|
2024-07-20 03:39:50 -07:00 |
|
yichuan~
|
49c5e0eca9
|
Add support for OpenAI API parallel sampling (#640)
|
2024-07-19 23:10:01 -07:00 |
|
zhyncs
|
2e341cd493
|
misc: add pre-commit config (#637)
|
2024-07-17 11:55:39 -07:00 |
|
胡译文
|
02b7258658
|
[Feat] Expose logprob options to sgl.gen API (#503)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
|
2024-07-09 00:35:39 -07:00 |
|
Liangsheng Yin
|
9c902b1954
|
Decode Incrementally (#517)
|
2024-06-11 23:39:12 -07:00 |
|
Fabian Preiß
|
b6667a53b9
|
Fix RAG nb, parea setup (parea -> parea-ai) (#525)
|
2024-06-11 16:36:43 -07:00 |
|
Yuanhan Zhang
|
7d1ebc2d71
|
update the script: examples/usage/llava_video/srt_example_llava_v.sh (#491)
|
2024-05-31 23:31:56 -07:00 |
|
Li Bo
|
2b605ab1d7
|
[Feat/Fix] Refactoring Llava models into single file (#475)
|
2024-05-26 12:29:51 -07:00 |
|
Liangsheng Yin
|
f06e90c2cf
|
Optimize retract (#440)
|
2024-05-26 00:07:26 +08:00 |
|
bing
|
3167d8dabc
|
fix test bug in srt_llava_next_test.py (#470)
|
2024-05-24 03:38:01 -07:00 |
|
Lianmin Zheng
|
19d2135cb8
|
Use model loader from vllm (#459)
|
2024-05-21 09:13:37 -07:00 |
|
Lianmin Zheng
|
ced77c6626
|
Rename api_num_spec_tokens -> num_api_spec_tokens (#458)
|
2024-05-20 18:44:23 -07:00 |
|
Ying Sheng
|
3e684be7a3
|
Fix openai speculative execution (#456)
|
2024-05-20 17:01:13 -07:00 |
|
LiviaSun
|
ec380dfd30
|
openai chat speculative execution (#250)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
|
2024-05-18 22:23:53 -07:00 |
|
Kaichen Zhang - NTU
|
664287b2a7
|
[Feat] Add llava qwen, llava mistral (#419)
Co-authored-by: Bo Li <drluodian@gmail.com>
|
2024-05-13 22:17:50 -07:00 |
|
Yuanhan Zhang
|
0992d85f92
|
support llava video (#426)
|
2024-05-13 16:57:00 -07:00 |
|
Lianmin Zheng
|
5dc55a5f02
|
Handle truncation errors (#436)
|
2024-05-13 15:56:00 -07:00 |
|
Joschka Braun
|
5c5aba5900
|
Adding RAG tracing & eval cookbook using Parea (#390)
|
2024-04-30 16:13:28 -07:00 |
|
Liangsheng Yin
|
3842eba5fa
|
Logprobs Refractor (#331)
|
2024-03-28 14:34:49 +08:00 |
|
Jani Monoses
|
64ee9c030e
|
Openrouter usage example (#327)
|
2024-03-23 10:16:24 -07:00 |
|