Yineng Zhang
|
85e1a6f3aa
|
Update model_loader deps and qqq quantization deps (#2220) (#2318)
Co-authored-by: HandH1998 <1335248067@qq.com>
|
2024-12-02 23:22:13 +08:00 |
|
Lianmin Zheng
|
4936be8acc
|
Revert "Revert "[FEAT] Support GGUF format"" (#2287)
|
2024-11-30 22:14:48 -08:00 |
|
Lianmin Zheng
|
7e4c6dd8da
|
Revert "[FEAT] Support GGUF format" (#2285)
|
2024-11-30 19:03:26 -08:00 |
|
Yang Zheng
|
883c955489
|
[FEAT] Support GGUF format (#2215)
Co-authored-by: Yang Zheng(SW)(Alex) <you@example.com>
|
2024-11-30 00:44:48 -08:00 |
|
Xuehai Pan
|
62a4a339eb
|
docs: fix module docstrings and copyright headers (#2077)
|
2024-11-22 22:16:53 +08:00 |
|
Lianmin Zheng
|
2ce32db6fb
|
Let reward model take text inputs instead of message lists (#1907)
Co-authored-by: Kyle Corbitt <kyle@corbt.com>
|
2024-11-03 13:27:12 -08:00 |
|
Ran Chen
|
146f613405
|
Fix incorrect context length for llama3.2-11b (#1873)
|
2024-11-02 00:04:50 -07:00 |
|
Hui Liu
|
9ce8e1a93c
|
move max_position_embeddings to the last (#1799)
|
2024-10-25 19:30:50 -07:00 |
|
Lianmin Zheng
|
8f8f96a621
|
Fix the perf regression due to additional_stop_token_ids (#1773)
|
2024-10-23 16:45:21 -07:00 |
|
Lianmin Zheng
|
0d800090b4
|
Fix missing additional_stop_token_ids (#1769)
|
2024-10-23 12:18:59 -07:00 |
|
Lianmin Zheng
|
80a905475d
|
Fix stop condition for <|eom_id|> (#1766)
|
2024-10-23 10:47:12 -07:00 |
|
Yineng Zhang
|
cbbc82b7b8
|
Support qwen2 vl model (#1721)
Co-authored-by: yizhang2077 <1109276519@qq.com>
Co-authored-by: ispobock <ISPObaoke@163.com>
|
2024-10-19 21:44:38 -07:00 |
|
Lianmin Zheng
|
fb2d0680e0
|
[Fix] Fix clean_up_tokenization_spaces in tokenizer (#1510)
|
2024-09-24 21:37:33 -07:00 |
|
Lianmin Zheng
|
3a6e8b6d78
|
[Minor] move triton attention kernels into a separate folder (#1379)
|
2024-09-10 15:15:08 -07:00 |
|
Jani Monoses
|
474317f2b6
|
Support Phi3 mini and medium (#1299)
|
2024-09-02 21:49:40 -07:00 |
|
Kai-Hsun Chen
|
0836055324
|
[Chore] Rename model_overide_args to model_override_args (#1284)
Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2024-09-01 03:14:56 -07:00 |
|
Lianmin Zheng
|
79ece2c51f
|
Report median instead of mean in bench_latency.py (#1269)
|
2024-08-30 06:05:01 -07:00 |
|
김종곤
|
b7f8341014
|
EXAONE 3.0 Model Support (#1258)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2024-08-30 08:08:28 +00:00 |
|
Lianmin Zheng
|
bf53bf5142
|
[Fix] Fix llava on multi images (#1247)
|
2024-08-28 06:33:05 -07:00 |
|
Lianmin Zheng
|
902278008a
|
[Minor] Improve the function organization in TokenizerManager & improve loggers (#1208)
|
2024-08-25 14:46:34 -07:00 |
|
Lianmin Zheng
|
bea2bb9eea
|
Improve multi-node stability (#1171)
|
2024-08-20 22:35:05 -07:00 |
|
Lianmin Zheng
|
a8ae640328
|
Improve docs and warnings (#1164)
|
2024-08-20 08:31:29 -07:00 |
|
Lianmin Zheng
|
3c1f5a9220
|
Fix duplicated imports in hf_transformers_utils.py (#1141)
|
2024-08-17 18:03:00 -07:00 |
|
Lianmin Zheng
|
57d0bd91ec
|
Improve benchmark (#1140)
|
2024-08-17 17:43:23 -07:00 |
|
Liangsheng Yin
|
bb66cc4c52
|
Fix CI && python3.8 compatible (#920)
|
2024-08-04 16:02:05 -07:00 |
|
Yineng Zhang
|
dd7e8b9421
|
chore: add copyright for srt (#790)
|
2024-07-28 23:07:12 +10:00 |
|
Liangsheng Yin
|
d9fccfefe2
|
Fix context length (#757)
|
2024-07-26 18:13:13 -07:00 |
|
Liangsheng Yin
|
679ebcbbdc
|
Deepseek v2 support (#693)
|
2024-07-26 17:10:07 -07:00 |
|
Ying Sheng
|
444a02441a
|
Update vllm version to support llama3.1 (#705)
|
2024-07-23 13:49:34 -07:00 |
|
Ke Bao
|
824a77d04d
|
Fix hf config loading (#702)
|
2024-07-23 11:39:08 -07:00 |
|
Ying Sheng
|
dc1b8bcfaa
|
Format (#593)
|
2024-07-05 10:06:17 -07:00 |
|
Lianmin Zheng
|
2e6e62e156
|
Increase the number of thread limitation for tp worker managers. (#567)
|
2024-06-26 09:33:45 -07:00 |
|
Ying Sheng
|
fb9296f0ed
|
Higher priority for user input of max_prefill_tokens & format (#540)
|
2024-06-12 21:48:40 -07:00 |
|
Lianmin Zheng
|
3bc01ac137
|
[Minor] improve code style
|
2024-06-03 18:11:34 -07:00 |
|
Lianmin Zheng
|
55c1643627
|
Improve benchmark scripts & rename some scripts (#477)
|
2024-05-26 12:51:45 -07:00 |
|
Lianmin Zheng
|
2cea6146d8
|
Improve logging & add logit cap (#471)
|
2024-05-24 03:48:53 -07:00 |
|
Yuanhan Zhang
|
0992d85f92
|
support llava video (#426)
|
2024-05-13 16:57:00 -07:00 |
|
Liangsheng Yin
|
150d7020ed
|
Revert removing the unused imports (#385)
|
2024-04-23 22:36:33 +08:00 |
|
Liangsheng Yin
|
9acc6e3504
|
add .isort.cfg (#378)
|
2024-04-22 22:38:09 +08:00 |
|
Lianmin Zheng
|
22085081bb
|
release initial code
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>
Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
|
2024-01-08 04:37:50 +00:00 |
|