56 Commits

Author SHA1 Message Date
Yineng Zhang
b146555749 Revert "Implement return_hidden_states for the OpenAI API (#6137)" (#6440) 2025-05-19 18:21:29 -07:00
kyle-pena-kuzco
4f39bcf7ab Implement return_hidden_states for the OpenAI API (#6137) 2025-05-18 22:30:25 -07:00
Ravi Theja
41a645f556 Handle empty input string for embedding models (#5621)
Co-authored-by: Ravi Theja Desetty <ravitheja@Ravis-MacBook-Pro.local>
2025-05-11 23:17:15 +08:00
applesaucethebun
2ce8793519 Add typo checker in pre-commit (#6179)
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
2025-05-11 12:55:00 +08:00
XinyuanTong
9d8ec2e67e Fix and Clean up chat-template requirement for VLM (#6114)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-05-11 00:14:09 +08:00
Adarsh Shirawalmath
8b39274e34 [Feature] Prefill assistant response - add continue_final_message parameter (#4226)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
2025-04-20 17:37:18 -07:00
Chang Su
27a009bb00 Fix ignore_eos parameter when loading a chat template (#5264) 2025-04-15 17:09:45 -07:00
Jon Durbin
04eb6062e4 Include context length in /v1/models response. (#4809) 2025-03-27 20:23:18 -07:00
Xihuai Wang
1afe3d0798 Align finish reason and stream mode in openai api (#4388) 2025-03-27 00:16:52 -07:00
fzyzcjy
15ddd84322 Add retry for flaky tests in CI (#4755) 2025-03-25 16:53:12 -07:00
Xihuai Wang
927ca935a7 Constraint Decoding: Tool call with text (#4067) 2025-03-17 01:06:46 -07:00
YAMY
b045841bae Feature/function calling update (#2700)
Co-authored-by: Mingyuan Ma <mamingyuan2001@berkeley.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
2025-01-26 09:57:51 -08:00
Chang Su
f290bd4332 [Bugfix] Fix embedding model hangs with --enable-metrics (#2822) 2025-01-10 13:14:51 -08:00
Tanjiro
8ee9a8501a [Feature] Function Calling (#2544)
Co-authored-by: Haoyu Wang <120358163+HaoyuWang4188@users.noreply.github.com>
2024-12-28 21:58:52 -08:00
Adarsh Shirawalmath
acb340728c [Feature] Support new parameter - EBNF in xgrammar (#2526) 2024-12-26 05:12:41 -08:00
Lianmin Zheng
d4fc1a70e3 Crash the server correctly during error (#2231) 2024-11-28 00:22:39 -08:00
Xiaoyu Zhang
027e65248f support echo=true and logprobs in openai api when logprobs=1 in lm-evaluation-harness (#1998) 2024-11-11 23:21:20 -08:00
Lianmin Zheng
9c939a3d8b Clean up metrics code (#1972) 2024-11-09 15:43:20 -08:00
Chayenne
c77c1e05ba fix black in pre-commit (#1940) 2024-11-08 07:42:47 +08:00
Lianmin Zheng
c17c578108 Simplify tokenizer manager (#1904) 2024-11-03 08:38:26 -08:00
Lianmin Zheng
86fc0d79d0 Add a watch dog thread (#1816) 2024-10-27 02:00:50 -07:00
Theresa Barton
2c7d0a5b8b [Fix] Fix all the Huggingface paths (#1553) 2024-10-02 10:12:07 -07:00
Ying Sheng
e4780cf839 [API, Feature] Support response prefill for openai API (#1490) 2024-09-22 06:46:17 -07:00
Lianmin Zheng
9ba1f09760 [Fix] Fix logprob and normalized_logprob (#1428) 2024-09-15 06:36:06 -07:00
yichuan~
5ff25cdf5b [Minor] add delete test and delete tmp file on ci server (#1227) 2024-08-26 22:04:52 -07:00
caiyueliang
2f1d92834f [FEAT] Support batches cancel (#1222)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2024-08-26 23:28:26 +00:00
Mingyi
158e8f1e2d improve the threshold and ports in tests (#1215) 2024-08-25 19:02:08 -07:00
Juwan Yoo
d8476818ef feat: allow streaming for multi-prompt and/or parallel sampling (#1134) 2024-08-20 08:06:55 -07:00
yichuan~
b997a18d74 [Feat]Add support for optional start len of logprobs (#1035)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
2024-08-18 23:45:41 -07:00
Yineng Zhang
f7fb68d292 ci: add moe test (#1053) 2024-08-13 18:43:23 +10:00
Lianmin Zheng
8207637029 Improve end-to-end throughput test and its coverage (#1039) 2024-08-11 18:27:33 -07:00
Lianmin Zheng
54fb1c80c0 Clean up unit tests (#1020) 2024-08-10 15:09:03 -07:00
yichuan~
3a79613c28 support more optioin about usage in stream mode (#985)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-08-08 09:41:57 +00:00
Yineng Zhang
c31f084c71 chore: update vllm to 0.5.4 (#966) 2024-08-07 21:15:41 +10:00
yichuan~
5f6fa04a3f misc: simplify test (#964) 2024-08-07 01:23:27 -07:00
yichuan~
795eab6dda Add support for Batch API test (#936) 2024-08-06 23:52:10 -07:00
yichuan~
fd7926e46e Fix prompt len in parallel sampling (#928) 2024-08-05 00:56:08 -07:00
Ying Sheng
0a4f5f9bea Test regex in vision api (#926) 2024-08-04 22:52:41 -07:00
Ying Sheng
3bc99e6fe4 Test openai vision api (#925) 2024-08-05 13:51:55 +10:00
yichuan~
d53dcf9c98 Support more OpenAI API test (#916) 2024-08-04 16:43:09 -07:00
Liangsheng Yin
bb66cc4c52 Fix CI && python3.8 compatible (#920) 2024-08-04 16:02:05 -07:00
Ying Sheng
0d4f3a9fcd Make API Key OpenAI-compatible (#917) 2024-08-04 13:35:44 -07:00
Ying Sheng
995af5a54b Improve the structure of CI (#911) 2024-08-03 23:09:21 -07:00
Ying Sheng
ae7ee01a8e Add accuracy test to CI: MMLU (#882) 2024-08-01 21:20:17 -07:00
Ying Sheng
60340a3643 Improve the coverage of the openai api server test (#878) 2024-08-01 16:01:30 -07:00
Ying Sheng
72b6ea88b4 Make scripts under /test/srt as unit tests (#875) 2024-08-01 14:34:55 -07:00
Lianmin Zheng
6e09cf6a15 Misc fixes (#432) 2024-05-12 15:05:40 -07:00
Lianmin Zheng
aee4f523cf Fix logit processor bugs (#427) 2024-05-12 04:54:07 -07:00
Liangsheng Yin
3842eba5fa Logprobs Refractor (#331) 2024-03-28 14:34:49 +08:00
Lianmin Zheng
c51020cf0c Fix the chat template for llava-v1.6-34b & format code (#177) 2024-02-11 05:50:13 -08:00