88 Commits

Author SHA1 Message Date
Chang Su
27a009bb00 Fix ignore_eos parameter when loading a chat template (#5264) 2025-04-15 17:09:45 -07:00
Mick
e53a0b3d5b [fix] fix mrope positions not picked up (#5265) 2025-04-11 01:29:45 -07:00
Mick
5cb552b1d4 refactor: multimodal data (#4754) 2025-03-31 09:57:51 -07:00
Lianmin Zheng
47e6628aae Fix CI tests (#4853) 2025-03-28 00:28:35 -07:00
BroadbentJim
550586ef42 fix: Inappropriate lack of Optional type on OpenAI ChatCompletionRequest (#4681) 2025-03-27 22:19:05 -07:00
lambert0312
2e0f94ab79 [Fix] fix output_top_logprobs is not exist (#4597) 2025-03-27 21:45:57 -07:00
Jon Durbin
04eb6062e4 Include context length in /v1/models response. (#4809) 2025-03-27 20:23:18 -07:00
Pan Lyu
c913ed4046 support clip embedding model (#4506) 2025-03-27 00:18:15 -07:00
Xihuai Wang
1afe3d0798 Align finish reason and stream mode in openai api (#4388) 2025-03-27 00:16:52 -07:00
DarkSharpness
ac3fae8445 [Feature] Support "strict" in function calling (#4310) 2025-03-24 22:15:25 -07:00
Mick
1e86457c90 model: Minicpmo (#3023) 2025-03-24 20:08:40 -07:00
mlmz
f6ab4ca6bc fix: fix ipython running error for Engine due to outlines nest_asyncio (#4582)
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
2025-03-21 19:11:15 -07:00
Yuhong Guo
417fc72f6f Align completion and chat_completion response to OpenAI API (#4637) 2025-03-20 22:59:04 -07:00
Xihuai Wang
927ca935a7 Constraint Decoding: Tool call with text (#4067) 2025-03-17 01:06:46 -07:00
mlmz
452db50808 Constraint Decoding: Set xgrammar as the default grammar backend (#4386) 2025-03-16 18:53:43 -07:00
woodx
48efec7b05 Feature: support code completion (#3612) 2025-03-16 18:26:19 -07:00
Chang Su
5fe79605a8 Fix Llama3.3 tool call support (#4320) 2025-03-13 14:01:41 -07:00
Wen Sun
4014804157 Ensure Usage Data in Streaming Responses Aligns with vLLM’s Implementation (#3814) 2025-03-12 22:12:55 -07:00
David Carreto Fidalgo
f7f88b706c HotFix: json serialization error when using OAI v1/batches endpoint with logprobs (#3896) 2025-03-12 22:04:29 -07:00
Conghui Tan
6412c5e493 Avoid duplicated request ids in batch APIs (#4026)
Co-authored-by: conghuitan <conghuitan@tencent.com>
2025-03-12 21:38:17 -07:00
Pan Lyu
361971b859 Add Support for Qwen2-VL Multi-modal Embedding Models (#3694) 2025-03-06 16:46:20 -08:00
Xihuai Wang
95575aa76a Reasoning parser (#4000)
Co-authored-by: Lucas Pickup <lupickup@microsoft.com>
2025-03-03 21:16:36 -08:00
Lianmin Zheng
935cda944b Misc clean up; Remove the support of jump forward (#4032) 2025-03-03 07:02:14 -08:00
mlmz
bac414ab53 [Feature] integrate Structural Tag in xgrammar backend for function calling (#3566)
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
2025-02-27 23:33:41 -08:00
Lianmin Zheng
f2388f6b95 Revert "Rename TokenizerManager to StdOrchestrator" (#3828) 2025-02-24 14:47:59 -08:00
fzyzcjy
45360b2fa9 Improve: Rename TokenizerManager to StdOrchestrator (#3116) 2025-02-23 00:30:58 -08:00
Shi Shuai
c7c79b16cd [Fix] OpenAI API adapter tokenizer encoding (#3432) 2025-02-21 09:24:15 -08:00
Mick
7711ac6ed0 doc: emphasize and notify the usage of chat_template (#3589)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
2025-02-15 00:10:32 -08:00
YAMY
b045841bae Feature/function calling update (#2700)
Co-authored-by: Mingyuan Ma <mamingyuan2001@berkeley.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
2025-01-26 09:57:51 -08:00
Lianmin Zheng
bc6915e3b9 Improve type annotation and styles (#2926) 2025-01-16 12:51:11 -08:00
Ying Sheng
dc7eb01f19 [Fix] fix openai adapter (#2685) 2024-12-31 10:48:19 +00:00
Lianmin Zheng
8c3b420eec [Docs] clean up structured outputs docs (#2654) 2024-12-29 23:57:16 -08:00
Tanjiro
8ee9a8501a [Feature] Function Calling (#2544)
Co-authored-by: Haoyu Wang <120358163+HaoyuWang4188@users.noreply.github.com>
2024-12-28 21:58:52 -08:00
Adarsh Shirawalmath
acb340728c [Feature] Support new parameter - EBNF in xgrammar (#2526) 2024-12-26 05:12:41 -08:00
Lei
19ba2b0ea9 Add lora_paths to v1_chat_generate_request (#2529) 2024-12-22 02:23:33 -08:00
Lianmin Zheng
361ea8d912 Fix openai protocols and pass top_k, min_p (#2499) 2024-12-17 04:14:14 -08:00
Lei
33c5ff2845 Add lora_path to chat completion (#2438) 2024-12-17 03:47:49 -08:00
Lianmin Zheng
5c18a03733 Fix logprob for completions (#2301) 2024-12-01 05:17:05 -08:00
bjmsong
01017d4c20 Support LoRA in Completion API (#2243)
Co-authored-by: root <bjmsong@126.com>
2024-11-29 16:13:38 -08:00
Baoyuan Qi
a4fd2f9b46 fix typo prompts (#2224) 2024-11-27 12:07:00 -08:00
Xuehai Pan
62a4a339eb docs: fix module docstrings and copyright headers (#2077) 2024-11-22 22:16:53 +08:00
Alexander Waitz
929c7621af Fix: incorrect top_logprobs in chat completion (#2088) 2024-11-19 12:21:36 +00:00
yukavio
2a3992b6f1 support set role as 'tool' (#2075)
Co-authored-by: kavioyu <kavioyu@tencent.com>
2024-11-18 01:06:59 -08:00
Lianmin Zheng
ea53c63bad Expose no_stop_trim and skip_special_tokens in openai api (#2039) 2024-11-14 19:09:21 -08:00
chottolabs
fb9fb3518b set content to empty string (#2026) 2024-11-14 01:06:02 +00:00
Xiaoyu Zhang
a1bd719031 fix a bug in v1_embeeding_request (#2014) 2024-11-12 16:49:45 +08:00
Xiaoyu Zhang
027e65248f support echo=true and logprobs in openai api when logprobs=1 in lm-evaluation-harness (#1998) 2024-11-11 23:21:20 -08:00
Lianmin Zheng
c17c578108 Simplify tokenizer manager (#1904) 2024-11-03 08:38:26 -08:00
Gleb Drozdov
a95d5589c3 Add matched_stop token or str to distinguish between eos or stop str finish_reason generation (#1684) 2024-10-17 18:06:52 +00:00
Michael Feil
b0facb3316 add orjson for jsonresponse (#1688) 2024-10-16 18:14:30 -07:00