sglang

Author	SHA1	Message	Date
Chang Su	27a009bb00	Fix ignore_eos parameter when loading a chat template (#5264)	2025-04-15 17:09:45 -07:00
Mick	e53a0b3d5b	[fix] fix mrope positions not picked up (#5265)	2025-04-11 01:29:45 -07:00
Mick	5cb552b1d4	refactor: multimodal data (#4754)	2025-03-31 09:57:51 -07:00
Lianmin Zheng	47e6628aae	Fix CI tests (#4853)	2025-03-28 00:28:35 -07:00
BroadbentJim	550586ef42	fix: Inappropriate lack of Optional type on OpenAI ChatCompletionRequest (#4681)	2025-03-27 22:19:05 -07:00
lambert0312	2e0f94ab79	[Fix] fix output_top_logprobs is not exist (#4597)	2025-03-27 21:45:57 -07:00
Jon Durbin	04eb6062e4	Include context length in /v1/models response. (#4809)	2025-03-27 20:23:18 -07:00
Pan Lyu	c913ed4046	support clip embedding model (#4506)	2025-03-27 00:18:15 -07:00
Xihuai Wang	1afe3d0798	Align finish reason and stream mode in openai api (#4388)	2025-03-27 00:16:52 -07:00
DarkSharpness	ac3fae8445	[Feature] Support "strict" in function calling (#4310)	2025-03-24 22:15:25 -07:00
Mick	1e86457c90	model: Minicpmo (#3023)	2025-03-24 20:08:40 -07:00
mlmz	f6ab4ca6bc	fix: fix ipython running error for Engine due to outlines nest_asyncio (#4582) Co-authored-by: shuaills <shishuaiuoe@gmail.com>	2025-03-21 19:11:15 -07:00
Yuhong Guo	417fc72f6f	Align completion and chat_completion response to OpenAI API (#4637)	2025-03-20 22:59:04 -07:00
Xihuai Wang	927ca935a7	Constraint Decoding: Tool call with text (#4067)	2025-03-17 01:06:46 -07:00
mlmz	452db50808	Constraint Decoding: Set xgrammar as the default grammar backend (#4386)	2025-03-16 18:53:43 -07:00
woodx	48efec7b05	Feature: support code completion (#3612)	2025-03-16 18:26:19 -07:00
Chang Su	5fe79605a8	Fix Llama3.3 tool call support (#4320)	2025-03-13 14:01:41 -07:00
Wen Sun	4014804157	Ensure Usage Data in Streaming Responses Aligns with vLLM’s Implementation (#3814)	2025-03-12 22:12:55 -07:00
David Carreto Fidalgo	f7f88b706c	HotFix: json serialization error when using OAI v1/batches endpoint with logprobs (#3896)	2025-03-12 22:04:29 -07:00
Conghui Tan	6412c5e493	Avoid duplicated request ids in batch APIs (#4026) Co-authored-by: conghuitan <conghuitan@tencent.com>	2025-03-12 21:38:17 -07:00
Pan Lyu	361971b859	Add Support for Qwen2-VL Multi-modal Embedding Models (#3694)	2025-03-06 16:46:20 -08:00
Xihuai Wang	95575aa76a	Reasoning parser (#4000) Co-authored-by: Lucas Pickup <lupickup@microsoft.com>	2025-03-03 21:16:36 -08:00
Lianmin Zheng	935cda944b	Misc clean up; Remove the support of jump forward (#4032)	2025-03-03 07:02:14 -08:00
mlmz	bac414ab53	[Feature] integrate Structural Tag in xgrammar backend for function calling (#3566) Co-authored-by: shuaills <shishuaiuoe@gmail.com>	2025-02-27 23:33:41 -08:00
Lianmin Zheng	f2388f6b95	Revert "Rename TokenizerManager to StdOrchestrator" (#3828)	2025-02-24 14:47:59 -08:00
fzyzcjy	45360b2fa9	Improve: Rename TokenizerManager to StdOrchestrator (#3116)	2025-02-23 00:30:58 -08:00
Shi Shuai	c7c79b16cd	[Fix] OpenAI API adapter tokenizer encoding (#3432)	2025-02-21 09:24:15 -08:00
Mick	7711ac6ed0	doc: emphasize and notify the usage of chat_template (#3589) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-02-15 00:10:32 -08:00
YAMY	b045841bae	Feature/function calling update (#2700) Co-authored-by: Mingyuan Ma <mamingyuan2001@berkeley.edu> Co-authored-by: Chayenne <zhaochen20@outlook.com> Co-authored-by: shuaills <shishuaiuoe@gmail.com>	2025-01-26 09:57:51 -08:00
Lianmin Zheng	bc6915e3b9	Improve type annotation and styles (#2926)	2025-01-16 12:51:11 -08:00
Ying Sheng	dc7eb01f19	[Fix] fix openai adapter (#2685)	2024-12-31 10:48:19 +00:00
Lianmin Zheng	8c3b420eec	[Docs] clean up structured outputs docs (#2654)	2024-12-29 23:57:16 -08:00
Tanjiro	8ee9a8501a	[Feature] Function Calling (#2544) Co-authored-by: Haoyu Wang <120358163+HaoyuWang4188@users.noreply.github.com>	2024-12-28 21:58:52 -08:00
Adarsh Shirawalmath	acb340728c	[Feature] Support new parameter - EBNF in xgrammar (#2526)	2024-12-26 05:12:41 -08:00
Lei	19ba2b0ea9	Add lora_paths to v1_chat_generate_request (#2529)	2024-12-22 02:23:33 -08:00
Lianmin Zheng	361ea8d912	Fix openai protocols and pass top_k, min_p (#2499)	2024-12-17 04:14:14 -08:00
Lei	33c5ff2845	Add lora_path to chat completion (#2438)	2024-12-17 03:47:49 -08:00
Lianmin Zheng	5c18a03733	Fix logprob for completions (#2301)	2024-12-01 05:17:05 -08:00
bjmsong	01017d4c20	Support LoRA in Completion API (#2243) Co-authored-by: root <bjmsong@126.com>	2024-11-29 16:13:38 -08:00
Baoyuan Qi	a4fd2f9b46	fix typo prompts (#2224)	2024-11-27 12:07:00 -08:00
Xuehai Pan	62a4a339eb	docs: fix module docstrings and copyright headers (#2077)	2024-11-22 22:16:53 +08:00
Alexander Waitz	929c7621af	Fix: incorrect top_logprobs in chat completion (#2088)	2024-11-19 12:21:36 +00:00
yukavio	2a3992b6f1	support set role as 'tool' (#2075) Co-authored-by: kavioyu <kavioyu@tencent.com>	2024-11-18 01:06:59 -08:00
Lianmin Zheng	ea53c63bad	Expose no_stop_trim and skip_special_tokens in openai api (#2039)	2024-11-14 19:09:21 -08:00
chottolabs	fb9fb3518b	set content to empty string (#2026)	2024-11-14 01:06:02 +00:00
Xiaoyu Zhang	a1bd719031	fix a bug in v1_embeeding_request (#2014)	2024-11-12 16:49:45 +08:00
Xiaoyu Zhang	027e65248f	support echo=true and logprobs in openai api when logprobs=1 in lm-evaluation-harness (#1998)	2024-11-11 23:21:20 -08:00
Lianmin Zheng	c17c578108	Simplify tokenizer manager (#1904)	2024-11-03 08:38:26 -08:00
Gleb Drozdov	a95d5589c3	Add matched_stop token or str to distinguish between eos or stop str finish_reason generation (#1684)	2024-10-17 18:06:52 +00:00
Michael Feil	b0facb3316	add orjson for jsonresponse (#1688)	2024-10-16 18:14:30 -07:00

88 Commits