Commit Graph

137 Commits

Author SHA1 Message Date
Glen Liu
47c606d3dc [Feature] support regex strings as a stopping condition (#10635) 2025-10-12 10:53:15 +08:00
fzyzcjy
fdc4e1e570 Tiny move files to utils folder (#11166) 2025-10-03 22:40:06 +08:00
Lianmin Zheng
60e37f8028 Move parsers under a single folder (#9912) 2025-09-02 18:25:04 -07:00
Lianmin Zheng
b58ae7a2a0 Simplify frontend language (#9029) 2025-08-10 10:59:30 -07:00
Binyao Jiang
f29aba8c6e Support glm4.1v and glm4.5v (#8798)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <justinning0323@outlook.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>
Co-authored-by: Chang Su <csu272@usc.edu>
2025-08-09 00:59:13 -07:00
RunningLeon
b7094a5ef1 model: support intern-s1 (#8350)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: zxy <zhou0493@e.ntu.edu.sg>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
2025-07-26 13:48:51 -07:00
Yudi Xue
14c18d25df Frontend language separate reasoning support (#6031) 2025-06-10 17:11:29 -07:00
Yueyang Pan
98c00a2df1 Fix torch profiler bugs for bench_offline_throughput.py (#6557) 2025-06-09 20:33:41 +08:00
Kiv Chen
5380cd7ea3 model(vlm): pixtral (#5084) 2025-05-13 00:16:10 -07:00
applesaucethebun
2ce8793519 Add typo checker in pre-commit (#6179)
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
2025-05-11 12:55:00 +08:00
XinyuanTong
9d8ec2e67e Fix and Clean up chat-template requirement for VLM (#6114)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-05-11 00:14:09 +08:00
xm:D
3409aaab32 Support InternVL3 (#5350)
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
2025-05-01 22:38:59 -07:00
Chuyue Sun
08289eaa3e Support o1 model on Azure (#4980)
Co-authored-by: Shan Yu <shanyu1@g.ucla.edu>
2025-04-21 00:46:09 -07:00
fzyzcjy
fba86b6b54 Tiny improve error message (#5526) 2025-04-20 16:00:15 -07:00
Lianmin Zheng
177320a582 Clean up imports (#5467) 2025-04-16 15:26:49 -07:00
Chang Su
f04c80dc42 Add Llama4 support (#5092)
Co-authored-by: Cheng Wan <cwan39@gatech.edu>
Co-authored-by: fzyzcjy <ch271828n@outlook.com>
Co-authored-by: ispobock <ispobaoke@163.com>
2025-04-07 00:29:36 -07:00
Mick
1e86457c90 model: Minicpmo (#3023) 2025-03-24 20:08:40 -07:00
Chuyue Sun
fad86a6863 Support n in OpenAI API completions (#3446)
Co-authored-by: Shan Yu <shanyu1@g.ucla.edu>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Co-authored-by: chuyue sun <chuyue@lmsys.us-northcentral1-a.compute.internal>
2025-03-20 13:46:46 +08:00
Mick
9d02bb3e2a Urgent model support: support gemma-3-it (#4424) 2025-03-16 17:37:32 -07:00
Mick
01090e8ac3 model: Support Janus-pro (#3203) 2025-03-12 11:02:11 -07:00
Qiaolin Yu
57a404fd55 Remove outdated test utils and fix links for the doc of sampling params (#3999) 2025-03-03 09:41:38 -08:00
Lianmin Zheng
66301e124f Improve code styles (#4021) 2025-03-03 03:20:23 -08:00
Lianmin Zheng
ac2387279e Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
2025-03-03 00:12:04 -08:00
Mick
45205d88a0 bench: Add MMMU benchmark for vLM (#3562) 2025-02-22 08:10:59 -08:00
Mick
bcc213df61 Model: Support Qwen 2.5 vl (#3258) 2025-02-16 00:58:53 -08:00
Mick
7711ac6ed0 doc: emphasize and notify the usage of chat_template (#3589)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
2025-02-15 00:10:32 -08:00
Chuyue Sun
6cc309557a Add support for OpenAI API o1 model (#3363)
Co-authored-by: Shan Yu <shanyu1@g.ucla.edu>
2025-02-14 11:43:00 +08:00
Enrique Shockwave
af6c5357d5 deepseek v3 and r1 chat template (#3015) 2025-01-20 14:40:12 -08:00
Lianmin Zheng
03464890e0 Separate two entry points: Engine and HTTP server (#2996)
Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>
2025-01-19 22:09:24 -08:00
Lianmin Zheng
cd493b5afc Improve metrics, logging, and importing orders (#2992) 2025-01-19 18:36:59 -08:00
Lianmin Zheng
61f42b5732 Move sgl.Runtime under sglang/lang (#2990) 2025-01-19 17:10:29 -08:00
Mick
3d93f84a00 [Feature] Support minicpmv v2.6 (#2785)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: yizhang2077 <1109276519@qq.com>
2025-01-18 14:14:19 -08:00
Lianmin Zheng
f65c13b559 Remove normalized_prompt_logprobs from the engine to make code easier to maintain (#2902) 2025-01-15 04:54:14 -08:00
Muqi Li
5413ec2bbe [Bugfix] Fix bug in fork logic caused by null text_ (#2835) 2025-01-10 13:37:00 -08:00
Xingyao Wang
1acbaf1b5a Add generator-style run_batch function (#2513)
Co-authored-by: openhands <openhands@all-hands.dev>
2025-01-06 15:04:55 -08:00
Adarsh Shirawalmath
acb340728c [Feature] Support new parameter - EBNF in xgrammar (#2526) 2024-12-26 05:12:41 -08:00
SangBin Cho
9208618b3e [Core] in batch prefix caching by delay scheduling (#2442) 2024-12-11 12:51:50 -08:00
Fred Reiss
993956c6b1 Add support for IBM Granite 3.x models (#2437) 2024-12-11 06:30:23 -08:00
Wang Ran (汪然)
867e092f82 using is not not != to test None (#2196) 2024-11-26 01:00:38 -08:00
Henry Hyeonmok Ko
dbe1729395 Merged three native APIs into one: get_server_info (#2152) 2024-11-24 01:37:58 -08:00
Lianmin Zheng
0abbf289a8 Unify the model type checking (#1905) 2024-11-03 12:25:39 -08:00
yizhang2077
d04899d7ca stop_str of qwen2-vl template should be a tuple not a str (#1834)
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
2024-10-29 20:30:41 +00:00
Yanyi Liu
5e6c32657e Support setting use_thread in the run_program for easier debugging. (#1823)
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
2024-10-29 06:51:47 +00:00
Lianmin Zheng
9084a86445 Update links (#1805) 2024-10-26 04:46:01 -07:00
Liangsheng Yin
94cde10920 Llama3.2 vision model support (#1551) 2024-10-21 15:01:21 -07:00
Yineng Zhang
cbbc82b7b8 Support qwen2 vl model (#1721)
Co-authored-by: yizhang2077 <1109276519@qq.com>
Co-authored-by: ispobock <ISPObaoke@163.com>
2024-10-19 21:44:38 -07:00
Byron Hsu
2422de5193 Support min_tokens in sgl.gen (#1573) 2024-10-05 21:51:12 -07:00
Byron Hsu
34c32d2820 Fix styling (#1583) 2024-10-05 17:52:14 -07:00
Byron Hsu
dde8bb16fe default sampling param should be deepcopied (#1581) 2024-10-05 17:27:43 -07:00
Lianmin Zheng
4e4459b91f Multiple minor fixes (#1530) 2024-09-28 14:43:35 -07:00