74 Commits

Author SHA1 Message Date
Lianmin Zheng
cdddab056c [Auto Sync] Update xgrammar_backend.py (20250913) (#10395)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-12 17:46:56 -07:00
Lianmin Zheng
2269cf1e2f [Auto Sync] Update base_grammar_backend.py, llguidance_back... (20250911) (#10333)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-09-12 12:55:55 -07:00
Lianmin Zheng
86d10d220f Update grok.py and tiktoken tokenizer (#9532) 2025-08-23 05:40:18 -07:00
Hubert Lu
9c3e95d98b [AMD] Expand test coverage for AMD CI and enable apply_token_bitmask_inplace_cuda in sgl-kernel (#8268) 2025-08-15 12:32:51 -07:00
Chang Su
dd487e5553 bugfix: Fix XGrammar backend to use model's EOS tokens for constrained generation (#8422) 2025-07-28 10:01:02 +08:00
Lianmin Zheng
20fd53b8f6 Correctly abort the failed grammar requests & Improve the handling of abort (#6803) 2025-06-01 19:00:07 -07:00
Lianmin Zheng
d18c6b3358 Support incremental streaming of logprob/token_ids between scheduler and detokenizer (#6225)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2025-05-12 14:33:38 -07:00
Lianmin Zheng
fba8eccd7e Log if cuda graph is used & extend cuda graph capture to cuda-graph-max-bs (#6201)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2025-05-12 00:17:33 -07:00
Lianmin Zheng
01bdbf7f80 Improve structured outputs: fix race condition, server crash, metrics and style (#6188) 2025-05-11 08:36:16 -07:00
Yixin Dong
911f3ba6f4 upgrade xgrammar to 0.1.19 (#6129) 2025-05-08 14:42:02 -07:00
Michał Moskal
bdbe5f816b update llguidance to 0.7.11; adds StructTag (#4870) 2025-04-26 20:13:57 -07:00
kyle-pena-kuzco
9f3bd2ad39 Feat: Implement JSON Mode (response_format.type="json_object") (#4733)
Co-authored-by: Kyle Pena <kylepena@kyles-macbook-pro.turkey-marlin.ts.net>
2025-04-20 17:41:22 -07:00
JieXin Liang
bca832c7c6 [Fix] fix outlines and xgrammar (#4947) 2025-04-20 13:31:25 -07:00
mlmz
7c5658c189 feat: disable grammar restrictions within reasoning sections (#4984)
Co-authored-by: tianhaoyu <thy@mail.ecust.edu.cn>
Co-authored-by: DarkSharpness <2040703891@qq.com>
2025-04-07 21:46:47 -07:00
Lianmin Zheng
4ede6770cd Fix retract for page size > 1 (#4914) 2025-03-30 02:57:15 -07:00
DarkSharpness
19120f71f3 [Fix & Style] Refactor the grammar backend to reduce human errors and improve readability (#4030) 2025-03-04 03:56:45 -08:00
Lianmin Zheng
935cda944b Misc clean up; Remove the support of jump forward (#4032) 2025-03-03 07:02:14 -08:00
Lianmin Zheng
ac2387279e Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
2025-03-03 00:12:04 -08:00
mlmz
bac414ab53 [Feature] integrate Structural Tag in xgrammar backend for function calling (#3566)
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
2025-02-27 23:33:41 -08:00
Enrique Shockwave
d281587989 Improve: Support xgrammar 0.1.14 (#3593) 2025-02-27 08:42:54 -08:00
JC1DA
7551498a69 [Feature] Support llguidance for constrained decoding (#3298) 2025-02-26 10:41:49 -08:00
Lianmin Zheng
27a46317b6 Fix dependency (#3813) 2025-02-24 03:50:58 -08:00
Yineng Zhang
85986bb978 compatible with new outlines (#3435) 2025-02-10 01:51:30 +08:00
HAI
566d61d90f ROCm: bump 6.3.0 (#3259) 2025-02-03 04:13:40 +08:00
Lianmin Zheng
61f42b5732 Move sgl.Runtime under sglang/lang (#2990) 2025-01-19 17:10:29 -08:00
Enrique Shockwave
3bcf5ecea7 support regex in xgrammar backend (#2983) 2025-01-20 04:34:41 +08:00
Adarsh Shirawalmath
acb340728c [Feature] Support new parameter - EBNF in xgrammar (#2526) 2024-12-26 05:12:41 -08:00
Lianmin Zheng
641b7d0ae0 [Minor] Improve code style (#2422) 2024-12-09 06:30:35 -08:00
Lianmin Zheng
0e7409adb6 Fix the overlap for xgrammar (#2377) 2024-12-06 05:49:29 -08:00
gobraves
906d795f15 Feat: upgrade outlines & support compatibility with the old version (#2292) 2024-12-01 02:07:27 -08:00
Yixin Dong
7f076c2ce6 Update XGrammar to the latest API (#2176)
Co-authored-by: Ben Gitter <gitterbd@gmail.com>
2024-11-25 15:58:30 -08:00
Xuehai Pan
62a4a339eb docs: fix module docstrings and copyright headers (#2077) 2024-11-22 22:16:53 +08:00
Lianmin Zheng
7d671e4ad2 Enable overlap by default (#2067) 2024-11-19 22:07:58 -08:00
DarkSharpness
9c745d078e [Performance] Update xgrammar-related constrained decoding (#2056) 2024-11-17 16:58:49 -08:00
Lianmin Zheng
c722d9bdc3 Fix dependency and error message for xgrammar (#2024) 2024-11-13 14:04:25 -08:00
Lianmin Zheng
218ab3611d Do not let invalid grammar crash the server (#2023) 2024-11-13 11:39:16 -08:00
Lianmin Zheng
54479d6f30 Fix grammar backend for tensor parallelism (#2020) 2024-11-13 01:49:45 -08:00
Lianmin Zheng
ba069a24d3 Fix grammar backend (#2018) 2024-11-12 21:17:38 -08:00
DarkSharpness
125b1199c5 support parallel grammar preprocessing (#1996)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
2024-11-12 08:45:28 -08:00
DarkSharpness
b77a02cdfd [Performance] Support both xgrammar and outlines for constrained decoding (#1752) 2024-10-25 21:47:02 +00:00
zolinthecow
72e7b57a75 [Bug] Catch any errors caused by parsing json schema (#1776) 2024-10-24 01:54:53 -07:00
Ninglin Du
840c5dbcb3 [FIX] Catch syntax error of Regex Guide to avoid crash (#1521) 2024-09-28 12:42:06 -07:00
zifeitong
93dffd699b Add constrained_json_whitespace_pattern to ServerArgs (#1438) 2024-09-16 13:29:18 -07:00
Lianmin Zheng
3a6e8b6d78 [Minor] move triton attention kernels into a separate folder (#1379) 2024-09-10 15:15:08 -07:00
zifeitong
9144ed1067 Support OpenAI API json_schema response format (#1363) 2024-09-09 19:08:25 -07:00
Lianmin Zheng
e4d68afcf0 [Minor] Many cleanup (#1357) 2024-09-09 04:14:11 -07:00
Enrique Shockwave
32a4141d5a Allow new lines during JSON generation (#1277) 2024-09-01 03:42:29 -07:00
havetc
9935f97b3e [FEAT] JSON constrained support (#1125)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2024-08-26 09:37:26 -07:00
Liangsheng Yin
e205527cb1 Fix jump forward final state circular path bug. (#1084) 2024-08-13 21:14:05 -07:00
Lianmin Zheng
c877292cc1 Re-organize CI tests (#1052) 2024-08-12 03:39:01 -07:00