sglang

Author	SHA1	Message	Date
Yudi Xue	14c18d25df	Frontend language separate reasoning support (#6031)	2025-06-10 17:11:29 -07:00
Chuyue Sun	fad86a6863	Support `n` in OpenAI API completions (#3446) Co-authored-by: Shan Yu <shanyu1@g.ucla.edu> Co-authored-by: Yineng Zhang <me@zhyncs.com> Co-authored-by: chuyue sun <chuyue@lmsys.us-northcentral1-a.compute.internal>	2025-03-20 13:46:46 +08:00
Muqi Li	5413ec2bbe	[Bugfix] Fix bug in fork logic caused by null text_ (#2835)	2025-01-10 13:37:00 -08:00
Xingyao Wang	1acbaf1b5a	Add generator-style run_batch function (#2513) Co-authored-by: openhands <openhands@all-hands.dev>	2025-01-06 15:04:55 -08:00
Yanyi Liu	5e6c32657e	Support setting `use_thread` in the `run_program` for easier debugging. (#1823) Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>	2024-10-29 06:51:47 +00:00
Byron Hsu	2422de5193	Support min_tokens in sgl.gen (#1573)	2024-10-05 21:51:12 -07:00
Byron Hsu	dde8bb16fe	default sampling param should be deepcopied (#1581)	2024-10-05 17:27:43 -07:00
Lianmin Zheng	899cf5c438	Remove deprecated configs (#1431)	2024-09-15 08:52:18 -07:00
Lianmin Zheng	9ba1f09760	[Fix] Fix logprob and normalized_logprob (#1428)	2024-09-15 06:36:06 -07:00
Max Shawabkeh	6def9b018c	Fix hang when doing s += None. (#1297) Co-authored-by: max99x <mshawabkeh@jamandtea.studio>	2024-09-01 21:56:33 -07:00
Enrique Shockwave	6c34d6339c	make json_schema usable from gen (#1254)	2024-08-28 18:57:10 -07:00
intervitens	068e9eae55	Support min-p sampling (#1167)	2024-08-21 22:49:32 +00:00
Liangsheng Yin	73cf6834f2	Support `stop_token_ids` in sglang API (#1092)	2024-08-15 00:31:39 +00:00
Aidan Cooper	94e0115186	Feat: add alternative choices selection methods (#835)	2024-08-05 03:27:49 -07:00
Kai Fronsdal	0c0c81372e	Fix #857 (#858)	2024-08-01 00:05:39 -07:00
ObjectNotFound	daf593a385	Fix streaming bug (#820)	2024-07-30 00:32:07 -07:00
ObjectNotFound	8f6274c82b	Add role documentation, add system begin & end tokens (#793)	2024-07-28 23:02:49 -07:00
Lianmin Zheng	30db99b3d9	Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776)	2024-07-27 19:50:34 -07:00
Max Shawabkeh	5ad033a070	Fix StreamExecutor.fork() losing the current role start index. (#684)	2024-07-20 23:32:11 -07:00
胡译文	02b7258658	[Feat] Expose logprob options to `sgl.gen` API (#503) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2024-07-09 00:35:39 -07:00
Mingyi	c0982ac553	Fix Llava model (#594)	2024-07-06 00:58:46 -07:00
Ying Sheng	fb9296f0ed	Higher priority for user input of max_prefill_tokens & format (#540)	2024-06-12 21:48:40 -07:00
Lianmin Zheng	ced77c6626	Rename api_num_spec_tokens -> num_api_spec_tokens (#458)	2024-05-20 18:44:23 -07:00
Lianmin Zheng	8dbdc018a3	Abort disconnected requests (#457)	2024-05-20 18:41:21 -07:00
Ying Sheng	3e684be7a3	Fix openai speculative execution (#456)	2024-05-20 17:01:13 -07:00
LiviaSun	ec380dfd30	openai chat speculative execution (#250) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-05-18 22:23:53 -07:00
Lianmin Zheng	8210ec60f4	Improve error handling & abort disconnected requests (#449)	2024-05-17 05:49:31 -07:00
Liangsheng Yin	690d162d97	Format code (#441)	2024-05-14 22:40:46 +08:00
Yuanhan Zhang	0992d85f92	support llava video (#426)	2024-05-13 16:57:00 -07:00
Lianmin Zheng	5dc55a5f02	Handle truncation errors (#436)	2024-05-13 15:56:00 -07:00
Lianmin Zheng	562b8857d8	Improve error handling (#433)	2024-05-12 20:49:04 -07:00
Qubitium	33b242df30	Compat with latest VLLM 0.4.2 main + fork.number rename + Flashinfer 0.0.4 (#380) Co-authored-by: ZX <zx@lbx.dev> Co-authored-by: ZhouXingg <165115237+ZhouXingg@users.noreply.github.com>	2024-05-11 16:37:49 -07:00
Liangsheng Yin	d5de20a3ee	Fix `sync()` when `fork(1)` (#412)	2024-05-08 15:15:18 +08:00
Joschka Braun	5c5aba5900	Adding RAG tracing & eval cookbook using Parea (#390)	2024-04-30 16:13:28 -07:00
Liangsheng Yin	150d7020ed	Revert removing the unused imports (#385)	2024-04-23 22:36:33 +08:00
Liangsheng Yin	9acc6e3504	add `.isort.cfg` (#378)	2024-04-22 22:38:09 +08:00
Liangsheng Yin	1bf1cf1953	Reduce overhead when `fork(1)` (#375)	2024-04-21 17:25:14 +08:00
SimoneRaponi	ff99c38a07	Add timeout to get_meta_info (#346) Co-authored-by: simone <simone.raponi@equixely.com>	2024-04-03 22:22:06 +08:00
Junlong Li	cb389c91bc	Fix llava parallelism/fork bug (#315)	2024-03-28 19:24:54 -07:00
Liangsheng Yin	3842eba5fa	Logprobs Refractor (#331)	2024-03-28 14:34:49 +08:00
Lin Tianchuan	30d67b2bca	Add `set_var` to interpreter.py (#263)	2024-03-07 23:20:11 +08:00
Zhang Wenbin	8d0a7fae3b	Fix interpreter.py `get_var(var_name)` in text iter when `stream` is not enabled (#198)	2024-02-24 16:27:34 +08:00
Liangsheng Yin	c4e9ebe3a4	Fix stop str merging (#225) Co-authored-by: Enrique Shockwave <33002121+qeternity@users.noreply.github.com>	2024-02-24 16:05:21 +08:00
Ying Sheng	67be11c790	fix bug of race condition in copy()	2024-02-03 01:38:00 -08:00
Lianmin Zheng	0617528632	Update quick start examples (#120)	2024-01-30 04:29:32 -08:00
parasol-aser	23950056f0	support speculative execution for openai API (#48) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-01-25 01:57:06 -08:00
Liangsheng Yin	01ee0fbc05	fast regex decode Auto-detect constant str path in regex FSM, then extend instead.	2024-01-25 01:16:25 +08:00
Lianmin Zheng	7358fa64f7	Fix a bug in runtime backend	2024-01-23 22:10:17 +00:00
Lianmin Zheng	9a16fea012	Return logprob for choices (#87)	2024-01-23 05:07:30 -08:00
Lianmin Zheng	959c4174b2	Fix the chat template for QWen (#83)	2024-01-22 21:46:47 -08:00

59 Commits