Yudi Xue | 14c18d25df | Frontend language separate reasoning support (#6031) | 2025-06-10 17:11:29 -07:00
Chuyue Sun | fad86a6863 | Support n in OpenAI API completions (#3446); Co-authored-by: Shan Yu <shanyu1@g.ucla.edu>; Co-authored-by: Yineng Zhang <me@zhyncs.com>; Co-authored-by: chuyue sun <chuyue@lmsys.us-northcentral1-a.compute.internal> | 2025-03-20 13:46:46 +08:00
Muqi Li | 5413ec2bbe | [Bugfix] Fix bug in fork logic caused by null text_ (#2835) | 2025-01-10 13:37:00 -08:00
Xingyao Wang | 1acbaf1b5a | Add generator-style run_batch function (#2513); Co-authored-by: openhands <openhands@all-hands.dev> | 2025-01-06 15:04:55 -08:00
Yanyi Liu | 5e6c32657e | Support setting use_thread in the run_program for easier debugging. (#1823); Co-authored-by: Byron Hsu <byronhsu1230@gmail.com> | 2024-10-29 06:51:47 +00:00
Byron Hsu | 2422de5193 | Support min_tokens in sgl.gen (#1573) | 2024-10-05 21:51:12 -07:00
Byron Hsu | dde8bb16fe | default sampling param should be deepcopied (#1581) | 2024-10-05 17:27:43 -07:00
Lianmin Zheng | 899cf5c438 | Remove deprecated configs (#1431) | 2024-09-15 08:52:18 -07:00
Lianmin Zheng | 9ba1f09760 | [Fix] Fix logprob and normalized_logprob (#1428) | 2024-09-15 06:36:06 -07:00
Max Shawabkeh | 6def9b018c | Fix hang when doing s += None. (#1297); Co-authored-by: max99x <mshawabkeh@jamandtea.studio> | 2024-09-01 21:56:33 -07:00
Enrique Shockwave | 6c34d6339c | make json_schema usable from gen (#1254) | 2024-08-28 18:57:10 -07:00
intervitens | 068e9eae55 | Support min-p sampling (#1167) | 2024-08-21 22:49:32 +00:00
Liangsheng Yin | 73cf6834f2 | Support stop_token_ids in sglang API (#1092) | 2024-08-15 00:31:39 +00:00
Aidan Cooper | 94e0115186 | Feat: add alternative choices selection methods (#835) | 2024-08-05 03:27:49 -07:00
Kai Fronsdal | 0c0c81372e | Fix #857 (#858) | 2024-08-01 00:05:39 -07:00
ObjectNotFound | daf593a385 | Fix streaming bug (#820) | 2024-07-30 00:32:07 -07:00
ObjectNotFound | 8f6274c82b | Add role documentation, add system begin & end tokens (#793) | 2024-07-28 23:02:49 -07:00
Lianmin Zheng | 30db99b3d9 | Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776) | 2024-07-27 19:50:34 -07:00
Max Shawabkeh | 5ad033a070 | Fix StreamExecutor.fork() losing the current role start index. (#684) | 2024-07-20 23:32:11 -07:00
胡译文 | 02b7258658 | [Feat] Expose logprob options to sgl.gen API (#503); Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2024-07-09 00:35:39 -07:00
Mingyi | c0982ac553 | Fix Llava model (#594) | 2024-07-06 00:58:46 -07:00
Ying Sheng | fb9296f0ed | Higher priority for user input of max_prefill_tokens & format (#540) | 2024-06-12 21:48:40 -07:00
Lianmin Zheng | ced77c6626 | Rename api_num_spec_tokens -> num_api_spec_tokens (#458) | 2024-05-20 18:44:23 -07:00
Lianmin Zheng | 8dbdc018a3 | Abort disconnected requests (#457) | 2024-05-20 18:41:21 -07:00
Ying Sheng | 3e684be7a3 | Fix openai speculative execution (#456) | 2024-05-20 17:01:13 -07:00
LiviaSun | ec380dfd30 | openai chat speculative execution (#250); Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-05-18 22:23:53 -07:00
Lianmin Zheng | 8210ec60f4 | Improve error handling & abort disconnected requests (#449) | 2024-05-17 05:49:31 -07:00
Liangsheng Yin | 690d162d97 | Format code (#441) | 2024-05-14 22:40:46 +08:00
Yuanhan Zhang | 0992d85f92 | support llava video (#426) | 2024-05-13 16:57:00 -07:00
Lianmin Zheng | 5dc55a5f02 | Handle truncation errors (#436) | 2024-05-13 15:56:00 -07:00
Lianmin Zheng | 562b8857d8 | Improve error handling (#433) | 2024-05-12 20:49:04 -07:00
Qubitium | 33b242df30 | Compat with latest VLLM 0.4.2 main + fork.number rename + Flashinfer 0.0.4 (#380); Co-authored-by: ZX <zx@lbx.dev>; Co-authored-by: ZhouXingg <165115237+ZhouXingg@users.noreply.github.com> | 2024-05-11 16:37:49 -07:00
Liangsheng Yin | d5de20a3ee | Fix sync() when fork(1) (#412) | 2024-05-08 15:15:18 +08:00
Joschka Braun | 5c5aba5900 | Adding RAG tracing & eval cookbook using Parea (#390) | 2024-04-30 16:13:28 -07:00
Liangsheng Yin | 150d7020ed | Revert removing the unused imports (#385) | 2024-04-23 22:36:33 +08:00
Liangsheng Yin | 9acc6e3504 | add .isort.cfg (#378) | 2024-04-22 22:38:09 +08:00
Liangsheng Yin | 1bf1cf1953 | Reduce overhead when fork(1) (#375) | 2024-04-21 17:25:14 +08:00
SimoneRaponi | ff99c38a07 | Add timeout to get_meta_info (#346); Co-authored-by: simone <simone.raponi@equixely.com> | 2024-04-03 22:22:06 +08:00
Junlong Li | cb389c91bc | Fix llava parallelism/fork bug (#315) | 2024-03-28 19:24:54 -07:00
Liangsheng Yin | 3842eba5fa | Logprobs Refractor (#331) | 2024-03-28 14:34:49 +08:00
Lin Tianchuan | 30d67b2bca | Add set_var to interpreter.py (#263) | 2024-03-07 23:20:11 +08:00
Zhang Wenbin | 8d0a7fae3b | Fix interpreter.py get_var(var_name) in text iter when stream is not enabled (#198) | 2024-02-24 16:27:34 +08:00
Liangsheng Yin | c4e9ebe3a4 | Fix stop str merging (#225); Co-authored-by: Enrique Shockwave <33002121+qeternity@users.noreply.github.com> | 2024-02-24 16:05:21 +08:00
Ying Sheng | 67be11c790 | fix bug of race condition in copy() | 2024-02-03 01:38:00 -08:00
Lianmin Zheng | 0617528632 | Update quick start examples (#120) | 2024-01-30 04:29:32 -08:00
parasol-aser | 23950056f0 | support speculative execution for openai API (#48); Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2024-01-25 01:57:06 -08:00
Liangsheng Yin | 01ee0fbc05 | fast regex decode; Auto-detect constant str path in regex FSM, then extend instead. | 2024-01-25 01:16:25 +08:00
Lianmin Zheng | 7358fa64f7 | Fix a bug in runtime backend | 2024-01-23 22:10:17 +00:00
Lianmin Zheng | 9a16fea012 | Return logprob for choices (#87) | 2024-01-23 05:07:30 -08:00
Lianmin Zheng | 959c4174b2 | Fix the chat template for QWen (#83) | 2024-01-22 21:46:47 -08:00