sglang

Author	SHA1	Message	Date
fzyzcjy	fd28640dc5	Add `update_weights_from_tensor` (#2631 )	2024-12-28 13:30:27 -08:00
Lianmin Zheng	a6ca736c8e	Simplify stream_output (#2398 )	2024-12-08 12:27:13 -08:00
Chayenne	983bfcf386	Online weight updates from torch.distributed (#2279 )	2024-12-01 23:23:18 -08:00
Chayenne	7d1485d376	Add get weights by parameter name for llama (#2266 )	2024-11-29 23:36:38 -08:00
Chayenne	7d5d1d3d29	udate weights from disk (#2265 )	2024-11-30 01:17:00 +00:00
Lianmin Zheng	fe97a2d40f	Simplify tokenizer manager (#2254 )	2024-11-29 02:18:51 -08:00
Rin Intachuen	1aea19f64b	Input_embeds support (#2052 )	2024-11-25 16:35:04 -08:00
Ying Sheng	e1e595d702	[feat] Refactor session control interface and add CI (#2173 )	2024-11-25 12:32:51 -08:00
Xuehai Pan	62a4a339eb	docs: fix module docstrings and copyright headers (#2077 )	2024-11-22 22:16:53 +08:00
Ying Sheng	5942dfc00a	[feat] Add session control (#2073 )	2024-11-20 00:36:53 -08:00
Lianmin Zheng	befc6beb86	Fix a typo in io_struct.py (#2008 )	2024-11-11 16:34:10 -08:00
Chayenne	c77c1e05ba	fix black in pre-commit (#1940 )	2024-11-08 07:42:47 +08:00
Lianmin Zheng	2ce32db6fb	Let reward model take text inputs instead of message lists (#1907 ) Co-authored-by: Kyle Corbitt <kyle@corbt.com>	2024-11-03 13:27:12 -08:00
Lianmin Zheng	c17c578108	Simplify tokenizer manager (#1904 )	2024-11-03 08:38:26 -08:00
Lianmin Zheng	838dcda162	Simplify tokenizer manager (#1899 )	2024-11-03 03:52:38 -08:00
Lianmin Zheng	fb99aaa527	[Fix] Fix --skip-tokenizer-init (#1798 )	2024-10-25 18:51:59 -07:00
Ying Sheng	2fce449b1c	[API] add get memory pool size (#1760 ) Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>	2024-10-23 07:02:29 +00:00
Lianmin Zheng	7feba41584	Fix failed ci tests on long prompts; Better error messages for embedding models (#1700 )	2024-10-17 09:23:29 -07:00
Ying Sheng	2725f8da61	[Minor] Rename no_eos_trim to no_stop_trim (#1661 )	2024-10-13 20:30:03 -07:00
Ying Sheng	4876117171	[Fix] fix eos trim inconsistency (#1650 )	2024-10-13 01:07:09 -07:00
科英	bbd72bfc86	Add the ability to enable and disable the Profiler via HTTP API. (#1626 )	2024-10-11 02:34:25 -07:00
Yiding-Lu	b503881bd2	[Bug] Fix the Image Input of Batch Generation (#1579 )	2024-10-11 02:25:04 -07:00
Byron Hsu	521f862d90	Fix runtime.generate when sampling param is not passed (#1582 )	2024-10-05 17:59:05 -07:00
Ying Sheng	f202ed9712	[Refactor] Simplify io_struct and tokenizer_manager (#1549 )	2024-10-01 10:25:32 -07:00
Lianmin Zheng	f86c1e611f	Move scheduler code from tp_worker.py to scheduler.py (#1538 )	2024-09-29 17:42:45 -07:00
Liangsheng Yin	fd9ad817ec	Organize image inputs (#1531 )	2024-09-29 06:28:55 +00:00
Ying Sheng	9aa6553d2a	[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525 )	2024-09-27 23:32:11 -07:00
Xiao Yu	5752f25eef	Fixed n>1 causing list index out of range with VLM (#1449 )	2024-09-18 00:46:32 -07:00
Lianmin Zheng	9ba1f09760	[Fix] Fix logprob and normalized_logprob (#1428 )	2024-09-15 06:36:06 -07:00
Ying Sheng	712216928f	[Feature] Initial support for multi-LoRA serving (#1307 )	2024-09-12 16:46:14 -07:00
wangchao	fec2d1223c	[Fix] fix bug of `undefined is_single` in meth `create_abort_task` (#1370 )	2024-09-10 01:17:37 -07:00
Kaichen Zhang - NTU	662ecd9368	[Feat] Add modalities for vision server when handling pixel values for llava (#1346 )	2024-09-09 02:07:34 -07:00
Zhiqiang Xie	8153168c96	fix data racing due to mutable reference using deepcopy (#1255 )	2024-08-28 18:57:54 -07:00
Lianmin Zheng	bf53bf5142	[Fix] Fix llava on multi images (#1247 )	2024-08-28 06:33:05 -07:00
Liangsheng Yin	83e23c69b3	Improve code style of sampler (#1168 )	2024-08-21 16:48:24 -07:00
Shan Yu	cd10654e7e	[Feat] Support update weights without restart server (#1157 )	2024-08-20 13:48:24 -07:00
yichuan~	b997a18d74	[Feat]Add support for optional start len of logprobs (#1035 ) Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: Yineng Zhang <me@zhyncs.com> Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>	2024-08-18 23:45:41 -07:00
Lianmin Zheng	cdc8d60752	Improve the code style: more comments and remove useless packages (#1139 )	2024-08-17 14:37:52 -07:00
Ying Sheng	b68c4c073b	fix: force max new tokens to be 1 for embedding request (#1019 )	2024-08-10 13:46:42 -07:00
Ying Sheng	b16e856f11	Add openai embedding API (#997 )	2024-08-09 11:19:18 -07:00
Ying Sheng	20a4f927dc	Add io struct for embedding models [unreachable code] - step 2/3 (#987 )	2024-08-08 07:52:31 +00:00
yichuan~	d53dcf9c98	Support more OpenAI API test (#916 )	2024-08-04 16:43:09 -07:00
Liangsheng Yin	cdcbde5fc3	Code structure refactor (#807 )	2024-07-29 23:04:48 -07:00
yichuan~	084fa54d37	Add support for OpenAI API : offline batch(file) processing (#699 ) Co-authored-by: hnyls2002 <hnyls2002@gmail.com>	2024-07-29 13:07:18 -07:00
Yineng Zhang	dd7e8b9421	chore: add copyright for srt (#790 )	2024-07-28 23:07:12 +10:00
Lianmin Zheng	30db99b3d9	Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776 )	2024-07-27 19:50:34 -07:00
Yineng Zhang	6b32bb1c0b	misc: format (#741 )	2024-07-26 21:00:51 +10:00
Toshiki Kataoka	da504445dc	fix /generate without sampling_params (#734 )	2024-07-26 01:27:56 -07:00
Liangsheng Yin	f424e76d96	Fix illegal tokens during sampling (#676 )	2024-07-20 03:11:15 -07:00
Lianmin Zheng	35759efa91	Support random dataset in bench_serving.py (#669 )	2024-07-20 01:06:43 -07:00

81 Commits