sglang

Author	SHA1	Message	Date
Brayden Zhong	b149b39353	[CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder (#3969)	2025-03-27 19:45:02 -07:00
wangyu	1ce4878d31	feat(remote_model): support variable remote backend for model loader (#3964) Signed-off-by: wangyu <wangyu.steph@bytedance.com>	2025-03-14 00:40:44 -07:00
Qiaolin Yu	85d2365d33	Fix the output of hidden states after HTTP requests (#4269)	2025-03-13 14:54:06 -07:00
Chitsing KUI	959a3143fc	example: add async offline inference demo (#3961) Signed-off-by: joeshikui <joeshikui@tencent.com> Co-authored-by: joeshikui <joeshikui@tencent.com>	2025-03-12 21:41:21 -07:00
simveit	007f8b3dc2	Added example for multimodal embedding (#4206) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-03-10 00:53:56 -07:00
Qiaolin Yu	357671e216	Add examples for server token-in-token-out (#4103) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-03-05 13:16:31 -08:00
Mick	583d6af71b	example: add vlm to token in & out example (#3941) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-03-04 22:18:26 -08:00
Qiaolin Yu	4725e3f652	Add examples for returning hidden states when using the server (#4074)	2025-03-04 19:31:50 -08:00
Lianmin Zheng	935cda944b	Misc clean up; Remove the support of jump forward (#4032)	2025-03-03 07:02:14 -08:00
Chayenne	728e175fc4	Add examples to token-in-token-out for LLM (#4010)	2025-03-02 21:03:49 -08:00
Qiaolin Yu	40782f05d7	Refactor: Move return_hidden_states to the generate input (#3985) Co-authored-by: Beichen-Ma <mabeichen12@gmail.com>	2025-03-01 17:51:29 -08:00
fzyzcjy	e3e0bc50a9	[Feature] SPMD for SGLang + Verl (#3852)	2025-02-28 09:53:10 -08:00
KCFindstr	bc20e93f2d	[feat] Add Vertex AI compatible prediction route for /generate (#3866)	2025-02-27 19:42:15 -08:00
Qiaolin Yu	d6898dd253	Add return hidden state in the native API (#3897) Co-authored-by: Beichen-Ma <mabeichen12@gmail.com> Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-02-26 22:06:54 -08:00
simveit	acd1a15921	Docs: Implemented frontend docs (#3791) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-02-26 15:30:05 -08:00
Shi Shuai	e074e76b31	docs: Add offline engine launch example and documentation (#3771)	2025-02-21 11:25:52 -08:00
Shenggui Li	fb4c9c3a30	[fix] added support for vlm in offline inference (#3548)	2025-02-15 05:27:29 +08:00
Yineng Zhang	013021b6a1	refactor EAGLE 2 (#3269) Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: merrymercy <lianminzheng@gmail.com> Co-authored-by: Ying1123 <sqy1415@gmail.com>	2025-02-03 20:52:30 +08:00
Lianmin Zheng	cd493b5afc	Improve metrics, logging, and importing orders (#2992)	2025-01-19 18:36:59 -08:00
Lianmin Zheng	61f42b5732	Move sgl.Runtime under sglang/lang (#2990)	2025-01-19 17:10:29 -08:00
yukavio	815dce0554	Eagle speculative decoding part 4: Add EAGLE2 worker (#2150) Co-authored-by: kavioyu <kavioyu@tencent.com> Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2025-01-02 03:22:34 -08:00
Qun Yang	37ee906f61	Add more support for intel Gaudi accelerators (#2357)	2024-12-06 01:16:33 -08:00
James Xu	9d427265fd	Add Engine::encode example (#2000)	2024-11-11 13:43:35 -08:00
Chayenne	c77c1e05ba	fix black in pre-commit (#1940)	2024-11-08 07:42:47 +08:00
Xuehai Pan	a5e0defb5a	minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926)	2024-11-06 13:46:04 +00:00
Chayenne	02755768d3	Change judge to classify & Modify make file (#1920)	2024-11-04 23:53:44 -08:00
Byron Hsu	6fcd6d7d6d	Support token ids in `engine.generate` (#1820)	2024-10-27 14:02:34 -07:00
Byron Hsu	862cd265e5	[engine] support async and streaming (#1614)	2024-10-11 15:26:25 -07:00
Byron Hsu	551a3a9d38	Provide an offline engine API (#1567)	2024-10-06 20:27:03 -07:00
Theresa Barton	2c7d0a5b8b	[Fix] Fix all the Huggingface paths (#1553)	2024-10-02 10:12:07 -07:00
Ying Sheng	0f4fb19bc8	[Fix, LoRA] fix LoRA with updates in main (#1545)	2024-09-30 10:06:08 -07:00
Lianmin Zheng	4e4459b91f	Multiple minor fixes (#1530)	2024-09-28 14:43:35 -07:00
Ying Sheng	9aa6553d2a	[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525)	2024-09-27 23:32:11 -07:00
Ying Sheng	e4780cf839	[API, Feature] Support response prefill for openai API (#1490)	2024-09-22 06:46:17 -07:00
Li Bo	446ea33277	fix: creat new dict everytime for putting new frame (#1464)	2024-09-19 01:31:48 -07:00
Ying Sheng	37963394aa	[Feature] Support LoRA path renaming and add LoRA serving benchmarks (#1433)	2024-09-15 12:46:04 -07:00
Kaichen Zhang - NTU	662ecd9368	[Feat] Add modalities for vision server when handling pixel values for llava (#1346)	2024-09-09 02:07:34 -07:00
Lianmin Zheng	0a97d7962d	[Fix] Fix OOM in llava base class (#1249)	2024-08-28 08:45:49 -07:00
Kaichen Zhang - NTU	66e7dcaf70	[Fix] Fixing the multi-images error for llava-onevision (#1205)	2024-08-25 10:28:23 -07:00
Lianmin Zheng	f6af3a6561	Cleanup readme, llava examples, usage examples and nccl init (#1194)	2024-08-24 08:02:23 -07:00

40 Commits