sglang

Author	SHA1	Message	Date
wangyu	9f81d741a2	fix: fix MLA for ShardedModelLoader/RemoteModelLoader (#6287) Signed-off-by: wangyu <wangyu.steph@bytedance.com>	2025-08-28 16:10:09 -07:00
wangyu	a38c149758	feat(draft_model): support draft_model for RemoteModelLoader (#6407) Signed-off-by: wangyu <wangyu.steph@bytedance.com>	2025-08-28 16:09:52 -07:00
PGFLMG	b7cd743038	[Feat] QWen-1M context support[2/2]: Update block sparse attention backend (#5949)	2025-08-06 23:49:36 -07:00
Ata Fatahi	1ab6be1b26	Purge VerlEngine (#7326) Signed-off-by: Ata Fatahi <immrata@gmail.com>	2025-06-19 23:47:21 -07:00
Lifu Huang	3cf1473a09	Use monotonic clock for interval measurement (#6211) Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>	2025-05-17 16:49:18 -07:00
XinyuanTong	9d8ec2e67e	Fix and Clean up chat-template requirement for VLM (#6114) Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>	2025-05-11 00:14:09 +08:00
Michael Yao	269c457e05	[Docs] Update runtime/engine/readme.md (#5737) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-04-25 16:39:29 -07:00
Ravi Theja	d2b8d0b8d8	Add example to use sgl engine with fastapi (#5648) Co-authored-by: Ravi Theja Desetty <ravitheja@Ravis-MacBook-Pro.local>	2025-04-24 23:57:05 +08:00
Brayden Zhong	b149b39353	[CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder (#3969)	2025-03-27 19:45:02 -07:00
wangyu	1ce4878d31	feat(remote_model): support variable remote backend for model loader (#3964) Signed-off-by: wangyu <wangyu.steph@bytedance.com>	2025-03-14 00:40:44 -07:00
Chitsing KUI	959a3143fc	example: add async offline inference demo (#3961) Signed-off-by: joeshikui <joeshikui@tencent.com> Co-authored-by: joeshikui <joeshikui@tencent.com>	2025-03-12 21:41:21 -07:00
Qiaolin Yu	357671e216	Add examples for server token-in-token-out (#4103) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-03-05 13:16:31 -08:00
Mick	583d6af71b	example: add vlm to token in & out example (#3941) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-03-04 22:18:26 -08:00
Qiaolin Yu	4725e3f652	Add examples for returning hidden states when using the server (#4074)	2025-03-04 19:31:50 -08:00
Lianmin Zheng	935cda944b	Misc clean up; Remove the support of jump forward (#4032)	2025-03-03 07:02:14 -08:00
Chayenne	728e175fc4	Add examples to token-in-token-out for LLM (#4010)	2025-03-02 21:03:49 -08:00
Qiaolin Yu	40782f05d7	Refactor: Move return_hidden_states to the generate input (#3985) Co-authored-by: Beichen-Ma <mabeichen12@gmail.com>	2025-03-01 17:51:29 -08:00
fzyzcjy	e3e0bc50a9	[Feature] SPMD for SGLang + Verl (#3852)	2025-02-28 09:53:10 -08:00
Qiaolin Yu	d6898dd253	Add return hidden state in the native API (#3897) Co-authored-by: Beichen-Ma <mabeichen12@gmail.com> Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-02-26 22:06:54 -08:00
simveit	acd1a15921	Docs: Implemented frontend docs (#3791) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-02-26 15:30:05 -08:00
Shi Shuai	e074e76b31	docs: Add offline engine launch example and documentation (#3771)	2025-02-21 11:25:52 -08:00
Shenggui Li	fb4c9c3a30	[fix] added support for vlm in offline inference (#3548)	2025-02-15 05:27:29 +08:00
Yineng Zhang	013021b6a1	refactor EAGLE 2 (#3269) Co-authored-by: Ying Sheng <sqy1415@gmail.com> Co-authored-by: merrymercy <lianminzheng@gmail.com> Co-authored-by: Ying1123 <sqy1415@gmail.com>	2025-02-03 20:52:30 +08:00
Lianmin Zheng	cd493b5afc	Improve metrics, logging, and importing orders (#2992)	2025-01-19 18:36:59 -08:00
yukavio	815dce0554	Eagle speculative decoding part 4: Add EAGLE2 worker (#2150) Co-authored-by: kavioyu <kavioyu@tencent.com> Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2025-01-02 03:22:34 -08:00
Qun Yang	37ee906f61	Add more support for intel Gaudi accelerators (#2357)	2024-12-06 01:16:33 -08:00
James Xu	9d427265fd	Add Engine::encode example (#2000)	2024-11-11 13:43:35 -08:00
Chayenne	c77c1e05ba	fix black in pre-commit (#1940)	2024-11-08 07:42:47 +08:00
Xuehai Pan	a5e0defb5a	minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926)	2024-11-06 13:46:04 +00:00
Byron Hsu	6fcd6d7d6d	Support token ids in `engine.generate` (#1820)	2024-10-27 14:02:34 -07:00
Byron Hsu	862cd265e5	[engine] support async and streaming (#1614)	2024-10-11 15:26:25 -07:00

31 Commits