sglang

Author	SHA1	Message	Date
Even Zhou	5b64f006ec	[Feature] Support DeepEP normal & Redundant Experts on NPU (#9881)	2025-09-10 20:35:26 -07:00
Hubert Lu	91b3555d2d	Add tests to AMD CI for MI35x (#9662) Co-authored-by: Sai Enduri <saimanas.enduri@amd.com>	2025-09-10 12:50:05 -07:00
Lzhang-hub	4efe2c57c9	support vlm model spec bench (#10173)	2025-09-10 13:37:04 +08:00
Lianmin Zheng	bcf1955f7e	Revert "chore: upgrade v0.3.9 sgl-kernel" (#10245)	2025-09-09 19:05:20 -07:00
Yineng Zhang	d3ee70985f	chore: upgrade v0.3.9 sgl-kernel (#10220)	2025-09-09 03:16:25 -07:00
Liangsheng Yin	6e95f5e5bd	Simplify `Router` arguments passing and build it in docker image (#9964)	2025-09-05 12:13:55 +08:00
Yineng Zhang	de9217334b	feat: add gpt oss b200 ci (#9988)	2025-09-03 17:26:38 -07:00
Lianmin Zheng	646076b71e	Update guidelines for syncing code between repos (#9831)	2025-08-30 16:10:35 -07:00
Lianmin Zheng	0d04008936	[CI] Code sync tools (#9830)	2025-08-30 16:02:29 -07:00
Chayenne	9b08d975a0	[docs] Refactor, remove compiled results and add gpt-oss (#9613) Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com>	2025-08-25 15:27:06 -07:00
Chang Su	7638f5e44e	[router] Implement gRPC SGLangSchedulerClient (#9364)	2025-08-19 16:44:11 -07:00
Lianmin Zheng	c480a3f6ea	Minor style fixes for sgl-kernel (#9289)	2025-08-18 09:38:35 -07:00
michael-amd	0fc8bf2cd4	[AMD] Update fallback images for AMD CI (#9159) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-08-13 20:15:10 -07:00
li chaoran	2ecbd8b8bf	[feat] add ascend readme and docker release (#8700) Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com> Signed-off-by: lichaoran <pkwarcraft@gmail.com> Co-authored-by: Even Zhou <even.y.zhou@outlook.com> Co-authored-by: ronnie_zheng <zl19940307@163.com>	2025-08-12 13:25:42 -07:00
Yi Zhang	89f1d4f536	update deepep commit to support qwen3-coder (#9066)	2025-08-11 10:42:33 -07:00
Cheng Wan	f003cd3548	[CI] Fix CI tests (#9050) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-08-10 23:52:05 -07:00
Lianmin Zheng	2c7f01bc89	Reorganize CI and test files (#9027)	2025-08-10 12:30:06 -07:00
Lianmin Zheng	706bd69cc5	Clean up server_args.py to have a dedicated function for model specific adjustments (#8983)	2025-08-08 19:56:50 -07:00
michael-amd	23f2afb2ce	[AMD] Update SGLang image fallback logic for AMD CI (#8980)	2025-08-08 18:51:29 -07:00
fzyzcjy	482c3db29f	Fix sgl-kernel arch and missing package in CI (#8869)	2025-08-07 02:08:15 -07:00
michael-amd	4f2e1490c3	[AMD] Pull latest SGLang version for AMD CI (#8787)	2025-08-06 20:20:26 -07:00
Yineng Zhang	cbbd685a46	chore: use torch 2.8 stable (#8880)	2025-08-06 15:51:40 -07:00
Cheng Wan	78aad91037	[CI] fix pip upgrade (#8881)	2025-08-06 15:02:32 -07:00
fzyzcjy	b114a8105b	Support B200 in CI (#8861)	2025-08-06 21:42:44 +08:00
Yineng Zhang	3ae8e3ea8f	chore: upgrade torch 2.8.0 (#8836)	2025-08-05 17:32:01 -07:00
kk	32d9e39a29	Fix potential memory fault issue and ncclSystemError in CI test (#8681) Co-authored-by: wunhuang <wunhuang@amd.com>	2025-08-05 12:19:37 -07:00
Even Zhou	fee0ab0fba	[CI] Ascend NPU CI enhancement (#8294) Co-authored-by: ronnie_zheng <zl19940307@163.com>	2025-08-03 22:16:38 -07:00
li chaoran	fe5086fd8b	chore: speedup NPU CI by cache (#8270) Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com> Co-authored-by: ronnie_zheng <zl19940307@163.com>	2025-07-31 17:29:50 -07:00
Keyang Ru	7c9697178e	[CI]Add genai-bench Performance Validation for PD Router (#8477) Co-authored-by: key4ng <rukeyang@gamil.com>	2025-07-28 16:58:23 -07:00
Shangming Cai	70e37b97bf	chore: upgrade mooncake 0.3.5 (#8341) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-07-25 01:17:26 -07:00
michael-amd	0e5fa67773	[AMD] Pull latest image for AMD CI (#8070)	2025-07-23 17:56:14 -07:00
ronnie_zheng	93d124ef5a	[feature] enable NPU CI (#7935) Co-authored-by: Even Zhou <14368888+iforgetmyname@users.noreply.github.com>	2025-07-20 13:12:42 -07:00
Simo Lin	c8f31042a8	[router] Refactor router and policy traits with dependency injection (#7987) Co-authored-by: Jin Pan <jpan236@wisc.edu> Co-authored-by: Keru Yang <rukeyang@gmail.com> Co-authored-by: Yingyi Huang <yingyihuang2000@outlook.com> Co-authored-by: Philip Zhu <phlipzhux@gmail.com>	2025-07-18 14:24:24 -07:00
Cheng Wan	02404a1e35	[ci] recover 8-gpu deepep test (#8105)	2025-07-17 00:46:40 -07:00
Sai Enduri	f06bd210c0	Update amd docker image. (#8045) Co-authored-by: Hubert Lu <55214931+hubertlu-tw@users.noreply.github.com>	2025-07-15 15:09:56 -07:00
Hank Han	2117f82def	[ci] CI supports use cached models (#7874)	2025-07-14 11:42:21 +00:00
Cheng Wan	d487555f84	[CI] Add deepep tests to CI (#7872)	2025-07-09 01:49:47 -07:00
Kay Yan	975a5ec69c	[fix] update bench_speculative.py for compatibility (#7764) Signed-off-by: Kay Yan <kay.yan@daocloud.io>	2025-07-04 16:32:54 +08:00
Lianmin Zheng	22352d47a9	Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632) Co-authored-by: Kan Wu <wukanustc@gmail.com>	2025-06-29 23:16:19 -07:00
Hubert Lu	3b3f1e3aeb	[AMD] Add unit-test-sgl-kernel-amd to AMD CI (#7539)	2025-06-29 15:50:09 -07:00
Keyang Ru	29bd4c8135	[CI] Add CI Testing for Prefill-Decode Disaggregation with Router (#7540)	2025-06-27 00:18:56 -07:00
Mick	4d67025a1d	chore: improve ci bug reporting (#7542)	2025-06-26 01:32:44 -07:00
Shangming Cai	a07f8ae4b7	[CI] Upgrade mooncake to v0.3.4.post2 to fix potential slice failed bug (#7522) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-06-25 01:49:22 -07:00
Shangming Cai	d6dddc19ff	[CI] Upgrade mooncake to 0.3.4.post1 to fix 8 gpu tests (#7472) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-06-24 02:10:50 +08:00
kk	bd4f581896	Fix torch compile run (#7391) Co-authored-by: wunhuang <wunhuang@amd.com> Co-authored-by: Sai Enduri <saimanas.enduri@amd.com>	2025-06-22 15:33:09 -07:00
Shangming Cai	187b85b7f3	[PD] Optimize custom mem pool usage and bump mooncake version (#7393) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-06-20 09:50:39 -07:00
Lianmin Zheng	0f218731e3	Do not run frontend_reasoning.ipynb to reduce the CI load (#7073)	2025-06-10 17:15:31 -07:00
Yineng Zhang	56ccd3c22c	chore: upgrade flashinfer v0.2.6.post1 jit (#6958) Co-authored-by: alcanderian <alcanderian@gmail.com> Co-authored-by: Qiaolin Yu <qy254@cornell.edu> Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com> Co-authored-by: Mick <mickjagger19@icloud.com> Co-authored-by: ispobock <ispobaoke@gmail.com>	2025-06-09 09:22:39 -07:00
Hubert Lu	4740288303	[AMD] Add more tests to per-commit-amd (#6926)	2025-06-08 01:08:37 -07:00
HAI	b819381fec	AITER backend extension and workload optimizations (#6838) Co-authored-by: wunhuang <wunhuang@amd.com> Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>	2025-06-05 23:00:18 -07:00

190 Commits