sglang

Author	SHA1	Message	Date
Lianmin Zheng	c480a3f6ea	Minor style fixes for sgl-kernel (#9289)	2025-08-18 09:38:35 -07:00
michael-amd	0fc8bf2cd4	[AMD] Update fallback images for AMD CI (#9159) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-08-13 20:15:10 -07:00
li chaoran	2ecbd8b8bf	[feat] add ascend readme and docker release (#8700) Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com> Signed-off-by: lichaoran <pkwarcraft@gmail.com> Co-authored-by: Even Zhou <even.y.zhou@outlook.com> Co-authored-by: ronnie_zheng <zl19940307@163.com>	2025-08-12 13:25:42 -07:00
Yi Zhang	89f1d4f536	update deepep commit to support qwen3-coder (#9066)	2025-08-11 10:42:33 -07:00
Cheng Wan	f003cd3548	[CI] Fix CI tests (#9050) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-08-10 23:52:05 -07:00
Lianmin Zheng	2c7f01bc89	Reorganize CI and test files (#9027)	2025-08-10 12:30:06 -07:00
Lianmin Zheng	706bd69cc5	Clean up server_args.py to have a dedicated function for model specific adjustments (#8983)	2025-08-08 19:56:50 -07:00
michael-amd	23f2afb2ce	[AMD] Update SGLang image fallback logic for AMD CI (#8980)	2025-08-08 18:51:29 -07:00
fzyzcjy	482c3db29f	Fix sgl-kernel arch and missing package in CI (#8869)	2025-08-07 02:08:15 -07:00
michael-amd	4f2e1490c3	[AMD] Pull latest SGLang version for AMD CI (#8787)	2025-08-06 20:20:26 -07:00
Yineng Zhang	cbbd685a46	chore: use torch 2.8 stable (#8880)	2025-08-06 15:51:40 -07:00
Cheng Wan	78aad91037	[CI] fix pip upgrade (#8881)	2025-08-06 15:02:32 -07:00
fzyzcjy	b114a8105b	Support B200 in CI (#8861)	2025-08-06 21:42:44 +08:00
Yineng Zhang	3ae8e3ea8f	chore: upgrade torch 2.8.0 (#8836)	2025-08-05 17:32:01 -07:00
kk	32d9e39a29	Fix potential memory fault issue and ncclSystemError in CI test (#8681) Co-authored-by: wunhuang <wunhuang@amd.com>	2025-08-05 12:19:37 -07:00
Even Zhou	fee0ab0fba	[CI] Ascend NPU CI enhancement (#8294) Co-authored-by: ronnie_zheng <zl19940307@163.com>	2025-08-03 22:16:38 -07:00
li chaoran	fe5086fd8b	chore: speedup NPU CI by cache (#8270) Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com> Co-authored-by: ronnie_zheng <zl19940307@163.com>	2025-07-31 17:29:50 -07:00
Keyang Ru	7c9697178e	[CI]Add genai-bench Performance Validation for PD Router (#8477) Co-authored-by: key4ng <rukeyang@gamil.com>	2025-07-28 16:58:23 -07:00
Shangming Cai	70e37b97bf	chore: upgrade mooncake 0.3.5 (#8341) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-07-25 01:17:26 -07:00
michael-amd	0e5fa67773	[AMD] Pull latest image for AMD CI (#8070)	2025-07-23 17:56:14 -07:00
ronnie_zheng	93d124ef5a	[feature] enable NPU CI (#7935) Co-authored-by: Even Zhou <14368888+iforgetmyname@users.noreply.github.com>	2025-07-20 13:12:42 -07:00
Simo Lin	c8f31042a8	[router] Refactor router and policy traits with dependency injection (#7987) Co-authored-by: Jin Pan <jpan236@wisc.edu> Co-authored-by: Keru Yang <rukeyang@gmail.com> Co-authored-by: Yingyi Huang <yingyihuang2000@outlook.com> Co-authored-by: Philip Zhu <phlipzhux@gmail.com>	2025-07-18 14:24:24 -07:00
Cheng Wan	02404a1e35	[ci] recover 8-gpu deepep test (#8105)	2025-07-17 00:46:40 -07:00
Sai Enduri	f06bd210c0	Update amd docker image. (#8045) Co-authored-by: Hubert Lu <55214931+hubertlu-tw@users.noreply.github.com>	2025-07-15 15:09:56 -07:00
Hank Han	2117f82def	[ci] CI supports use cached models (#7874)	2025-07-14 11:42:21 +00:00
Cheng Wan	d487555f84	[CI] Add deepep tests to CI (#7872)	2025-07-09 01:49:47 -07:00
Kay Yan	975a5ec69c	[fix] update bench_speculative.py for compatibility (#7764) Signed-off-by: Kay Yan <kay.yan@daocloud.io>	2025-07-04 16:32:54 +08:00
Lianmin Zheng	22352d47a9	Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632) Co-authored-by: Kan Wu <wukanustc@gmail.com>	2025-06-29 23:16:19 -07:00
Hubert Lu	3b3f1e3aeb	[AMD] Add unit-test-sgl-kernel-amd to AMD CI (#7539)	2025-06-29 15:50:09 -07:00
Keyang Ru	29bd4c8135	[CI] Add CI Testing for Prefill-Decode Disaggregation with Router (#7540)	2025-06-27 00:18:56 -07:00
Mick	4d67025a1d	chore: improve ci bug reporting (#7542)	2025-06-26 01:32:44 -07:00
Shangming Cai	a07f8ae4b7	[CI] Upgrade mooncake to v0.3.4.post2 to fix potential slice failed bug (#7522) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-06-25 01:49:22 -07:00
Shangming Cai	d6dddc19ff	[CI] Upgrade mooncake to 0.3.4.post1 to fix 8 gpu tests (#7472) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-06-24 02:10:50 +08:00
kk	bd4f581896	Fix torch compile run (#7391) Co-authored-by: wunhuang <wunhuang@amd.com> Co-authored-by: Sai Enduri <saimanas.enduri@amd.com>	2025-06-22 15:33:09 -07:00
Shangming Cai	187b85b7f3	[PD] Optimize custom mem pool usage and bump mooncake version (#7393) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-06-20 09:50:39 -07:00
Lianmin Zheng	0f218731e3	Do not run frontend_reasoning.ipynb to reduce the CI load (#7073)	2025-06-10 17:15:31 -07:00
Yineng Zhang	56ccd3c22c	chore: upgrade flashinfer v0.2.6.post1 jit (#6958) Co-authored-by: alcanderian <alcanderian@gmail.com> Co-authored-by: Qiaolin Yu <qy254@cornell.edu> Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com> Co-authored-by: Mick <mickjagger19@icloud.com> Co-authored-by: ispobock <ispobaoke@gmail.com>	2025-06-09 09:22:39 -07:00
Hubert Lu	4740288303	[AMD] Add more tests to per-commit-amd (#6926)	2025-06-08 01:08:37 -07:00
HAI	b819381fec	AITER backend extension and workload optimizations (#6838) Co-authored-by: wunhuang <wunhuang@amd.com> Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>	2025-06-05 23:00:18 -07:00
Lianmin Zheng	20fd53b8f6	Correctly abort the failed grammar requests & Improve the handling of abort (#6803)	2025-06-01 19:00:07 -07:00
Sai Enduri	f4a8987f69	Update amd docker and nightly models. (#6687)	2025-05-28 00:08:08 -07:00
Yineng Zhang	f77da69964	chore: upgrade mooncake-transfer-engine (#6643)	2025-05-26 20:01:30 -07:00
Sai Enduri	eb8f02dd87	Update nightly thresholds and dependencies. (#6635)	2025-05-26 11:44:13 -07:00
fzyzcjy	25be63d0b2	Auto handle PD disaggregation in bench_serving (#6587) Co-authored-by: yizhang2077 <1109276519@qq.com>	2025-05-25 22:41:27 -07:00
fzyzcjy	d502dae0f0	Tiny change killall_sglang.sh (#6596)	2025-05-25 22:36:51 -07:00
kk	7a5e6ce1cb	Fix GPU OOM (#6564) Co-authored-by: michael <michael.zhang@amd.com>	2025-05-24 16:38:39 -07:00
Byron Hsu	2d831c6ef9	[PD] Support structured output (#6560)	2025-05-23 21:49:00 -07:00
Byron Hsu	8233cc10fd	[PD] Support logprob & Add failure test (#6558)	2025-05-23 14:29:20 -07:00
HAI	5c0b38f369	aiter attention-backend (default enabled on AMD/ROCm) (#6381)	2025-05-20 22:52:41 -07:00
Yineng Zhang	eabcf82acb	feat: add long context example (#6391)	2025-05-18 01:45:17 -07:00

179 Commits