241 Commits

Author SHA1 Message Date
Mick
8bd26dd4e6 ci: fix night-ci with push retry mechanism (#11765) 2025-10-23 11:31:05 -07:00
Johnny
e7aa4664b3 [NVIDIA] Build CUDA 13 (#11299)
Co-authored-by: ishandhanani <ishandhanani@gmail.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
2025-10-22 20:03:12 -07:00
jiahanc
eec9e471ca [NVIDIA] Update to leverage flashinfer trtllm FP4 MOE throughput kernel (#11563)
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
2025-10-22 13:11:16 -07:00
Baizhou Zhang
ebff4ee648 Update sgl-kernel and remove fast hadamard depedency (#11844) 2025-10-21 13:13:54 -07:00
Shangming Cai
f3cd5d2510 [CI] Fix b200 flashinfer installation (#11915)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-10-21 22:28:50 +08:00
Shangming Cai
05d3667ab9 [CI] disable glm4.1v and fix the flashinfer installation (#11902)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-10-21 18:38:35 +08:00
Hank Han
c1e1600373 [fix] fix ci uv install dependency (#11895) 2025-10-21 16:23:34 +08:00
Xiaoyu Zhang
984fbeb16b Revert "[CI Monitor] Ci monitor only deal with main branch in default" (#11846) 2025-10-19 22:06:40 -07:00
Kangyan-Zhou
53529f46cc Fix version bump script to handle TOML files with outdated versions (#11787)
Co-authored-by: Claude <noreply@anthropic.com>
2025-10-19 18:10:26 -07:00
Xiaoyu Zhang
24ed3f32c0 fix(ci): Fix CI Monitor limit parameter and add CI Analysis to summary (#11832) 2025-10-19 18:08:34 -07:00
b8zhong
f4f8a1b4d8 ci: update lmms-eval to speed up multimodal CI (#11000) 2025-10-19 02:51:19 +08:00
Even Zhou
3cceaa381a [Bugfix] Fix Qwen3/DSV3/DSV3.2 model support (#11510) 2025-10-16 15:14:09 +08:00
Fan Yin
3289da5b41 [sgl-kernel] support hadamard (#11663) 2025-10-15 19:00:44 -07:00
Lianmin Zheng
cd7e1bd591 Sync code and test CI; rename some env vars (#11686) 2025-10-15 18:37:03 -07:00
Xun Sun
a40229f6f8 [1/N] Introduce Mooncake Backend and Mooncake EP to Support Elastic EP (#10423)
Co-authored-by: Hank Han <hanhan7630@outlook.com>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
2025-10-14 19:40:54 -07:00
Sai Enduri
1d08653972 [AMD CI] Add image and weights caching. (#11593) 2025-10-14 02:51:35 -07:00
Xiaoyu Zhang
8e51049f56 [CI Monitor] Ci monitor only deal with main branch in default (#11538) 2025-10-13 13:50:04 -07:00
Baizhou Zhang
9f1f699a7a [CI] Add Basic Test for DeepSeek V3.2 (#11308) 2025-10-13 11:41:02 -07:00
Xiaoyu Zhang
6806c4e63e [CI monitor] Improve CI analyzer: fix job failure tracking and add CUDA-focused filtering (#11505) 2025-10-13 13:31:09 +08:00
Yineng Zhang
05f015f65f chore: remove flashinfer cleanup cache (#11514) 2025-10-12 14:56:33 -07:00
Lianmin Zheng
5a6ec8f999 Fix unit tests (#11503) 2025-10-12 07:45:57 -07:00
Lianmin Zheng
548a57b1f3 Fix port conflicts in CI (#11497) 2025-10-12 06:46:36 -07:00
Liangsheng Yin
20a6c0a63d Beta spec-overlap for EAGLE (#11398)
Co-authored-by: Lianmin Zheng <15100009+merrymercy@users.noreply.github.com>
Co-authored-by: Hanming Lu <69857889+hanming-lu@users.noreply.github.com>
2025-10-12 11:02:22 +08:00
Lianmin Zheng
61055cb309 Reorder PD disagg CI tests (#11438) 2025-10-10 17:56:49 -07:00
Yineng Zhang
4299aebdbb chore: update pyproject (#11420) 2025-10-10 00:56:30 -07:00
Yineng Zhang
d8467db727 fix: reinstall torch in deps install (#11414) 2025-10-09 22:58:18 -07:00
Yineng Zhang
44cb060785 chore: upgrade flashinfer 0.4.0 (#11364) 2025-10-09 14:17:54 -07:00
Lianmin Zheng
0e7b353009 Fix code sync scripts (#11276) 2025-10-06 15:35:01 -07:00
Kangyan-Zhou
8fd41eae93 Improve bot release workflow (#11240) 2025-10-05 21:28:27 -07:00
Lianmin Zheng
366a603e95 Use cu128 for torch audio to fix some CI tests (#11251) 2025-10-05 19:52:32 -07:00
Kangyan-Zhou
a20fc7b7dc Create two new GH workflows to automatically bump SGLang and Kernel version (#10996) 2025-10-05 18:14:05 -07:00
Lianmin Zheng
d645ae90a3 Rename runner labels (#11228) 2025-10-05 18:05:41 -07:00
Cheng Wan
41763ba079 Remove gdrcopy check in ci_install_deepep.sh (#11237) 2025-10-05 17:35:22 -07:00
sunxxuns
5e142484e2 [Fix AMD CI] VRAM cleanup (#11174)
Co-authored-by: root <root@smci350-zts-gtu-e17-15.zts-gtu.dcgpu>
2025-10-05 19:03:53 -04:00
fzyzcjy
fdc4e1e570 Tiny move files to utils folder (#11166) 2025-10-03 22:40:06 +08:00
Xiaoyu Zhang
6f16bf9d9d [Ci Monitor] Auto uploaded performance data to sglang_ci_data repo (#10976) 2025-09-29 16:17:27 +08:00
Xiaoyu Zhang
2387c22b56 Ci monitor support performance (#10965) 2025-09-27 09:11:21 +08:00
Mick
777eb53897 ci: refactor nightly test (#10495) 2025-09-26 15:24:30 -07:00
Mick
fff7fbabe6 ci: fix rate-limit of huggingface with hf auth login (#10947) 2025-09-26 11:02:44 -07:00
Xiaoyu Zhang
c1f39013b7 [ci feature] add ci monitor (#10872) 2025-09-24 23:16:29 -07:00
Lianmin Zheng
38c00ed7a1 Fix multimodal registry and code sync scripts (#10759)
Co-authored-by: cctry <shiyang@x.ai>
2025-09-22 15:36:01 -07:00
Shangming Cai
74cd6e3902 chore: upgrade mooncake 0.3.6.post1 to fix gb200 dockerfile (#10681)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-20 00:12:26 -07:00
Yi Zhang
e07b21ceaf update deepep version for qwen3-next deepep moe (#10624) 2025-09-18 11:35:22 -07:00
Teng Ma
77098aea7b [HiCache] Add tests for hicache storage mooncake backend (#10171)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
Co-authored-by: hzh0425 <hzh0425@apache.org>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
2025-09-18 01:07:16 +08:00
Yineng Zhang
5c08d7d21d fix: resolve sgl-kernel ut (#10476) 2025-09-15 11:42:48 -07:00
Yineng Zhang
5afd036533 feat: support pip install sglang (#10465) 2025-09-15 03:09:17 -07:00
Jintao Zhang
f9ee6ae17a [router]: Add Embedding routing logic (#10129)
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
Co-authored-by: Waël Boukhobza <wawa_wael@live.fr>
2025-09-14 18:44:35 -07:00
fzyzcjy
a0f844ed5a Let sgl-kernel changes be tested on srt (#10313) 2025-09-14 01:09:17 -07:00
fzyzcjy
abea9250da Auto determine sgl kernel version in blackwell CI (#10318) 2025-09-14 01:06:30 -07:00
Even Zhou
16cd550c85 Support Qwen3-Next on Ascend NPU (#10379) 2025-09-12 16:31:37 -07:00