500 Commits

Author SHA1 Message Date
Lianmin Zheng
a17e70f5cc Use more general heuristics to set the default value of --mem-fraction-static (#10975)
Co-authored-by: sglang-bot <sglangbot@gmail.com>
2025-09-29 10:11:03 -07:00
Xiaoyu Zhang
6f16bf9d9d [Ci Monitor] Auto uploaded performance data to sglang_ci_data repo (#10976) 2025-09-29 16:17:27 +08:00
Xiaoyu Zhang
11965b0daf Fix sgl-kernel benchmark dead code (#11022) 2025-09-29 15:06:40 +08:00
Kangyan-Zhou
0c9174108a Unify SGL Kernel Releases (#10701) 2025-09-28 19:48:28 -07:00
Xiaoyu Zhang
2387c22b56 Ci monitor support performance (#10965) 2025-09-27 09:11:21 +08:00
Mick
777eb53897 ci: refactor nightly test (#10495) 2025-09-26 15:24:30 -07:00
Xiaoyu Zhang
05a3526654 Restruct gpu_memory_settings in a unify function and relax max_cuda_graph_bs (#10372)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: sglang-bot <sglangbot@gmail.com>
2025-09-26 15:10:49 -07:00
Mick
fff7fbabe6 ci: fix rate-limit of huggingface with hf auth login (#10947) 2025-09-26 11:02:44 -07:00
Sahithi Chigurupati
c3d2ad4ee6 CI: Fix docker manifest build (#10936) 2025-09-25 23:22:55 -07:00
Lianmin Zheng
3e95aa1a09 Remove pull_request trigger from CI monitor workflow (#10932) 2025-09-25 19:40:38 -07:00
Xiaoyu Zhang
c4197e99bb [ci] add ci-monitor workflow (#10898) 2025-09-25 19:29:47 -07:00
ishandhanani
adba172fd1 ci: free space on workers for build (#10786)
Co-authored-by: zhyncs <me@zhyncs.com>
2025-09-24 02:58:22 -07:00
Lianmin Zheng
b1f0fc1c0b Add CI timeout guidelines (#10829) 2025-09-23 22:08:02 -07:00
Shangming Cai
23632d350c Fix latest main ci (#10799)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-23 12:46:13 -07:00
Simo Lin
ddab4fc7c7 [router] fix cache aware routing strategy and lock contention (#10773) 2025-09-23 08:53:49 -07:00
ishandhanani
b06db198ba followup: clean up dockerfiles and release yamls (#10783) 2025-09-23 00:19:46 -07:00
ishandhanani
1c82d9db28 feat: unify dockerfiles (#10705)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-09-22 23:23:48 -07:00
Simo Lin
c3a1d7759f [router] remove pd router draining channel (#10767) 2025-09-22 20:49:33 -07:00
Simo Lin
7ca1bea63d [router] update ci so only execute benchmarks when labels are added (#10757) 2025-09-22 13:23:07 -07:00
sglang-bot
fc3e542009 Update release-docs.yml (#10706) 2025-09-21 00:22:21 -07:00
Yineng Zhang
ba94b82986 fix: update run_suite (#10685) 2025-09-20 01:22:06 -07:00
Shangming Cai
74cd6e3902 chore: upgrade mooncake 0.3.6.post1 to fix gb200 dockerfile (#10681)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-20 00:12:26 -07:00
Yineng Zhang
6f993e8b9e chore: cleanup docker image (#10671) 2025-09-19 16:56:49 -07:00
Shangming Cai
60fc5b51f6 chore: upgrade mooncake 0.3.6 (#10596)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-18 00:19:30 -07:00
kyleliang-nv
e1d45bc280 Fix decord dependency for aarch64 docker build (#10529) 2025-09-16 17:34:37 -07:00
fzyzcjy
ae4be601c2 Fix CI when sgl-kernel is changed but srt is not changed (#10515) 2025-09-16 02:49:54 -07:00
Yineng Zhang
5207424014 chore: bump v0.3.10 sgl-kernel (#10478) 2025-09-15 15:20:09 -07:00
Sahithi Chigurupati
79acec4fe7 [CI] Fix runner for sgl-kernel (#9887)
Signed-off-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
2025-09-15 10:55:48 -07:00
Yineng Zhang
5afd036533 feat: support pip install sglang (#10465) 2025-09-15 03:09:17 -07:00
Lianmin Zheng
50dc0c1e9c Run tests based on labels (#10456) 2025-09-15 00:29:20 -07:00
Lianmin Zheng
f73aae0bfc Update GITHUB_TOKEN secret for documentation push (#10458) 2025-09-14 21:59:13 -07:00
Lianmin Zheng
b354e3c90d [CI] Fix token key in label-pr.yml workflow (#10452) 2025-09-14 20:45:53 -07:00
Lianmin Zheng
65e6f48ce4 Update permissions in label-pr.yml (#10450) 2025-09-14 20:41:43 -07:00
Lianmin Zheng
0ec580a86c Fix label PR (#10445) 2025-09-14 20:33:09 -07:00
Lianmin Zheng
8f6a175803 Fix label pr for ci (#10441) 2025-09-14 19:48:06 -07:00
Lianmin Zheng
b7d385e812 automatically label pr for ci (#10435) 2025-09-14 19:13:11 -07:00
Jintao Zhang
f9ee6ae17a [router]: Add Embedding routing logic (#10129)
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
Co-authored-by: Waël Boukhobza <wawa_wael@live.fr>
2025-09-14 18:44:35 -07:00
Yineng Zhang
7ce6c10eb6 fix: enable cu124 and cu128 build on main push (#10431) 2025-09-14 16:19:35 -07:00
fzyzcjy
e3cf812f7d Fix sgl-kernel + srt CI (#10419) 2025-09-14 01:44:47 -07:00
fzyzcjy
a0f844ed5a Let sgl-kernel changes be tested on srt (#10313) 2025-09-14 01:09:17 -07:00
Even Zhou
16cd550c85 Support Qwen3-Next on Ascend NPU (#10379) 2025-09-12 16:31:37 -07:00
Yineng Zhang
9d775b1a2d feat: add deepseek v3 fp4 ut (#10391) 2025-09-12 15:43:29 -07:00
Simo Lin
07bcad7fb7 [bug] fix router ci syntax error (#10390) 2025-09-12 14:39:15 -07:00
Simo Lin
8c86595c93 [router] enable sccache in ci and local build (#10099) 2025-09-12 09:43:48 -07:00
Yineng Zhang
b3839a7f99 fix: resolve transfer_kv_all_layer_direct_lf_pf import error (#10360) 2025-09-11 23:53:23 -07:00
Keyang Ru
7b141f816c [router][ci] Add gpu utilization analyze with nvml (#10345) 2025-09-11 19:26:02 -07:00
Yineng Zhang
b0d25e72c4 chore: bump v0.5.2 (#10221) 2025-09-11 16:09:20 -07:00
Keyang Ru
1ee11df8ac [router][ci] add gpu process check and free port before start server (#10338) 2025-09-11 14:24:16 -07:00
Keyang Ru
480d1b8b20 [router] add benchmark for regular router and pd router (#10280) 2025-09-11 12:04:11 -07:00
Yineng Zhang
bfe01a5eef chore: upgrade v0.3.9.post2 sgl-kernel (#10297) 2025-09-11 04:10:29 -07:00