353 Commits

Author SHA1 Message Date
Hubert Lu
3b3f1e3aeb [AMD] Add unit-test-sgl-kernel-amd to AMD CI (#7539) 2025-06-29 15:50:09 -07:00
Simo Lin
7c0db3a6c5 [bugfix] Remove PR comment posting from Rust benchmark workflow (#7625) 2025-06-28 22:10:01 -07:00
Keyang Ru
29bd4c8135 [CI] Add CI Testing for Prefill-Decode Disaggregation with Router (#7540) 2025-06-27 00:18:56 -07:00
Simo Lin
3abc30364d [ci] add router benchmark script and CI (#7498) 2025-06-25 01:28:25 -07:00
Lianmin Zheng
55e03b10c4 Fix a bug in BatchTokenIDOut & Misc style and dependency updates (#7457) 2025-06-23 06:20:39 -07:00
Yineng Zhang
4d8d9b8efd chore: upgrade mooncake-transfer-engine 0.3.4 (#7401) 2025-06-20 16:38:54 -07:00
ybyang
906dbc34f1 [Docker] optimize dockerfile remove deepep and blackwell merge it to… (#7343)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-06-19 17:42:40 -07:00
DiweiSun
8a10c4c3d9 update ci node for xeon (#7265) 2025-06-16 23:44:08 -07:00
Sai Enduri
62a7aa2efc Update CI flakes. (#7244) 2025-06-16 15:19:32 -07:00
Yineng Zhang
7df7c679b6 feat: use zstd for docker (#7205) 2025-06-14 23:13:29 -07:00
Yineng Zhang
4473320380 chore: bump v0.1.8.post2 (#7189) 2025-06-14 17:01:48 -07:00
Arthur Cheng
baa6624d7c [CI] Add CI workflow for sgl-router docker build (#7027) 2025-06-09 23:16:44 -07:00
Yineng Zhang
1c8b42c84c chore: update pr test xeon (#7018) 2025-06-09 17:36:25 -07:00
Yineng Zhang
7059ae16fb chore: update pr test xeon (#7008) 2025-06-09 10:08:44 -07:00
Yineng Zhang
56ccd3c22c chore: upgrade flashinfer v0.2.6.post1 jit (#6958)
Co-authored-by: alcanderian <alcanderian@gmail.com>
Co-authored-by: Qiaolin Yu <qy254@cornell.edu>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: ispobock <ispobaoke@gmail.com>
2025-06-09 09:22:39 -07:00
Sai Enduri
2c18642502 Enable more unit tests for AMD CI. (#6983) 2025-06-08 19:41:55 -07:00
Yineng Zhang
6c0a48282a chore: bump sgl-kernel v0.1.7 (#6963) 2025-06-08 02:43:15 -07:00
Hubert Lu
4740288303 [AMD] Add more tests to per-commit-amd (#6926) 2025-06-08 01:08:37 -07:00
Sai Enduri
77e928d00e Update server timeout time in AMD CI. (#6953) 2025-06-07 15:10:27 -07:00
HAI
b819381fec AITER backend extension and workload optimizations (#6838)
Co-authored-by: wunhuang <wunhuang@amd.com>
Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>
2025-06-05 23:00:18 -07:00
Zaili Wang
562f279a2d [CPU] enable CI for PRs, add Dockerfile and auto build task (#6458)
Co-authored-by: diwei sun <diwei.sun@intel.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-06-05 13:43:54 -07:00
fzyzcjy
0166403c20 Support Blackwell DeepEP docker images (#6868) 2025-06-05 00:07:53 -07:00
Junrong Lin
2103b80607 [CI] update verlengine ci to 4-gpu test (#6007) 2025-05-27 14:32:23 -07:00
Yineng Zhang
fc419b62e8 Revert "Tiny fix lint CI does not trigger on master (#6609)" (#6610) 2025-05-25 22:52:34 -07:00
fzyzcjy
84147254c9 Tiny fix lint CI does not trigger on master (#6609) 2025-05-25 22:47:03 -07:00
Shenggui Li
3f23d8cdf1 added support for tied weights in qwen pipeline parallelism (#6546) 2025-05-25 00:00:56 -07:00
kk
7a5e6ce1cb Fix GPU OOM (#6564)
Co-authored-by: michael <michael.zhang@amd.com>
2025-05-24 16:38:39 -07:00
Sai Enduri
24c035f2e3 Temporarily disable MI325x 8 gpu testing. (#6576) 2025-05-24 16:37:22 -07:00
fzyzcjy
505eec4dc9 Tiny make Lint CI show diff (#6445) 2025-05-21 02:06:25 -07:00
HAI
5c0b38f369 aiter attention-backend (default enabled on AMD/ROCm) (#6381) 2025-05-20 22:52:41 -07:00
fzyzcjy
f11481b921 Add 4-GPU runner tests and split existing tests (#6383) 2025-05-18 11:56:51 -07:00
Sai Enduri
c47a51db7e Clean up AMD CI (#6365) 2025-05-18 01:17:28 -07:00
Lianmin Zheng
e07a6977e7 Minor improvements of TokenizerManager / health check (#6327) 2025-05-15 15:29:25 -07:00
Sai Enduri
73eb67c087 Enable unit tests for AMD CI. (#6283) 2025-05-14 12:55:36 -07:00
Sai Enduri
0f5cb8cae1 Enable MI325X AMD CI. (#6259) 2025-05-13 01:49:33 -07:00
Sai Enduri
983c663de6 Update AMD nightly deps. (#6241) 2025-05-12 13:39:20 -07:00
Ying Sheng
bad7c26fdc [PP] Fix init_memory_pool desync & add PP for mixtral (#6223) 2025-05-12 12:38:09 -07:00
Sai Enduri
7d3a3d4510 Update AMD CI docker to v0.4.6.post3-rocm630. (#6213) 2025-05-12 00:00:46 -07:00
Lianmin Zheng
6ea05950b1 Fix release-docs.yml to not use python 3.9 (#6204) 2025-05-11 16:04:55 -07:00
fzyzcjy
e9a47f4cb5 Add dev-deepep docker image (#6198) 2025-05-11 13:17:55 -07:00
Lianmin Zheng
03227c5fa6 [CI] Reorganize the 8 gpu tests (#6192) 2025-05-11 10:55:06 -07:00
Lianmin Zheng
17c36c5511 [CI] Disabled deepep tests temporarily because it takes too much time. (#6186) 2025-05-10 23:40:50 -07:00
shangmingc
31d1f6e7f4 [PD] Add simple unit test for disaggregation feature (#5654)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
2025-05-11 13:35:27 +08:00
Lianmin Zheng
de167cf5fa Fix request abortion (#6184) 2025-05-10 21:54:46 -07:00
Lianmin Zheng
4319978c73 Fix data parallel perf regression (#6183) 2025-05-10 19:18:35 -07:00
Sai Enduri
dff0ab92eb Update amd nightly concurrency. (#6141) 2025-05-09 00:02:14 -07:00
XinyuanTong
e88dd482ed [CI]Add performance CI for VLM (#6038)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-05-07 19:20:03 -07:00
Cheng Wan
9bddf1c82d Deferring 8 GPU test (#6102) 2025-05-07 18:49:58 -07:00
Lianmin Zheng
38053c3372 Fix the timeout for 8 gpu tests (#6084) 2025-05-07 03:13:12 -07:00
Johnny
cb69194562 feat: add release workflow for SGLang kernels on aarch64 (#6010)
Co-authored-by: Qiaolin-Yu <liin1211@outlook.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-05-06 19:42:07 -07:00