Commit Graph

284 Commits

Author SHA1 Message Date
sglang-bot
1053e1be17 chore: bump SGLang version to 0.5.4 (#12027)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-10-23 18:01:40 -07:00
nvjullin
9a71500cfb Fixed aarch64 flash-mla (#12009) 2025-10-23 17:47:04 -07:00
HAI
65d376b491 aiter update to v0.1.6.post1 (#12004) 2025-10-22 23:53:05 -07:00
fzyzcjy
8612811d85 Bump grace blackwell DeepEP version (#11990) 2025-10-22 21:08:12 -07:00
Baizhou Zhang
ebff4ee648 Update sgl-kernel and remove fast hadamard dependency (#11844) 2025-10-21 13:13:54 -07:00
Meng, Hengyu
b113c72e7a Init attention backend for Intel XPU (#10656)
Co-authored-by: guangyey <guangye.yu@intel.com>
Co-authored-by: DiweiSun <105627594+DiweiSun@users.noreply.github.com>
2025-10-21 11:41:28 +08:00
fzyzcjy
9e3be1fa2a Tiny bump DeepEP version in ARM blackwell (#11810) 2025-10-20 08:15:14 +08:00
kyleliang-nv
fda0cb2a30 Fix Dockerfile not installing correct version of DeepEP for arm build (#11773) 2025-10-18 15:06:05 -07:00
sglang-bot
85ebeecf06 chore: bump SGLang version to 0.5.3.post3 (#11693)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-10-16 13:14:55 -07:00
Even Zhou
3cceaa381a [Bugfix] Fix Qwen3/DSV3/DSV3.2 model support (#11510) 2025-10-16 15:14:09 +08:00
DiweiSun
4c03dbaaef [CI][XPU]enable sglang CI on Intel XPU (#9493)
Co-authored-by: huaiyuzh <huaiyu.zheng@intel.com>
Co-authored-by: Ma Mingfei <mingfei.ma@intel.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-10-15 17:13:19 -07:00
sglang-bot
baf277a9bf chore: bump SGLang version to 0.5.3.post2 (#11680)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-10-15 16:49:14 -07:00
Yineng Zhang
86373b9e48 fix: Update SGL_KERNEL_VERSION to 0.3.15 (#11633) 2025-10-14 14:45:28 -07:00
Arthur Cheng
6dc9ca8c85 [router] Add BRANCH_TYPE=local support to Dockerfile.router for local builds (#11571) 2025-10-13 16:10:51 -07:00
Zaili Wang
0a304870e8 fix Xeon CI (#11454) 2025-10-11 14:08:28 -07:00
Baizhou Zhang
8b85926a6e Remove tilelang dependency in Dockerfile (#11455) 2025-10-10 23:17:53 -07:00
Zaili Wang
f19613e6c3 Dedicated toml files for CPU/XPU (#10734) 2025-10-10 00:44:55 -07:00
sglang-bot
758b887ad1 chore: bump SGLang version to 0.5.3.post1 (#11324) 2025-10-09 15:19:59 -07:00
gongwei-130
4aeb193fbd disable sm100 for FlashMLA and fast-hadamard-transform in cuda12.6.1 (#11274) 2025-10-06 14:48:31 -07:00
sglang-bot
a4a3d82393 chore: bump SGLang version to 0.5.3 (#11263) 2025-10-06 20:07:02 +08:00
sglang-bot
0b13cbb7c9 chore: bump SGLang version to 0.5.3rc2 (#11259)
Co-authored-by: sglang-bot <sglang-bot@users.noreply.github.com>
2025-10-06 01:12:10 -07:00
Baizhou Zhang
292a867ad9 Add flashmla and fast hadamard transform to Dockerfile (#11235) 2025-10-05 21:31:28 -07:00
jacky.cheng
b00a0c786f [Fix] Update to v0.1.5.post4 and refine HIP attention backend selection (#11161) 2025-10-02 21:19:30 -07:00
gongwei-130
0618ad6dd5 fix: shouldn't include CUDA_ARCH 100 and 120 for cuda12.6.1 (#11176) 2025-10-02 13:24:23 -07:00
ishandhanani
47488cc353 docker: x86 dev builds for hopper and blackwell (#11075) 2025-10-01 00:06:38 -07:00
Hubert Lu
01a26544a3 [AMD] Add Tilelang and Fast Hadamard Transform builds to Dockerfile.rocm (#11114) 2025-09-30 20:00:37 -07:00
hlu1
592ddf374f Add simple docker file for B300 (#10944)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
2025-09-26 17:26:57 -07:00
ishandhanani
adba172fd1 ci: free space on workers for build (#10786)
Co-authored-by: zhyncs <me@zhyncs.com>
2025-09-24 02:58:22 -07:00
sogalin
7135db5d31 [ROCm] Update aiter to v0.1.5.post3 (#10812) 2025-09-23 10:54:12 -07:00
ishandhanani
b06db198ba followup: clean up dockerfiles and release yamls (#10783) 2025-09-23 00:19:46 -07:00
ishandhanani
1c82d9db28 feat: unify dockerfiles (#10705)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-09-22 23:23:48 -07:00
Shangming Cai
74cd6e3902 chore: upgrade mooncake 0.3.6.post1 to fix gb200 dockerfile (#10681)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-20 00:12:26 -07:00
Baizhou Zhang
3fa3c22ae2 Fix fast decode plan for flashinfer v0.4.0rc1 and upgrade sgl-kernel 0.3.11 (#10634)
Co-authored-by: zhyncs <me@zhyncs.com>
2025-09-19 01:25:29 -07:00
Yi Zhang
e07b21ceaf update deepep version for qwen3-next deepep moe (#10624) 2025-09-18 11:35:22 -07:00
Shangming Cai
60fc5b51f6 chore: upgrade mooncake 0.3.6 (#10596)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-18 00:19:30 -07:00
HAI
d500eb9173 aiter v0.1.5.post2 (#10563) 2025-09-17 22:10:45 -07:00
kyleliang-nv
e1d45bc280 Fix decord dependency for aarch64 docker build (#10529) 2025-09-16 17:34:37 -07:00
Yineng Zhang
c0c6f543e4 chore: upgrade sgl-kernel 0.3.10 (#10500) 2025-09-16 02:00:53 -07:00
Yineng Zhang
86a32bb5cd chore: bump v0.5.3rc0 (#10468) 2025-09-15 03:55:18 -07:00
Yineng Zhang
5afd036533 feat: support pip install sglang (#10465) 2025-09-15 03:09:17 -07:00
kk
eca59f96c3 Update ROCm docker image to add sgl-router support (#10406)
Co-authored-by: Colin Wang <kangwang@amd.com>
2025-09-13 10:52:18 -07:00
Even Zhou
16cd550c85 Support Qwen3-Next on Ascend NPU (#10379) 2025-09-12 16:31:37 -07:00
Yineng Zhang
cef11e9a55 fix: resolve gb200 image link (#10343) 2025-09-12 13:46:02 -07:00
Yineng Zhang
b0d25e72c4 chore: bump v0.5.2 (#10221) 2025-09-11 16:09:20 -07:00
Yineng Zhang
bfe01a5eef chore: upgrade v0.3.9.post2 sgl-kernel (#10297) 2025-09-11 04:10:29 -07:00
Zaili Wang
ef959d7b85 [CPU] fix OOM when mem-fraction is not set (#9090) 2025-09-10 23:52:22 -07:00
BourneSun0527
4aa1e69bc7 [chore]Add sgl-router to npu images (#10229) 2025-09-10 23:51:16 -07:00
Lianmin Zheng
bcf1955f7e Revert "chore: upgrade v0.3.9 sgl-kernel" (#10245) 2025-09-09 19:05:20 -07:00
Yineng Zhang
d3ee70985f chore: upgrade v0.3.9 sgl-kernel (#10220) 2025-09-09 03:16:25 -07:00
ishandhanani
148022fc36 gb200: update dockerfile to latest kernel (#9522) 2025-09-08 17:32:36 -07:00