262 Commits

Author SHA1 Message Date
jacky.cheng
b00a0c786f [Fix] Update to v0.1.5.post4 and refine HIP attention backend selection (#11161) 2025-10-02 21:19:30 -07:00
gongwei-130
0618ad6dd5 fix: shoudn't include CUDA_ARCH 100 and 120 for cuda12.6.1 (#11176) 2025-10-02 13:24:23 -07:00
ishandhanani
47488cc353 docker: x86 dev builds for hopper and blackwell (#11075) 2025-10-01 00:06:38 -07:00
Hubert Lu
01a26544a3 [AMD] Add Tilelang and Fast Hadamard Transform builds to Dockerfile.rocm (#11114) 2025-09-30 20:00:37 -07:00
hlu1
592ddf374f Add simple docker file for B300 (#10944)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
2025-09-26 17:26:57 -07:00
ishandhanani
adba172fd1 ci: free space on workers for build (#10786)
Co-authored-by: zhyncs <me@zhyncs.com>
2025-09-24 02:58:22 -07:00
sogalin
7135db5d31 [ROCm] Update aiter to v0.1.5.post3 (#10812) 2025-09-23 10:54:12 -07:00
ishandhanani
b06db198ba followup: clean up dockerfiles and release yamls (#10783) 2025-09-23 00:19:46 -07:00
ishandhanani
1c82d9db28 feat: unify dockerfiles (#10705)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-09-22 23:23:48 -07:00
Shangming Cai
74cd6e3902 chore: upgrade mooncake 0.3.6.post1 to fix gb200 dockerfile (#10681)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-20 00:12:26 -07:00
Baizhou Zhang
3fa3c22ae2 Fix fast decode plan for flashinfer v0.4.0rc1 and upgrade sgl-kernel 0.3.11 (#10634)
Co-authored-by: zhyncs <me@zhyncs.com>
2025-09-19 01:25:29 -07:00
Yi Zhang
e07b21ceaf update deepep version for qwen3-next deepep moe (#10624) 2025-09-18 11:35:22 -07:00
Shangming Cai
60fc5b51f6 chore: upgrade mooncake 0.3.6 (#10596)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
2025-09-18 00:19:30 -07:00
HAI
d500eb9173 aiter v0.1.5.post2 (#10563) 2025-09-17 22:10:45 -07:00
kyleliang-nv
e1d45bc280 Fix decord dependency for aarch64 docker build (#10529) 2025-09-16 17:34:37 -07:00
Yineng Zhang
c0c6f543e4 chore: upgrade sgl-kernel 0.3.10 (#10500) 2025-09-16 02:00:53 -07:00
Yineng Zhang
86a32bb5cd chore: bump v0.5.3rc0 (#10468) 2025-09-15 03:55:18 -07:00
Yineng Zhang
5afd036533 feat: support pip install sglang (#10465) 2025-09-15 03:09:17 -07:00
kk
eca59f96c3 Update ROCm docker image to add sgl-router support (#10406)
Co-authored-by: Colin Wang <kangwang@amd.com>
2025-09-13 10:52:18 -07:00
Even Zhou
16cd550c85 Support Qwen3-Next on Ascend NPU (#10379) 2025-09-12 16:31:37 -07:00
Yineng Zhang
cef11e9a55 fix: resolve gb200 image link (#10343) 2025-09-12 13:46:02 -07:00
Yineng Zhang
b0d25e72c4 chore: bump v0.5.2 (#10221) 2025-09-11 16:09:20 -07:00
Yineng Zhang
bfe01a5eef chore: upgrade v0.3.9.post2 sgl-kernel (#10297) 2025-09-11 04:10:29 -07:00
Zaili Wang
ef959d7b85 [CPU] fix OOM when mem-fraction is not set (#9090) 2025-09-10 23:52:22 -07:00
BourneSun0527
4aa1e69bc7 [chore]Add sgl-router to npu images (#10229) 2025-09-10 23:51:16 -07:00
Lianmin Zheng
bcf1955f7e Revert "chore: upgrade v0.3.9 sgl-kernel" (#10245) 2025-09-09 19:05:20 -07:00
Yineng Zhang
d3ee70985f chore: upgrade v0.3.9 sgl-kernel (#10220) 2025-09-09 03:16:25 -07:00
ishandhanani
148022fc36 gb200: update dockerfile to latest kernel (#9522) 2025-09-08 17:32:36 -07:00
Liangsheng Yin
6e95f5e5bd Simplify Router arguments passing and build it in docker image (#9964) 2025-09-05 12:13:55 +08:00
Yineng Zhang
0e9387a95d fix: update gb200 dep (#10052) 2025-09-04 20:30:46 -07:00
Yineng Zhang
fa9c82d339 chore: bump v0.5.2rc2 (#10050) 2025-09-04 20:07:27 -07:00
kk
b648d86216 [Fix] gpt-oss mxfp4 model run failed on ROCm platform (#9994)
Co-authored-by: wunhuang <wunhuang@amd.com>
2025-09-03 22:34:17 -07:00
Yineng Zhang
18f91eb639 chore: bump v0.5.2rc1 (#9920) 2025-09-02 04:43:34 -07:00
JieXin Liang
1db649ac02 [feat] apply deep_gemm compile_mode to skip launch (#9879) 2025-09-02 03:20:30 -07:00
Yineng Zhang
16e56ea693 chore: bump v0.5.2rc0 (#9862) 2025-09-01 03:07:36 -07:00
Yineng Zhang
9970e3bf32 chore: upgrade sgl-kernel 0.3.7.post1 with deepgemm fix (#9822) 2025-08-30 04:02:25 -07:00
Yineng Zhang
9c99949ef3 chore: update Dockerfile (#9820) 2025-08-30 03:08:14 -07:00
sogalin
4b7034ddb0 ROCm 7.0 update (#9757) 2025-08-28 22:24:34 -07:00
Yineng Zhang
bc80dc4ce0 chore: bump v0.5.1.post3 (#9716) 2025-08-27 15:42:42 -07:00
fzyzcjy
44ffe2cb72 Install py-spy by default for containers for easier debugging (#9649) 2025-08-26 10:40:52 -07:00
Yineng Zhang
bf863e3bbf fix: use sgl-kernel 0.3.5 (#9565) 2025-08-24 15:46:47 -07:00
gongwei-130
fb107cfd75 feat: allow use local branch to build image (#9546) 2025-08-23 16:38:30 -07:00
kk
988accbc1e Update docker file for supporting PD-Disaggregation on MI300x (#9494)
Co-authored-by: wunhuang <wunhuang@amd.com>
Co-authored-by: Colin Wang <kangwang@amd.com>
2025-08-21 23:48:40 -07:00
Yineng Zhang
b6b2287e4b chore: bump sgl-kernel v0.3.6.post2 (#9475) 2025-08-21 23:02:08 -07:00
Chang Su
7638f5e44e [router] Implement gRPC SGLangSchedulerClient (#9364) 2025-08-19 16:44:11 -07:00
Yineng Zhang
a1c7f742f9 chore: bump sgl-kernel v0.3.6.post1 (#9286) 2025-08-17 16:26:17 -07:00
Even Zhou
fda762a27d [Bugfix] Change vLLM install order & Add A2 support (#9232) 2025-08-16 22:36:14 -07:00
Yineng Zhang
87dab54824 Revert "chore: bump sgl-kernel v0.3.6 (#9220)" (#9247) 2025-08-15 17:24:36 -07:00
Yineng Zhang
5121af4627 Revert "chore(docker): update sgl_kernel version to 0.3.6 in Dockerfi… (#9246) 2025-08-15 17:19:38 -07:00
ishandhanani
e52c3866eb chore(docker): update sgl_kernel version to 0.3.6 in Dockerfile.gb200 (#9243) 2025-08-15 12:06:52 -07:00