244 Commits

Author SHA1 Message Date
kk
eca59f96c3 Update ROCm docker image to add sgl-router support (#10406)
Co-authored-by: Colin Wang <kangwang@amd.com>
2025-09-13 10:52:18 -07:00
Even Zhou
16cd550c85 Support Qwen3-Next on Ascend NPU (#10379) 2025-09-12 16:31:37 -07:00
Yineng Zhang
cef11e9a55 fix: resolve gb200 image link (#10343) 2025-09-12 13:46:02 -07:00
Yineng Zhang
b0d25e72c4 chore: bump v0.5.2 (#10221) 2025-09-11 16:09:20 -07:00
Yineng Zhang
bfe01a5eef chore: upgrade v0.3.9.post2 sgl-kernel (#10297) 2025-09-11 04:10:29 -07:00
Zaili Wang
ef959d7b85 [CPU] fix OOM when mem-fraction is not set (#9090) 2025-09-10 23:52:22 -07:00
BourneSun0527
4aa1e69bc7 [chore]Add sgl-router to npu images (#10229) 2025-09-10 23:51:16 -07:00
Lianmin Zheng
bcf1955f7e Revert "chore: upgrade v0.3.9 sgl-kernel" (#10245) 2025-09-09 19:05:20 -07:00
Yineng Zhang
d3ee70985f chore: upgrade v0.3.9 sgl-kernel (#10220) 2025-09-09 03:16:25 -07:00
ishandhanani
148022fc36 gb200: update dockerfile to latest kernel (#9522) 2025-09-08 17:32:36 -07:00
Liangsheng Yin
6e95f5e5bd Simplify Router arguments passing and build it in docker image (#9964) 2025-09-05 12:13:55 +08:00
Yineng Zhang
0e9387a95d fix: update gb200 dep (#10052) 2025-09-04 20:30:46 -07:00
Yineng Zhang
fa9c82d339 chore: bump v0.5.2rc2 (#10050) 2025-09-04 20:07:27 -07:00
kk
b648d86216 [Fix] gpt-oss mxfp4 model run failed on ROCm platform (#9994)
Co-authored-by: wunhuang <wunhuang@amd.com>
2025-09-03 22:34:17 -07:00
Yineng Zhang
18f91eb639 chore: bump v0.5.2rc1 (#9920) 2025-09-02 04:43:34 -07:00
JieXin Liang
1db649ac02 [feat] apply deep_gemm compile_mode to skip launch (#9879) 2025-09-02 03:20:30 -07:00
Yineng Zhang
16e56ea693 chore: bump v0.5.2rc0 (#9862) 2025-09-01 03:07:36 -07:00
Yineng Zhang
9970e3bf32 chore: upgrade sgl-kernel 0.3.7.post1 with deepgemm fix (#9822) 2025-08-30 04:02:25 -07:00
Yineng Zhang
9c99949ef3 chore: update Dockerfile (#9820) 2025-08-30 03:08:14 -07:00
sogalin
4b7034ddb0 ROCm 7.0 update (#9757) 2025-08-28 22:24:34 -07:00
Yineng Zhang
bc80dc4ce0 chore: bump v0.5.1.post3 (#9716) 2025-08-27 15:42:42 -07:00
fzyzcjy
44ffe2cb72 Install py-spy by default for containers for easier debugging (#9649) 2025-08-26 10:40:52 -07:00
Yineng Zhang
bf863e3bbf fix: use sgl-kernel 0.3.5 (#9565) 2025-08-24 15:46:47 -07:00
gongwei-130
fb107cfd75 feat: allow use local branch to build image (#9546) 2025-08-23 16:38:30 -07:00
kk
988accbc1e Update docker file for supporting PD-Disaggregation on MI300x (#9494)
Co-authored-by: wunhuang <wunhuang@amd.com>
Co-authored-by: Colin Wang <kangwang@amd.com>
2025-08-21 23:48:40 -07:00
Yineng Zhang
b6b2287e4b chore: bump sgl-kernel v0.3.6.post2 (#9475) 2025-08-21 23:02:08 -07:00
Chang Su
7638f5e44e [router] Implement gRPC SGLangSchedulerClient (#9364) 2025-08-19 16:44:11 -07:00
Yineng Zhang
a1c7f742f9 chore: bump sgl-kernel v0.3.6.post1 (#9286) 2025-08-17 16:26:17 -07:00
Even Zhou
fda762a27d [Bugfix] Change vLLM install order & Add A2 support (#9232) 2025-08-16 22:36:14 -07:00
Yineng Zhang
87dab54824 Revert "chore: bump sgl-kernel v0.3.6 (#9220)" (#9247) 2025-08-15 17:24:36 -07:00
Yineng Zhang
5121af4627 Revert "chore(docker): update sgl_kernel version to 0.3.6 in Dockerfi… (#9246) 2025-08-15 17:19:38 -07:00
ishandhanani
e52c3866eb chore(docker): update sgl_kernel version to 0.3.6 in Dockerfile.gb200 (#9243) 2025-08-15 12:06:52 -07:00
Yineng Zhang
c186feed7f chore: bump sgl-kernel v0.3.6 (#9220) 2025-08-15 02:50:50 -07:00
fzyzcjy
f8644a5632 Tiny update tmux history limit on dev container (#9218) 2025-08-15 00:22:08 -07:00
fzyzcjy
392de007cb Minor fix docker container DeepEP on multi platforms (#9205) 2025-08-14 17:41:49 -07:00
Yineng Zhang
1fea998a45 chore: bump sgl-kernel v0.3.5 (#9185) 2025-08-14 03:20:48 -07:00
fzyzcjy
b3363cc1aa Fix docker container DeepEP error on Blackwell (#9171) 2025-08-13 21:06:48 -07:00
Yineng Zhang
71fb8c9527 feat: update fa3 (#9126) 2025-08-13 20:07:08 +08:00
kk
35e6bc92e3 Update docker file for MI35x base image update to support gpt-oss mxfp4 model (#9111)
Co-authored-by: wunhuang <wunhuang@amd.com>
2025-08-13 00:55:31 -07:00
Yineng Zhang
924827c3de chore: use cp310 (#9130) 2025-08-12 15:33:22 -07:00
Yineng Zhang
c81daf838d fix: update Dockerfile (#9129) 2025-08-12 15:01:29 -07:00
li chaoran
2ecbd8b8bf [feat] add ascend readme and docker release (#8700)
Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>
Signed-off-by: lichaoran <pkwarcraft@gmail.com>
Co-authored-by: Even Zhou <even.y.zhou@outlook.com>
Co-authored-by: ronnie_zheng <zl19940307@163.com>
2025-08-12 13:25:42 -07:00
Yineng Zhang
305b27c124 fix: update Dockerfile (#9125) 2025-08-12 13:23:10 -07:00
ishandhanani
1f9ec65374 fix(docker): update sgl_kernel version to 0.3.4 in Dockerfile.gb200 (#9118) 2025-08-12 13:12:33 -07:00
Yineng Zhang
3a9afe2a42 chore: bump sgl-kernel v0.3.4 (#9103) 2025-08-12 01:48:47 -07:00
ishandhanani
9f24dfefd1 chore(gb200): remove ToT flashinfer installation (#9079) 2025-08-11 11:02:15 -07:00
Yi Zhang
89f1d4f536 update deepep commit to support qwen3-coder (#9066) 2025-08-11 10:42:33 -07:00
Yineng Zhang
48b8b4c124 fix nvshmem cu126 (#9001) 2025-08-09 03:34:54 -07:00
ishandhanani
4e7f025219 chore(gb200): update to CUDA 12.9 and improve build process (#8772) 2025-08-08 13:42:47 -07:00
Yineng Zhang
41357e511b chore: update flashinfer (#8958) 2025-08-08 02:15:22 -07:00