Commit Graph

318 Commits

Author SHA1 Message Date
Sai Enduri
983c663de6 Update AMD nightly deps. (#6241) 2025-05-12 13:39:20 -07:00
Ying Sheng
bad7c26fdc [PP] Fix init_memory_pool desync & add PP for mixtral (#6223) 2025-05-12 12:38:09 -07:00
Sai Enduri
7d3a3d4510 Update AMD CI docker to v0.4.6.post3-rocm630. (#6213) 2025-05-12 00:00:46 -07:00
Lianmin Zheng
6ea05950b1 Fix release-docs.yml to not use python 3.9 (#6204) 2025-05-11 16:04:55 -07:00
fzyzcjy
e9a47f4cb5 Add dev-deepep docker image (#6198) 2025-05-11 13:17:55 -07:00
Lianmin Zheng
03227c5fa6 [CI] Reorganize the 8 gpu tests (#6192) 2025-05-11 10:55:06 -07:00
Lianmin Zheng
17c36c5511 [CI] Disabled deepep tests temporarily because it takes too much time. (#6186) 2025-05-10 23:40:50 -07:00
shangmingc
31d1f6e7f4 [PD] Add simple unit test for disaggregation feature (#5654)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
2025-05-11 13:35:27 +08:00
Lianmin Zheng
de167cf5fa Fix request abortion (#6184) 2025-05-10 21:54:46 -07:00
Lianmin Zheng
4319978c73 Fix data parallel perf regression (#6183) 2025-05-10 19:18:35 -07:00
Sai Enduri
dff0ab92eb Update amd nightly concurrency. (#6141) 2025-05-09 00:02:14 -07:00
XinyuanTong
e88dd482ed [CI]Add performance CI for VLM (#6038)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-05-07 19:20:03 -07:00
Cheng Wan
9bddf1c82d Deferring 8 GPU test (#6102) 2025-05-07 18:49:58 -07:00
Lianmin Zheng
38053c3372 Fix the timeout for 8 gpu tests (#6084) 2025-05-07 03:13:12 -07:00
Johnny
cb69194562 feat: add release workflow for SGLang kernels on aarch64 (#6010)
Co-authored-by: Qiaolin-Yu <liin1211@outlook.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-05-06 19:42:07 -07:00
Jinyan Chen
8a828666a3 Add DeepEP to CI PR Test (#5655)
Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>
2025-05-06 17:36:03 -07:00
Sai Enduri
73bc1d00fc Add 1 gpu perf and 2 gpu accuracy tests for AMD MI300x CI. (#5960) 2025-05-01 20:56:59 -07:00
Yineng Zhang
9a6ad8916d chore: upgrade sgl-kernel 0.1.1 (#5933) 2025-04-30 16:13:30 -07:00
Sai Enduri
2afba1b1c1 Add TP2 MOE benchmarks for AMD. (#5909) 2025-04-30 11:38:20 -07:00
Baizhou Zhang
799789afed Bump Flashinfer to 0.2.5 (#5870)
Co-authored-by: Yuhao Chen <yxckeis8@gmail.com>
2025-04-29 19:50:57 -07:00
saienduri
e3a5304475 Add AMD MI300x Nightly Testing. (#5861) 2025-04-29 17:34:32 -07:00
Yineng Zhang
f4c191a712 chore: update Dockerfile (#5894) 2025-04-29 12:55:13 -07:00
HAI
d364b9b0f2 ROCm: update AITER (#5816) 2025-04-28 11:01:20 -07:00
Lianmin Zheng
849c83a0c0 [CI] test chunked prefill more (#5798) 2025-04-28 10:57:17 -07:00
Lianmin Zheng
daed453e84 [CI] Improve github summary & enable fa3 for more models (#5796) 2025-04-27 15:29:46 -07:00
Lianmin Zheng
ded04b2e0a Update nightly-test.yml (#5797) 2025-04-27 15:27:24 -07:00
Baizhou Zhang
f9fb33efc3 Add 8-GPU Test for Deepseek-V3 (#5691)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
2025-04-27 12:46:12 -07:00
Lianmin Zheng
35ca04d2fa [CI] fix port conflicts (#5789) 2025-04-27 05:17:44 -07:00
Stefan He
408ba02218 Add Llama 4 to FA3 test (#5509) 2025-04-26 19:49:31 -07:00
saienduri
c5e1026f47 Update amd docker image to sglang:v0.4.5.post3-rocm630. (#5697) 2025-04-26 18:46:57 -07:00
Yineng Zhang
127ff8982e fix torchvision::nms not exist (#5671) 2025-04-23 02:17:21 -07:00
Ke Bao
11b23ae97b Remove extra copy in deepseek forward absorb (#5578)
Co-authored-by: saienduri <saimanas.enduri@amd.com>
2025-04-21 19:33:21 -07:00
lukec
417b44eba8 [Feat] upgrade pytorch2.6 (#5417) 2025-04-20 16:06:34 -07:00
Yineng Zhang
0961feefca feat: use flashinfer jit package (#5547) 2025-04-19 00:28:39 -07:00
Yineng Zhang
88defc4d89 fix: solve release issue (#5434) 2025-04-15 12:58:11 -07:00
Lianmin Zheng
838fa0f218 [minor] cleanup cmakelists.txt (#5420) 2025-04-15 07:07:07 -07:00
Yineng Zhang
11421a3f44 fix: update pr-test-sgl-kernel (#5399) 2025-04-14 21:14:59 -07:00
yhyang201
072df75354 Support for Qwen2.5-VL Model in bitsandbytes Format (#5003) 2025-04-14 02:03:40 -07:00
Yineng Zhang
b62e7e99b8 feat: adapt merge_state (#5337) 2025-04-12 21:14:04 -07:00
Yineng Zhang
75015bb688 ci: update release node (#5333) 2025-04-12 14:22:45 -07:00
Yineng Zhang
812e82f35e fix: solve cu118 issue for cutlass mla (#5331) 2025-04-12 12:51:09 -07:00
Yineng Zhang
6f8593799b feat: add blackwell workflow (#5303) 2025-04-11 13:42:00 -07:00
Yineng Zhang
b75275b6f2 feat: add cu128 identifier for sgl-kernel (#5287) 2025-04-11 01:58:46 -07:00
saienduri
7f875f1293 update grok test (#5171) 2025-04-09 11:09:47 -07:00
saienduri
3033c11a21 Add dummy grok test to amd CI. (#5115) 2025-04-08 07:44:59 +00:00
Yineng Zhang
3289c1207d Update the retry count (#5051) 2025-04-03 17:07:38 -07:00
renxin
cccfc10e9c Feature/revise docs ci (#5009) 2025-04-02 20:08:56 -07:00
Yuhong Guo
87fafa0105 Revert PR 4764 & 4813 related to R1 RoPE (#4959) 2025-03-31 20:56:58 -07:00
Lianmin Zheng
f842853a40 Fix the timeout for unit-test-2-gpu in pr-test.yml (#4927) 2025-03-30 12:15:40 -07:00
Adarsh Shirawalmath
9fccda3111 [Feature] use pytest for sgl-kernel (#4896) 2025-03-30 10:36:52 -07:00