38 Commits

Author SHA1 Message Date
kk
7a5e6ce1cb Fix GPU OOM (#6564)
Co-authored-by: michael <michael.zhang@amd.com>
2025-05-24 16:38:39 -07:00
Sai Enduri
24c035f2e3 Temporarily disable MI325x 8 gpu testing. (#6576) 2025-05-24 16:37:22 -07:00
HAI
5c0b38f369 aiter attention-backend (default enabled on AMD/ROCm) (#6381) 2025-05-20 22:52:41 -07:00
Sai Enduri
c47a51db7e Clean up AMD CI (#6365) 2025-05-18 01:17:28 -07:00
Lianmin Zheng
e07a6977e7 Minor improvements of TokenizerManager / health check (#6327) 2025-05-15 15:29:25 -07:00
Sai Enduri
73eb67c087 Enable unit tests for AMD CI. (#6283) 2025-05-14 12:55:36 -07:00
Sai Enduri
0f5cb8cae1 Enable MI325X AMD CI. (#6259) 2025-05-13 01:49:33 -07:00
Sai Enduri
7d3a3d4510 Update AMD CI docker to v0.4.6.post3-rocm630. (#6213) 2025-05-12 00:00:46 -07:00
Sai Enduri
73bc1d00fc Add 1 gpu perf and 2 gpu accuracy tests for AMD MI300x CI. (#5960) 2025-05-01 20:56:59 -07:00
Sai Enduri
2afba1b1c1 Add TP2 MOE benchmarks for AMD. (#5909) 2025-04-30 11:38:20 -07:00
HAI
d364b9b0f2 ROCm: update AITER (#5816) 2025-04-28 11:01:20 -07:00
saienduri
c5e1026f47 Update amd docker image to sglang:v0.4.5.post3-rocm630. (#5697) 2025-04-26 18:46:57 -07:00
Ke Bao
11b23ae97b Remove extra copy in deepseek forward absorb (#5578)
Co-authored-by: saienduri <saimanas.enduri@amd.com>
2025-04-21 19:33:21 -07:00
saienduri
7f875f1293 update grok test (#5171) 2025-04-09 11:09:47 -07:00
saienduri
3033c11a21 Add dummy grok test to amd CI. (#5115) 2025-04-08 07:44:59 +00:00
Yuhong Guo
87fafa0105 Revert PR 4764 & 4813 related to R1 RoPE (#4959) 2025-03-31 20:56:58 -07:00
strgrb
668ecc6c5b Fix ut mla-test-1-gpu-amd (#4813)
Co-authored-by: Zhang Kaihong <zhangkaihong.zkh@alibaba-inc.com>
2025-03-27 08:27:51 -07:00
Yineng Zhang
8bf6d7f406 support cmake for sgl-kernel (#4706)
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
Co-authored-by: yinfan98 <1106310035@qq.com>
2025-03-27 01:42:28 -07:00
fzyzcjy
26f07294f1 Warn users when release_memory_occupation is called without memory saver enabled (#4566) 2025-03-26 00:18:14 -07:00
Lianmin Zheng
82dec1f70b Remove redundant type conversion (#4513) 2025-03-17 05:57:35 -07:00
Lianmin Zheng
c30976fb41 Fix finish step for pr tests and notebook tests (#4467) 2025-03-16 00:52:06 -07:00
Yineng Zhang
ad1ae7f7cd use topk_softmax with sgl-kernel (#4439) 2025-03-14 15:59:06 -07:00
Yineng Zhang
977d7cd26a cleanup deps 1/n (#4400)
Co-authored-by: sleepcoo <sleepcoo@gmail.com>
2025-03-14 00:00:33 -07:00
HandH1998
2ac189edc8 Amd test fp8 (#4261) 2025-03-10 10:12:09 -07:00
Lianmin Zheng
e8a69e4d0c Clean up fp8 support (#4230) 2025-03-09 21:46:35 -07:00
Lianmin Zheng
48473684cc Split test_mla.py into two files (#4216) 2025-03-08 15:40:49 -08:00
saienduri
e1aaa79ac9 Update amd ci docker image to v0.4.3.post4-rocm630. (#4189) 2025-03-07 13:02:02 -08:00
Lianmin Zheng
d7934cde45 Fix CI and install docs (#3821) 2025-02-24 16:17:38 -08:00
Yineng Zhang
07ab4d4a2d fix #3654 2025-02-18 15:16:16 +08:00
saienduri
522e18eaeb Update amd docker image. (#3654) 2025-02-17 20:12:55 -08:00
saienduri
7474bed883 Update to latest amd image. (#3597) 2025-02-17 00:29:47 +08:00
Yineng Zhang
4fe92bfca5 fix mla test (#3469) 2025-02-10 21:12:00 +08:00
Yineng Zhang
2b1808cec4 update unit test in AMD CI (#3366) 2025-02-07 17:25:16 +08:00
saienduri
200d3b1608 Add sgl-kernel to MI300 CI paths tested. (#3335)
Co-authored-by: HAI <hixiao@gmail.com>
2025-02-06 00:45:38 -08:00
saienduri
2d9c319594 Docker switch (#3327)
Co-authored-by: HAI <hixiao@gmail.com>
2025-02-05 18:06:50 -08:00
saienduri
04d8cd2088 Initial Enablement of CI on MI300 (#3168) 2025-02-05 10:45:12 -08:00
Lianmin Zheng
b6cd903604 Update readme and workflow (#1716) 2024-10-19 13:01:44 -07:00
Lianmin Zheng
bc068e9618 [CI] Move AMD test to a separate file (#1500) 2024-09-24 02:06:28 -07:00