| Author | Commit | Message | Date |
|---|---|---|---|
| Even Zhou | 5b64f006ec | [Feature] Support DeepEP normal & Redundant Experts on NPU (#9881) | 2025-09-10 20:35:26 -07:00 |
| Hubert Lu | 91b3555d2d | Add tests to AMD CI for MI35x (#9662); Co-authored-by: Sai Enduri <saimanas.enduri@amd.com> | 2025-09-10 12:50:05 -07:00 |
| Lzhang-hub | 4efe2c57c9 | support vlm model spec bench (#10173) | 2025-09-10 13:37:04 +08:00 |
| Lianmin Zheng | bcf1955f7e | Revert "chore: upgrade v0.3.9 sgl-kernel" (#10245) | 2025-09-09 19:05:20 -07:00 |
| Yineng Zhang | d3ee70985f | chore: upgrade v0.3.9 sgl-kernel (#10220) | 2025-09-09 03:16:25 -07:00 |
| Liangsheng Yin | 6e95f5e5bd | Simplify Router arguments passing and build it in docker image (#9964) | 2025-09-05 12:13:55 +08:00 |
| Yineng Zhang | de9217334b | feat: add gpt oss b200 ci (#9988) | 2025-09-03 17:26:38 -07:00 |
| Lianmin Zheng | 646076b71e | Update guidelines for syncing code between repos (#9831) | 2025-08-30 16:10:35 -07:00 |
| Lianmin Zheng | 0d04008936 | [CI] Code sync tools (#9830) | 2025-08-30 16:02:29 -07:00 |
| Chayenne | 9b08d975a0 | [docs] Refactor, remove compiled results and add gpt-oss (#9613); Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com> | 2025-08-25 15:27:06 -07:00 |
| Chang Su | 7638f5e44e | [router] Implement gRPC SGLangSchedulerClient (#9364) | 2025-08-19 16:44:11 -07:00 |
| Lianmin Zheng | c480a3f6ea | Minor style fixes for sgl-kernel (#9289) | 2025-08-18 09:38:35 -07:00 |
| michael-amd | 0fc8bf2cd4 | [AMD] Update fallback images for AMD CI (#9159); Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> | 2025-08-13 20:15:10 -07:00 |
| li chaoran | 2ecbd8b8bf | [feat] add ascend readme and docker release (#8700); Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>; Signed-off-by: lichaoran <pkwarcraft@gmail.com>; Co-authored-by: Even Zhou <even.y.zhou@outlook.com>; Co-authored-by: ronnie_zheng <zl19940307@163.com> | 2025-08-12 13:25:42 -07:00 |
| Yi Zhang | 89f1d4f536 | update deepep commit to support qwen3-coder (#9066) | 2025-08-11 10:42:33 -07:00 |
| Cheng Wan | f003cd3548 | [CI] Fix CI tests (#9050); Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> | 2025-08-10 23:52:05 -07:00 |
| Lianmin Zheng | 2c7f01bc89 | Reorganize CI and test files (#9027) | 2025-08-10 12:30:06 -07:00 |
| Lianmin Zheng | 706bd69cc5 | Clean up server_args.py to have a dedicated function for model specific adjustments (#8983) | 2025-08-08 19:56:50 -07:00 |
| michael-amd | 23f2afb2ce | [AMD] Update SGLang image fallback logic for AMD CI (#8980) | 2025-08-08 18:51:29 -07:00 |
| fzyzcjy | 482c3db29f | Fix sgl-kernel arch and missing package in CI (#8869) | 2025-08-07 02:08:15 -07:00 |
| michael-amd | 4f2e1490c3 | [AMD] Pull latest SGLang version for AMD CI (#8787) | 2025-08-06 20:20:26 -07:00 |
| Yineng Zhang | cbbd685a46 | chore: use torch 2.8 stable (#8880) | 2025-08-06 15:51:40 -07:00 |
| Cheng Wan | 78aad91037 | [CI] fix pip upgrade (#8881) | 2025-08-06 15:02:32 -07:00 |
| fzyzcjy | b114a8105b | Support B200 in CI (#8861) | 2025-08-06 21:42:44 +08:00 |
| Yineng Zhang | 3ae8e3ea8f | chore: upgrade torch 2.8.0 (#8836) | 2025-08-05 17:32:01 -07:00 |
| kk | 32d9e39a29 | Fix potential memory fault issue and ncclSystemError in CI test (#8681); Co-authored-by: wunhuang <wunhuang@amd.com> | 2025-08-05 12:19:37 -07:00 |
| Even Zhou | fee0ab0fba | [CI] Ascend NPU CI enhancement (#8294); Co-authored-by: ronnie_zheng <zl19940307@163.com> | 2025-08-03 22:16:38 -07:00 |
| li chaoran | fe5086fd8b | chore: speedup NPU CI by cache (#8270); Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>; Co-authored-by: ronnie_zheng <zl19940307@163.com> | 2025-07-31 17:29:50 -07:00 |
| Keyang Ru | 7c9697178e | [CI]Add genai-bench Performance Validation for PD Router (#8477); Co-authored-by: key4ng <rukeyang@gamil.com> | 2025-07-28 16:58:23 -07:00 |
| Shangming Cai | 70e37b97bf | chore: upgrade mooncake 0.3.5 (#8341); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-07-25 01:17:26 -07:00 |
| michael-amd | 0e5fa67773 | [AMD] Pull latest image for AMD CI (#8070) | 2025-07-23 17:56:14 -07:00 |
| ronnie_zheng | 93d124ef5a | [feature] enable NPU CI (#7935); Co-authored-by: Even Zhou <14368888+iforgetmyname@users.noreply.github.com> | 2025-07-20 13:12:42 -07:00 |
| Simo Lin | c8f31042a8 | [router] Refactor router and policy traits with dependency injection (#7987); Co-authored-by: Jin Pan <jpan236@wisc.edu>; Co-authored-by: Keru Yang <rukeyang@gmail.com>; Co-authored-by: Yingyi Huang <yingyihuang2000@outlook.com>; Co-authored-by: Philip Zhu <phlipzhux@gmail.com> | 2025-07-18 14:24:24 -07:00 |
| Cheng Wan | 02404a1e35 | [ci] recover 8-gpu deepep test (#8105) | 2025-07-17 00:46:40 -07:00 |
| Sai Enduri | f06bd210c0 | Update amd docker image. (#8045); Co-authored-by: Hubert Lu <55214931+hubertlu-tw@users.noreply.github.com> | 2025-07-15 15:09:56 -07:00 |
| Hank Han | 2117f82def | [ci] CI supports use cached models (#7874) | 2025-07-14 11:42:21 +00:00 |
| Cheng Wan | d487555f84 | [CI] Add deepep tests to CI (#7872) | 2025-07-09 01:49:47 -07:00 |
| Kay Yan | 975a5ec69c | [fix] update bench_speculative.py for compatibility (#7764); Signed-off-by: Kay Yan <kay.yan@daocloud.io> | 2025-07-04 16:32:54 +08:00 |
| Lianmin Zheng | 22352d47a9 | Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632); Co-authored-by: Kan Wu <wukanustc@gmail.com> | 2025-06-29 23:16:19 -07:00 |
| Hubert Lu | 3b3f1e3aeb | [AMD] Add unit-test-sgl-kernel-amd to AMD CI (#7539) | 2025-06-29 15:50:09 -07:00 |
| Keyang Ru | 29bd4c8135 | [CI] Add CI Testing for Prefill-Decode Disaggregation with Router (#7540) | 2025-06-27 00:18:56 -07:00 |
| Mick | 4d67025a1d | chore: improve ci bug reporting (#7542) | 2025-06-26 01:32:44 -07:00 |
| Shangming Cai | a07f8ae4b7 | [CI] Upgrade mooncake to v0.3.4.post2 to fix potential slice failed bug (#7522); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-06-25 01:49:22 -07:00 |
| Shangming Cai | d6dddc19ff | [CI] Upgrade mooncake to 0.3.4.post1 to fix 8 gpu tests (#7472); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-06-24 02:10:50 +08:00 |
| kk | bd4f581896 | Fix torch compile run (#7391); Co-authored-by: wunhuang <wunhuang@amd.com>; Co-authored-by: Sai Enduri <saimanas.enduri@amd.com> | 2025-06-22 15:33:09 -07:00 |
| Shangming Cai | 187b85b7f3 | [PD] Optimize custom mem pool usage and bump mooncake version (#7393); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-06-20 09:50:39 -07:00 |
| Lianmin Zheng | 0f218731e3 | Do not run frontend_reasoning.ipynb to reduce the CI load (#7073) | 2025-06-10 17:15:31 -07:00 |
| Yineng Zhang | 56ccd3c22c | chore: upgrade flashinfer v0.2.6.post1 jit (#6958); Co-authored-by: alcanderian <alcanderian@gmail.com>; Co-authored-by: Qiaolin Yu <qy254@cornell.edu>; Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>; Co-authored-by: Mick <mickjagger19@icloud.com>; Co-authored-by: ispobock <ispobaoke@gmail.com> | 2025-06-09 09:22:39 -07:00 |
| Hubert Lu | 4740288303 | [AMD] Add more tests to per-commit-amd (#6926) | 2025-06-08 01:08:37 -07:00 |
| HAI | b819381fec | AITER backend extension and workload optimizations (#6838); Co-authored-by: wunhuang <wunhuang@amd.com>; Co-authored-by: Hubert Lu <Hubert.Lu@amd.com> | 2025-06-05 23:00:18 -07:00 |