XinyuanTong | e88dd482ed | [CI]Add performance CI for VLM (#6038); Signed-off-by: Xinyuan Tong <justinning0323@outlook.com> | 2025-05-07 19:20:03 -07:00
Cheng Wan | 9bddf1c82d | Deferring 8 GPU test (#6102) | 2025-05-07 18:49:58 -07:00
Lianmin Zheng | 38053c3372 | Fix the timeout for 8 gpu tests (#6084) | 2025-05-07 03:13:12 -07:00
Johnny | cb69194562 | feat: add release workflow for SGLang kernels on aarch64 (#6010); Co-authored-by: Qiaolin-Yu <liin1211@outlook.com>; Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2025-05-06 19:42:07 -07:00
Jinyan Chen | 8a828666a3 | Add DeepEP to CI PR Test (#5655); Co-authored-by: Jinyan Chen <jinyanc@nvidia.com> | 2025-05-06 17:36:03 -07:00
Sai Enduri | 73bc1d00fc | Add 1 gpu perf and 2 gpu accuracy tests for AMD MI300x CI. (#5960) | 2025-05-01 20:56:59 -07:00
Yineng Zhang | 9a6ad8916d | chore: upgrade sgl-kernel 0.1.1 (#5933) | 2025-04-30 16:13:30 -07:00
Sai Enduri | 2afba1b1c1 | Add TP2 MOE benchmarks for AMD. (#5909) | 2025-04-30 11:38:20 -07:00
Baizhou Zhang | 799789afed | Bump Flashinfer to 0.2.5 (#5870); Co-authored-by: Yuhao Chen <yxckeis8@gmail.com> | 2025-04-29 19:50:57 -07:00
saienduri | e3a5304475 | Add AMD MI300x Nightly Testing. (#5861) | 2025-04-29 17:34:32 -07:00
Yineng Zhang | f4c191a712 | chore: update Dockerfile (#5894) | 2025-04-29 12:55:13 -07:00
HAI | d364b9b0f2 | ROCm: update AITER (#5816) | 2025-04-28 11:01:20 -07:00
Lianmin Zheng | 849c83a0c0 | [CI] test chunked prefill more (#5798) | 2025-04-28 10:57:17 -07:00
Lianmin Zheng | daed453e84 | [CI] Improve github summary & enable fa3 for more models (#5796) | 2025-04-27 15:29:46 -07:00
Lianmin Zheng | ded04b2e0a | Update nightly-test.yml (#5797) | 2025-04-27 15:27:24 -07:00
Baizhou Zhang | f9fb33efc3 | Add 8-GPU Test for Deepseek-V3 (#5691); Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2025-04-27 12:46:12 -07:00
Lianmin Zheng | 35ca04d2fa | [CI] fix port conflicts (#5789) | 2025-04-27 05:17:44 -07:00
Stefan He | 408ba02218 | Add Llama 4 to FA3 test (#5509) | 2025-04-26 19:49:31 -07:00
saienduri | c5e1026f47 | Update amd docker image to sglang:v0.4.5.post3-rocm630. (#5697) | 2025-04-26 18:46:57 -07:00
Yineng Zhang | 127ff8982e | fix torchvision::nms not exist (#5671) | 2025-04-23 02:17:21 -07:00
Ke Bao | 11b23ae97b | Remove extra copy in deepseek forward absorb (#5578); Co-authored-by: saienduri <saimanas.enduri@amd.com> | 2025-04-21 19:33:21 -07:00
lukec | 417b44eba8 | [Feat] upgrade pytorch2.6 (#5417) | 2025-04-20 16:06:34 -07:00
Yineng Zhang | 0961feefca | feat: use flashinfer jit package (#5547) | 2025-04-19 00:28:39 -07:00
Yineng Zhang | 88defc4d89 | fix: solve release issue (#5434) | 2025-04-15 12:58:11 -07:00
Lianmin Zheng | 838fa0f218 | [minor] cleanup cmakelists.txt (#5420) | 2025-04-15 07:07:07 -07:00
Yineng Zhang | 11421a3f44 | fix: update pr-test-sgl-kernel (#5399) | 2025-04-14 21:14:59 -07:00
yhyang201 | 072df75354 | Support for Qwen2.5-VL Model in bitsandbytes Format (#5003) | 2025-04-14 02:03:40 -07:00
Yineng Zhang | b62e7e99b8 | feat: adapt merge_state (#5337) | 2025-04-12 21:14:04 -07:00
Yineng Zhang | 75015bb688 | ci: update release node (#5333) | 2025-04-12 14:22:45 -07:00
Yineng Zhang | 812e82f35e | fix: solve cu118 issue for cutlass mla (#5331) | 2025-04-12 12:51:09 -07:00
Yineng Zhang | 6f8593799b | feat: add blackwell workflow (#5303) | 2025-04-11 13:42:00 -07:00
Yineng Zhang | b75275b6f2 | feat: add cu128 identifier for sgl-kernel (#5287) | 2025-04-11 01:58:46 -07:00
saienduri | 7f875f1293 | update grok test (#5171) | 2025-04-09 11:09:47 -07:00
saienduri | 3033c11a21 | Add dummy grok test to amd CI. (#5115) | 2025-04-08 07:44:59 +00:00
Yineng Zhang | 3289c1207d | Update the retry count (#5051) | 2025-04-03 17:07:38 -07:00
renxin | cccfc10e9c | Feature/revise docs ci (#5009) | 2025-04-02 20:08:56 -07:00
Yuhong Guo | 87fafa0105 | Revert PR 4764 & 4813 related to R1 RoPE (#4959) | 2025-03-31 20:56:58 -07:00
Lianmin Zheng | f842853a40 | Fix the timeout for unit-test-2-gpu in pr-test.yml (#4927) | 2025-03-30 12:15:40 -07:00
Adarsh Shirawalmath | 9fccda3111 | [Feature] use pytest for sgl-kernel (#4896) | 2025-03-30 10:36:52 -07:00
Lianmin Zheng | 4ede6770cd | Fix retract for page size > 1 (#4914) | 2025-03-30 02:57:15 -07:00
Yineng Zhang | 72549263c6 | update sgl-kernel test ci (#4866) | 2025-03-28 11:42:41 -07:00
Lianmin Zheng | 74e0ac1dbd | Clean up import vllm in quantization/__init__.py (#4834) | 2025-03-28 10:34:10 -07:00
warjiang | 18317ddc13 | ci: add condition for daily docker build (#4487) | 2025-03-27 21:44:37 -07:00
fzyzcjy | 0d3e3072ee | Fix CI of test_patch_torch (#4844) | 2025-03-27 21:22:45 -07:00
Yineng Zhang | 5fa3058f01 | fix the release doc dependency issue (#4828) | 2025-03-27 13:28:12 -07:00
strgrb | 668ecc6c5b | Fix ut mla-test-1-gpu-amd (#4813); Co-authored-by: Zhang Kaihong <zhangkaihong.zkh@alibaba-inc.com> | 2025-03-27 08:27:51 -07:00
Yineng Zhang | 8bf6d7f406 | support cmake for sgl-kernel (#4706); Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>; Co-authored-by: yinfan98 <1106310035@qq.com> | 2025-03-27 01:42:28 -07:00
Xiaoyu Zhang | 04e3ff6975 | Support compressed tensors fp8w8a8 (#4743) | 2025-03-26 13:21:25 -07:00
fzyzcjy | 26f07294f1 | Warn users when release_memory_occupation is called without memory saver enabled (#4566) | 2025-03-26 00:18:14 -07:00
fzyzcjy | 15ddd84322 | Add retry for flaky tests in CI (#4755) | 2025-03-25 16:53:12 -07:00