sglang

Author	SHA1	Message	Date
XinyuanTong	e88dd482ed	[CI]Add performance CI for VLM (#6038) Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>	2025-05-07 19:20:03 -07:00
Cheng Wan	9bddf1c82d	Deferring 8 GPU test (#6102)	2025-05-07 18:49:58 -07:00
Lianmin Zheng	38053c3372	Fix the timeout for 8 gpu tests (#6084)	2025-05-07 03:13:12 -07:00
Johnny	cb69194562	feat: add release workflow for SGLang kernels on aarch64 (#6010) Co-authored-by: Qiaolin-Yu <liin1211@outlook.com> Co-authored-by: Yineng Zhang <me@zhyncs.com>	2025-05-06 19:42:07 -07:00
Jinyan Chen	8a828666a3	Add DeepEP to CI PR Test (#5655) Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>	2025-05-06 17:36:03 -07:00
Sai Enduri	73bc1d00fc	Add 1 gpu perf and 2 gpu accuracy tests for AMD MI300x CI. (#5960)	2025-05-01 20:56:59 -07:00
Yineng Zhang	9a6ad8916d	chore: upgrade sgl-kernel 0.1.1 (#5933)	2025-04-30 16:13:30 -07:00
Sai Enduri	2afba1b1c1	Add TP2 MOE benchmarks for AMD. (#5909)	2025-04-30 11:38:20 -07:00
Baizhou Zhang	799789afed	Bump Flashinfer to 0.2.5 (#5870) Co-authored-by: Yuhao Chen <yxckeis8@gmail.com>	2025-04-29 19:50:57 -07:00
saienduri	e3a5304475	Add AMD MI300x Nightly Testing. (#5861)	2025-04-29 17:34:32 -07:00
Yineng Zhang	f4c191a712	chore: update Dockerfile (#5894)	2025-04-29 12:55:13 -07:00
HAI	d364b9b0f2	ROCm: update AITER (#5816)	2025-04-28 11:01:20 -07:00
Lianmin Zheng	849c83a0c0	[CI] test chunked prefill more (#5798)	2025-04-28 10:57:17 -07:00
Lianmin Zheng	daed453e84	[CI] Improve github summary & enable fa3 for more models (#5796)	2025-04-27 15:29:46 -07:00
Lianmin Zheng	ded04b2e0a	Update nightly-test.yml (#5797)	2025-04-27 15:27:24 -07:00
Baizhou Zhang	f9fb33efc3	Add 8-GPU Test for Deepseek-V3 (#5691) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2025-04-27 12:46:12 -07:00
Lianmin Zheng	35ca04d2fa	[CI] fix port conflicts (#5789)	2025-04-27 05:17:44 -07:00
Stefan He	408ba02218	Add Llama 4 to FA3 test (#5509)	2025-04-26 19:49:31 -07:00
saienduri	c5e1026f47	Update amd docker image to `sglang:v0.4.5.post3-rocm630`. (#5697)	2025-04-26 18:46:57 -07:00
Yineng Zhang	127ff8982e	fix torchvision::nms not exist (#5671)	2025-04-23 02:17:21 -07:00
Ke Bao	11b23ae97b	Remove extra copy in deepseek forward absorb (#5578) Co-authored-by: saienduri <saimanas.enduri@amd.com>	2025-04-21 19:33:21 -07:00
lukec	417b44eba8	[Feat] upgrade pytorch2.6 (#5417)	2025-04-20 16:06:34 -07:00
Yineng Zhang	0961feefca	feat: use flashinfer jit package (#5547)	2025-04-19 00:28:39 -07:00
Yineng Zhang	88defc4d89	fix: solve release issue (#5434)	2025-04-15 12:58:11 -07:00
Lianmin Zheng	838fa0f218	[minor] cleanup cmakelists.txt (#5420)	2025-04-15 07:07:07 -07:00
Yineng Zhang	11421a3f44	fix: update pr-test-sgl-kernel (#5399)	2025-04-14 21:14:59 -07:00
yhyang201	072df75354	Support for Qwen2.5-VL Model in bitsandbytes Format (#5003)	2025-04-14 02:03:40 -07:00
Yineng Zhang	b62e7e99b8	feat: adapt merge_state (#5337)	2025-04-12 21:14:04 -07:00
Yineng Zhang	75015bb688	ci: update release node (#5333)	2025-04-12 14:22:45 -07:00
Yineng Zhang	812e82f35e	fix: solve cu118 issue for cutlass mla (#5331)	2025-04-12 12:51:09 -07:00
Yineng Zhang	6f8593799b	feat: add blackwell workflow (#5303)	2025-04-11 13:42:00 -07:00
Yineng Zhang	b75275b6f2	feat: add cu128 identifier for sgl-kernel (#5287)	2025-04-11 01:58:46 -07:00
saienduri	7f875f1293	update grok test (#5171)	2025-04-09 11:09:47 -07:00
saienduri	3033c11a21	Add dummy grok test to amd CI. (#5115)	2025-04-08 07:44:59 +00:00
Yineng Zhang	3289c1207d	Update the retry count (#5051)	2025-04-03 17:07:38 -07:00
renxin	cccfc10e9c	Feature/revise docs ci (#5009)	2025-04-02 20:08:56 -07:00
Yuhong Guo	87fafa0105	Revert PR 4764 & 4813 related to R1 RoPE (#4959)	2025-03-31 20:56:58 -07:00
Lianmin Zheng	f842853a40	Fix the timeout for unit-test-2-gpu in pr-test.yml (#4927)	2025-03-30 12:15:40 -07:00
Adarsh Shirawalmath	9fccda3111	[Feature] use pytest for sgl-kernel (#4896)	2025-03-30 10:36:52 -07:00
Lianmin Zheng	4ede6770cd	Fix retract for page size > 1 (#4914)	2025-03-30 02:57:15 -07:00
Yineng Zhang	72549263c6	update sgl-kernel test ci (#4866)	2025-03-28 11:42:41 -07:00
Lianmin Zheng	74e0ac1dbd	Clean up `import vllm` in quantization/__init__.py (#4834)	2025-03-28 10:34:10 -07:00
warjiang	18317ddc13	ci: add condition for daily docker build (#4487)	2025-03-27 21:44:37 -07:00
fzyzcjy	0d3e3072ee	Fix CI of test_patch_torch (#4844)	2025-03-27 21:22:45 -07:00
Yineng Zhang	5fa3058f01	fix the release doc dependency issue (#4828)	2025-03-27 13:28:12 -07:00
strgrb	668ecc6c5b	Fix ut mla-test-1-gpu-amd (#4813) Co-authored-by: Zhang Kaihong <zhangkaihong.zkh@alibaba-inc.com>	2025-03-27 08:27:51 -07:00
Yineng Zhang	8bf6d7f406	support cmake for sgl-kernel (#4706) Co-authored-by: hebiao064 <hebiaobuaa@gmail.com> Co-authored-by: yinfan98 <1106310035@qq.com>	2025-03-27 01:42:28 -07:00
Xiaoyu Zhang	04e3ff6975	Support compressed tensors fp8w8a8 (#4743)	2025-03-26 13:21:25 -07:00
fzyzcjy	26f07294f1	Warn users when release_memory_occupation is called without memory saver enabled (#4566)	2025-03-26 00:18:14 -07:00
fzyzcjy	15ddd84322	Add retry for flaky tests in CI (#4755)	2025-03-25 16:53:12 -07:00

307 Commits