sglang

Author	SHA1	Message	Date
Lianmin Zheng	a17e70f5cc	Use more general heuristics to set the default value of --mem-fraction-static (#10975) Co-authored-by: sglang-bot <sglangbot@gmail.com>	2025-09-29 10:11:03 -07:00
Xiaoyu Zhang	11965b0daf	Fix sgl-kernel benchmark dead code (#11022)	2025-09-29 15:06:40 +08:00
Kangyan-Zhou	0c9174108a	Unify SGL Kernel Releases (#10701)	2025-09-28 19:48:28 -07:00
Xiaoyu Zhang	05a3526654	Restruct gpu_memory_settings in a unify function and relax max_cuda_graph_bs (#10372) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: sglang-bot <sglangbot@gmail.com>	2025-09-26 15:10:49 -07:00
Mick	fff7fbabe6	ci: fix rate-limit of huggingface with hf auth login (#10947)	2025-09-26 11:02:44 -07:00
Lianmin Zheng	b1f0fc1c0b	Add CI timeout guidelines (#10829)	2025-09-23 22:08:02 -07:00
Shangming Cai	23632d350c	Fix latest main ci (#10799) Signed-off-by: Shangming Cai <csmthu@gmail.com>	2025-09-23 12:46:13 -07:00
Yineng Zhang	ba94b82986	fix: update run_suite (#10685)	2025-09-20 01:22:06 -07:00
fzyzcjy	ae4be601c2	Fix CI when sgl-kernel is changed but srt is not changed (#10515)	2025-09-16 02:49:54 -07:00
Lianmin Zheng	50dc0c1e9c	Run tests based on labels (#10456)	2025-09-15 00:29:20 -07:00
Yineng Zhang	7ce6c10eb6	fix: enable cu124 and cu128 build on main push (#10431)	2025-09-14 16:19:35 -07:00
fzyzcjy	e3cf812f7d	Fix sgl-kernel + srt CI (#10419)	2025-09-14 01:44:47 -07:00
fzyzcjy	a0f844ed5a	Let sgl-kernel changes be tested on srt (#10313)	2025-09-14 01:09:17 -07:00
Yineng Zhang	9d775b1a2d	feat: add deepseek v3 fp4 ut (#10391)	2025-09-12 15:43:29 -07:00
hzh0425	1a3d6f31da	Modify ci workflow for auto-partitioning in 2-GPU backend tests (#10029)	2025-09-06 10:28:42 +08:00
Lianmin Zheng	05e4787243	[CI] Fix the trigger condition for PR test workflows (#9761)	2025-08-30 15:47:10 -07:00
Lianmin Zheng	2c7f01bc89	Reorganize CI and test files (#9027)	2025-08-10 12:30:06 -07:00
Lianmin Zheng	ef48d5547e	Fix CI (#9013)	2025-08-09 16:00:10 -07:00
Lianmin Zheng	706bd69cc5	Clean up server_args.py to have a dedicated function for model specific adjustments (#8983)	2025-08-08 19:56:50 -07:00
fzyzcjy	b114a8105b	Support B200 in CI (#8861)	2025-08-06 21:42:44 +08:00
Lianmin Zheng	263c9236a0	Always trigger pr-test (#8527)	2025-07-29 04:05:19 -07:00
Lianmin Zheng	69712e6f55	Rename the last step in pr-test.yml as pr-test-finish (#8486)	2025-07-28 19:06:13 -07:00
Lifu Huang	5c705b1dce	Add perf tests for LoRA (#8314)	2025-07-26 14:55:22 -07:00
Lianmin Zheng	9c7a46180c	[Doc] Steps to add a new attention backend (#8155)	2025-07-18 16:38:26 -07:00
Cheng Wan	02404a1e35	[ci] recover 8-gpu deepep test (#8105)	2025-07-17 00:46:40 -07:00
Cheng Wan	475a249bb8	temporarily disable deepep-8-gpu and activate two small tests (#7961)	2025-07-11 14:22:05 -07:00
Cheng Wan	d487555f84	[CI] Add deepep tests to CI (#7872)	2025-07-09 01:49:47 -07:00
Lianmin Zheng	55e03b10c4	Fix a bug in BatchTokenIDOut & Misc style and dependency updates (#7457)	2025-06-23 06:20:39 -07:00
Junrong Lin	2103b80607	[CI] update verlengine ci to 4-gpu test (#6007)	2025-05-27 14:32:23 -07:00
Shenggui Li	3f23d8cdf1	added support for tied weights in qwen pipeline parallelism (#6546)	2025-05-25 00:00:56 -07:00
fzyzcjy	f11481b921	Add 4-GPU runner tests and split existing tests (#6383)	2025-05-18 11:56:51 -07:00
Ying Sheng	bad7c26fdc	[PP] Fix init_memory_pool desync & add PP for mixtral (#6223)	2025-05-12 12:38:09 -07:00
Lianmin Zheng	03227c5fa6	[CI] Reorganize the 8 gpu tests (#6192)	2025-05-11 10:55:06 -07:00
Lianmin Zheng	17c36c5511	[CI] Disabled deepep tests temporarily because it takes too much time. (#6186)	2025-05-10 23:40:50 -07:00
shangmingc	31d1f6e7f4	[PD] Add simple unit test for disaggregation feature (#5654) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-05-11 13:35:27 +08:00
Lianmin Zheng	de167cf5fa	Fix request abortion (#6184)	2025-05-10 21:54:46 -07:00
XinyuanTong	e88dd482ed	[CI]Add performance CI for VLM (#6038) Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>	2025-05-07 19:20:03 -07:00
Cheng Wan	9bddf1c82d	Deferring 8 GPU test (#6102)	2025-05-07 18:49:58 -07:00
Lianmin Zheng	38053c3372	Fix the timeout for 8 gpu tests (#6084)	2025-05-07 03:13:12 -07:00
Jinyan Chen	8a828666a3	Add DeepEP to CI PR Test (#5655) Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>	2025-05-06 17:36:03 -07:00
Baizhou Zhang	799789afed	Bump Flashinfer to 0.2.5 (#5870) Co-authored-by: Yuhao Chen <yxckeis8@gmail.com>	2025-04-29 19:50:57 -07:00
Lianmin Zheng	849c83a0c0	[CI] test chunked prefill more (#5798)	2025-04-28 10:57:17 -07:00
Lianmin Zheng	daed453e84	[CI] Improve github summary & enable fa3 for more models (#5796)	2025-04-27 15:29:46 -07:00
Baizhou Zhang	f9fb33efc3	Add 8-GPU Test for Deepseek-V3 (#5691) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2025-04-27 12:46:12 -07:00
Lianmin Zheng	35ca04d2fa	[CI] fix port conflicts (#5789)	2025-04-27 05:17:44 -07:00
Stefan He	408ba02218	Add Llama 4 to FA3 test (#5509)	2025-04-26 19:49:31 -07:00
Yineng Zhang	0961feefca	feat: use flashinfer jit package (#5547)	2025-04-19 00:28:39 -07:00
Lianmin Zheng	838fa0f218	[minor] cleanup cmakelists.txt (#5420)	2025-04-15 07:07:07 -07:00
Yineng Zhang	3289c1207d	Update the retry count (#5051)	2025-04-03 17:07:38 -07:00
Lianmin Zheng	f842853a40	Fix the timeout for unit-test-2-gpu in pr-test.yml (#4927)	2025-03-30 12:15:40 -07:00

140 Commits