sglang

Author	SHA1	Message	Date
Chen Shengzhi	f1cf6eefbe	[Fix] Check the device backend before calling empty_cache function (#4212)	2025-03-12 21:28:48 -07:00
Yineng Zhang	d1da58e275	unify is_cuda and is_hip (#4321)	2025-03-11 18:12:56 -07:00
Lianmin Zheng	a3ab768a2b	Clean up custom allreduce (#4029)	2025-03-03 04:59:53 -08:00
Hubert Lu	9cf4077294	Enable custom AR for AMD GPUs and maintain it in sgl-kernel (#3406)	2025-03-02 15:19:06 -08:00
Ke Bao	00ce7e311c	Fix all gather torch compile (#3992) Co-authored-by: yizhang2077 <1109276519@qq.com>	2025-03-02 00:41:38 -08:00
Nicolas Castet	127998cc41	Fix allgather ops inside cuda graphs (#3709)	2025-02-25 08:39:10 -08:00
Shenggui Li	c0bb9eb3b3	[improve] made timeout configurable (#3803)	2025-02-25 00:26:08 -08:00
yigex	66283dbc0c	[Fix] Not skip NVML Check on AMD Platform (#3135)	2025-01-25 21:33:51 -08:00
Lianmin Zheng	ea535dc574	Revert "disable custom allreduce on HIP" (#3067)	2025-01-22 21:33:35 -08:00
Hui Liu	ddc2001fb0	disable custom allreduce on HIP (#3058)	2025-01-22 13:57:22 -08:00
Lianmin Zheng	73401fd016	Sync distributed package from vllm 0.6.4.post1 (#3010)	2025-01-20 04:57:14 -08:00
Lianmin Zheng	89cd923581	Roll back to use vllm custom allreduce (#3006)	2025-01-20 04:03:15 -08:00
Chaitanya Sri Krishna Lolla	1a820e38a2	Remove dependency of pynvml on ROCm (#2995)	2025-01-20 13:00:35 +08:00
yizhang2077	24cafe3177	add config to swtich from vllm custom allreduce to sgl_kernel custom allreduce (#2981)	2025-01-19 22:30:38 +08:00
yizhang2077	767c9dec03	adapt custom allreduce for tensorrt llm (#2511)	2025-01-16 04:57:35 +08:00
yizhang2077	d5b95cbb53	adapt vllm distributed module to sglang (#2244) Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-12-01 15:54:52 +08:00

66 Commits