Commit Graph

66 Commits

Author SHA1 Message Date
Chen Shengzhi
f1cf6eefbe [Fix] Check the device backend before calling empty_cache function (#4212) 2025-03-12 21:28:48 -07:00
Yineng Zhang
d1da58e275 unify is_cuda and is_hip (#4321) 2025-03-11 18:12:56 -07:00
Lianmin Zheng
a3ab768a2b Clean up custom allreduce (#4029) 2025-03-03 04:59:53 -08:00
Hubert Lu
9cf4077294 Enable custom AR for AMD GPUs and maintain it in sgl-kernel (#3406) 2025-03-02 15:19:06 -08:00
Ke Bao
00ce7e311c Fix all gather torch compile (#3992) 2025-03-02 00:41:38 -08:00
Co-authored-by: yizhang2077 <1109276519@qq.com>
Nicolas Castet
127998cc41 Fix allgather ops inside cuda graphs (#3709) 2025-02-25 08:39:10 -08:00
Shenggui Li
c0bb9eb3b3 [improve] made timeout configurable (#3803) 2025-02-25 00:26:08 -08:00
yigex
66283dbc0c [Fix] Not skip NVML Check on AMD Platform (#3135) 2025-01-25 21:33:51 -08:00
Lianmin Zheng
ea535dc574 Revert "disable custom allreduce on HIP" (#3067) 2025-01-22 21:33:35 -08:00
Hui Liu
ddc2001fb0 disable custom allreduce on HIP (#3058) 2025-01-22 13:57:22 -08:00
Lianmin Zheng
73401fd016 Sync distributed package from vllm 0.6.4.post1 (#3010) 2025-01-20 04:57:14 -08:00
Lianmin Zheng
89cd923581 Roll back to use vllm custom allreduce (#3006) 2025-01-20 04:03:15 -08:00
Chaitanya Sri Krishna Lolla
1a820e38a2 Remove dependency of pynvml on ROCm (#2995) 2025-01-20 13:00:35 +08:00
yizhang2077
24cafe3177 add config to switch from vllm custom allreduce to sgl_kernel custom allreduce (#2981) 2025-01-19 22:30:38 +08:00
yizhang2077
767c9dec03 adapt custom allreduce for tensorrt llm (#2511) 2025-01-16 04:57:35 +08:00
yizhang2077
d5b95cbb53 adapt vllm distributed module to sglang (#2244) 2024-12-01 15:54:52 +08:00
Co-authored-by: Yineng Zhang <me@zhyncs.com>