Chen Shengzhi | f1cf6eefbe | [Fix] Check the device backend before calling empty_cache function (#4212) | 2025-03-12 21:28:48 -07:00
Yineng Zhang | d1da58e275 | unify is_cuda and is_hip (#4321) | 2025-03-11 18:12:56 -07:00
Lianmin Zheng | a3ab768a2b | Clean up custom allreduce (#4029) | 2025-03-03 04:59:53 -08:00
Hubert Lu | 9cf4077294 | Enable custom AR for AMD GPUs and maintain it in sgl-kernel (#3406) | 2025-03-02 15:19:06 -08:00
Ke Bao | 00ce7e311c | Fix all gather torch compile (#3992); Co-authored-by: yizhang2077 <1109276519@qq.com> | 2025-03-02 00:41:38 -08:00
Nicolas Castet | 127998cc41 | Fix allgather ops inside cuda graphs (#3709) | 2025-02-25 08:39:10 -08:00
Shenggui Li | c0bb9eb3b3 | [improve] made timeout configurable (#3803) | 2025-02-25 00:26:08 -08:00
yigex | 66283dbc0c | [Fix] Not skip NVML Check on AMD Platform (#3135) | 2025-01-25 21:33:51 -08:00
Lianmin Zheng | ea535dc574 | Revert "disable custom allreduce on HIP" (#3067) | 2025-01-22 21:33:35 -08:00
Hui Liu | ddc2001fb0 | disable custom allreduce on HIP (#3058) | 2025-01-22 13:57:22 -08:00
Lianmin Zheng | 73401fd016 | Sync distributed package from vllm 0.6.4.post1 (#3010) | 2025-01-20 04:57:14 -08:00
Lianmin Zheng | 89cd923581 | Roll back to use vllm custom allreduce (#3006) | 2025-01-20 04:03:15 -08:00
Chaitanya Sri Krishna Lolla | 1a820e38a2 | Remove dependency of pynvml on ROCm (#2995) | 2025-01-20 13:00:35 +08:00
yizhang2077 | 24cafe3177 | add config to swtich from vllm custom allreduce to sgl_kernel custom allreduce (#2981) | 2025-01-19 22:30:38 +08:00
yizhang2077 | 767c9dec03 | adapt custom allreduce for tensorrt llm (#2511) | 2025-01-16 04:57:35 +08:00
yizhang2077 | d5b95cbb53 | adapt vllm distributed module to sglang (#2244); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2024-12-01 15:54:52 +08:00