Commit Graph

25 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Sai Enduri | d0510f08fe | Revert "Fix different device type adjustment in PP" (#8141) | 2025-07-18 01:12:11 -07:00 |
| Qiaolin Yu | 3bc43c683e | Fix different device type adjustment in PP (#7760) | 2025-07-15 19:37:14 -07:00 |
| ykcombat | d4d0c7c367 | [Feature] TP Group Switching for PD-Multiplexing (#7653) | 2025-07-15 02:35:46 +08:00 |
| TianyuZhang1214 | 0099172327 | feat: use D2D instead of H2H in pp (#7673)<br>Co-authored-by: alpha-baby <fujianhao1997@qq.com> | 2025-07-03 10:58:50 -07:00 |
| Chunyuan WU | 8f844db699 | [CPU] fix all_reduce and all_gather (#6770)<br>Co-authored-by: blzheng <beilei.zheng@intel.com> | 2025-07-02 22:39:45 -07:00 |
| Cheng Wan | 8609e637a9 | Fix All-Gather under world size one (#7219) | 2025-06-20 14:57:34 -07:00 |
| zyksir | 8e3797be1c | support 1 shot allreduce in 1-node and 2-node using mscclpp (#6277) | 2025-06-04 22:11:24 -07:00 |
| Baizhou Zhang | bdaefbbfbd | Add environment flag for disabling message queue broadcaster (#6403) | 2025-05-26 22:32:41 -07:00 |
| Lianmin Zheng | e8e18dcdcc | Revert "fix some typos" (#6244) | 2025-05-12 12:53:26 -07:00 |
| applesaucethebun | d738ab52f8 | fix some typos (#6209)<br>Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-13 01:42:38 +08:00 |
| Song Zhang | 00c2c1f08b | [Feature] Support for Ascend NPU backend (#3853)<br>Signed-off-by: Song Zhang <gepin.zs@antgroup.com><br>Co-authored-by: 22dimensions <waitingwind@foxmail.com> | 2025-05-06 20:32:53 -07:00 |
| Lianmin Zheng | 9c088829ee | Revert "Use device_id in dist init to reduce NCCL communicator warmup & creation overhead" (#5786) | 2025-04-27 04:03:02 -07:00 |
| Wenxuan Tan | dfb322642f | Use device_id in dist init to reduce NCCL communicator warmup & creation overhead (#5728) | 2025-04-26 18:11:09 -07:00 |
| Kebe | 10a9ab7b07 | Fix error due to CustomAllreduce setup failure (#4815)<br>Signed-off-by: Kebe <mail@kebe7jun.com> | 2025-03-27 18:52:10 -07:00 |
| tarinkk | 7f19e083c1 | Support (1 <= dp < tp) in the dp attention in DeepEP (#4770)<br>Co-authored-by: Cheng Wan <cwan39@gatech.edu> | 2025-03-27 17:09:35 -07:00 |
| Xiaoyu Zhang | 04e3ff6975 | Support compressed tensors fp8w8a8 (#4743) | 2025-03-26 13:21:25 -07:00 |
| Cheng Wan | 3196999f63 | Reduce computation and communication in DP attention (#4521) | 2025-03-18 13:41:36 -07:00 |
| Chen Shengzhi | f1cf6eefbe | [Fix] Check the device backend before calling empty_cache function (#4212) | 2025-03-12 21:28:48 -07:00 |
| Ke Bao | 00ce7e311c | Fix all gather torch compile (#3992)<br>Co-authored-by: yizhang2077 <1109276519@qq.com> | 2025-03-02 00:41:38 -08:00 |
| Nicolas Castet | 127998cc41 | Fix allgather ops inside cuda graphs (#3709) | 2025-02-25 08:39:10 -08:00 |
| Shenggui Li | c0bb9eb3b3 | [improve] made timeout configurable (#3803) | 2025-02-25 00:26:08 -08:00 |
| Lianmin Zheng | ea535dc574 | Revert "disable custom allreduce on HIP" (#3067) | 2025-01-22 21:33:35 -08:00 |
| Hui Liu | ddc2001fb0 | disable custom allreduce on HIP (#3058) | 2025-01-22 13:57:22 -08:00 |
| Lianmin Zheng | 73401fd016 | Sync distributed package from vllm 0.6.4.post1 (#3010) | 2025-01-20 04:57:14 -08:00 |
| yizhang2077 | d5b95cbb53 | adapt vllm distributed module to sglang (#2244)<br>Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2024-12-01 15:54:52 +08:00 |