10 Commits

Author SHA1 Message Date
Byron Hsu
96be97bfff Minor PD style fix (#7215) 2025-06-15 16:12:12 -07:00
Byron Hsu
88f9c347b2 [PD] use int32 for kv indices & get num_reserved_decode_tokens from server_args (#7214) 2025-06-15 11:51:03 -07:00
Byron Hsu
7d316991b2 [PD] Update prefill.py (#7190) 2025-06-14 15:59:54 -07:00
ishandhanani
f1569876d5 feat: add direct routing strategy to DP worker (#6884) 2025-06-09 11:44:05 -07:00
dongmao zhang
c459536b0f [PD] bug fix: Update status if nixl receiver send a a dummy req. (#6720) 2025-05-29 00:01:56 -07:00
Trevor Morris
e806f708c9 [PD] Make bootstrap code common between NIXL and Mooncake (#6473) 2025-05-27 12:47:38 -07:00
Yuan Luo
30ca18f423 Refactor group_concurrent_contiguous in NIXL (#6214) 2025-05-21 11:55:04 +08:00
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
shangmingc
f1c896007a [PD] Add support for different TP sizes per DP rank (#5922) 2025-05-12 13:55:42 -07:00
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Yongtong Wu
97ac42b634 [PD] NIXL backend Prefill TP & Decode TP+DP (#5681) 2025-05-02 22:14:03 +08:00
Trevor Morris
4dce1cc608 [PD] Add NIXL transfer backend (#5477) 2025-04-22 01:36:12 +08:00