Byron Hsu | 96be97bfff | Minor PD style fix (#7215) | 2025-06-15 16:12:12 -07:00 |
Byron Hsu | 88f9c347b2 | [PD] use int32 for kv indices & get num_reserved_decode_tokens from server_args (#7214) | 2025-06-15 11:51:03 -07:00 |
Byron Hsu | 7d316991b2 | [PD] Update prefill.py (#7190) | 2025-06-14 15:59:54 -07:00 |
ishandhanani | f1569876d5 | feat: add direct routing strategy to DP worker (#6884) | 2025-06-09 11:44:05 -07:00 |
dongmao zhang | c459536b0f | [PD] bug fix: Update status if nixl receiver send a a dummy req. (#6720) | 2025-05-29 00:01:56 -07:00 |
Trevor Morris | e806f708c9 | [PD] Make bootstrap code common between NIXL and Mooncake (#6473) | 2025-05-27 12:47:38 -07:00 |
Yuan Luo | 30ca18f423 | Refactor group_concurrent_contiguous in NIXL (#6214); Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com> | 2025-05-21 11:55:04 +08:00 |
shangmingc | f1c896007a | [PD] Add support for different TP sizes per DP rank (#5922); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-05-12 13:55:42 -07:00 |
Yongtong Wu | 97ac42b634 | [PD] NIXL backend Prefill TP & Decode TP+DP (#5681) | 2025-05-02 22:14:03 +08:00 |
Trevor Morris | 4dce1cc608 | [PD] Add NIXL transfer backend (#5477) | 2025-04-22 01:36:12 +08:00 |