[bugfix_v0.11.0-dev] layerwise D first plan (#3907)

### What this PR does / why we need it?
Refactored the layerwise code to send to the D (decode) node first, which
prevents P (prefill) nodes from hanging on communication timeouts when
DP > 1.
---------

Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
Signed-off-by: liziyu <liziyu16@huawei.com>
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Co-authored-by: nwpu-zxr <zhouxuerong2@huawei.com>
Co-authored-by: liziyu <liziyu16@huawei.com>
wangxiaoteng888
2025-10-30 22:21:11 +08:00
committed by GitHub
parent d5a9aba03f
commit af7a56550b
5 changed files with 965 additions and 1356 deletions


@@ -1136,4 +1136,4 @@ class TestMooncakeConnectorWorker(unittest.TestCase):
 if __name__ == '__main__':
-    unittest.main()
+    unittest.main()

File diff suppressed because it is too large