[P/D][BugFix]Fix proxy format processing errors & Layerwise connector performance optimization (#4043)

### What this PR does / why we need it?
1. Fix proxy format processing errors.
2. Layer-wise connector performance optimization.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By CI.

- vLLM version: v0.11.0
- vLLM main: 83f478bb19

---------

Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
Co-authored-by: wangxiaoteng <wangxiaoteng@huawei.com>
2025-11-08 18:44:06 +08:00
parent 24d6314718
commit 1d81a289d0
3 changed files with 16 additions and 3 deletions
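The test change in this commit starts a `patch` in `setUp`, registers its `stop` with `addCleanup`, and makes the patched `current_stream` return a mock stream. A minimal, device-free sketch of that pattern follows. The names `device`, `wait_for_device`, and `WaitForDeviceTest` are hypothetical stand-ins invented for illustration; they are not part of `torch_npu` or of this repository.

```python
import types
import unittest
from unittest.mock import MagicMock, patch


def _no_device():
    # Simulates a runtime call that cannot succeed without real hardware.
    raise RuntimeError("no device available")


# Hypothetical stand-in for a device runtime namespace (assumption, not a real API).
device = types.SimpleNamespace(current_stream=_no_device)


def wait_for_device():
    # Code under test: synchronizes the current stream before continuing.
    device.current_stream().synchronize()
    return "synced"


class WaitForDeviceTest(unittest.TestCase):

    def setUp(self):
        # Start the patch manually and let unittest stop it, even if setUp
        # or the test body fails later on.
        patcher = patch.object(device, "current_stream")
        self.mock_current_stream = patcher.start()
        self.addCleanup(patcher.stop)
        # Every call to current_stream() now returns this fake stream.
        self.fake_stream = MagicMock(name="FakeStream")
        self.mock_current_stream.return_value = self.fake_stream

    def test_synchronize_is_called(self):
        self.assertEqual(wait_for_device(), "synced")
        self.fake_stream.synchronize.assert_called_once_with()


def run_demo():
    # Runs the test case programmatically and reports overall success.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WaitForDeviceTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    run_demo()
```

Registering `patcher.stop` through `addCleanup` rather than in `tearDown` means the original attribute is restored even when a later line of `setUp` raises, since `tearDown` is skipped in that case.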
--- a/tests/ut/kv_connector/test_mooncake_layerwise_connector.py
+++ b/tests/ut/kv_connector/test_mooncake_layerwise_connector.py
@@ -32,6 +32,14 @@ class TestKVCacheSendingLayerThread(unittest.TestCase):
        self.engine = MagicMock()
        self.engine.register_memory.return_value = 0
        self.engine.batch_transfer_sync_write.return_value = 1
+        self._patcher_cs = patch(
+            'vllm_ascend.distributed.mooncake_layerwise_connector.torch_npu.npu.current_stream'
+        )
+        self.mock_current_stream = self._patcher_cs.start()
+        self.addCleanup(self._patcher_cs.stop)
+        fake_stream = MagicMock(name="FakeStream")
+        fake_stream.synchronize = MagicMock()
+        self.mock_current_stream.return_value = fake_stream

        self.first_kv_cache = torch.zeros((2, 2, 2, 8),
                                          dtype=torch.float32,