[P/D][BugFix]Fix proxy format processing errors & Layerwise connector performance optimization (#4043)
### What this PR does / why we need it?
1. Fix proxy format processing errors.
2. Layer-wise connector performance optimization.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
By CI.
- vLLM version: v0.11.0
- vLLM main: 83f478bb19
---------
Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
Co-authored-by: wangxiaoteng <wangxiaoteng@huawei.com>
@@ -32,6 +32,14 @@ class TestKVCacheSendingLayerThread(unittest.TestCase):
         self.engine = MagicMock()
         self.engine.register_memory.return_value = 0
         self.engine.batch_transfer_sync_write.return_value = 1
+        self._patcher_cs = patch(
+            'vllm_ascend.distributed.mooncake_layerwise_connector.torch_npu.npu.current_stream'
+        )
+        self.mock_current_stream = self._patcher_cs.start()
+        self.addCleanup(self._patcher_cs.stop)
+        fake_stream = MagicMock(name="FakeStream")
+        fake_stream.synchronize = MagicMock()
+        self.mock_current_stream.return_value = fake_stream

         self.first_kv_cache = torch.zeros((2, 2, 2, 8),
                                           dtype=torch.float32,