[Bugfix] Fix sleep mode level 2 (#1376)

### What this PR does / why we need it?

For sleep mode level 2, we discard both the model weights and the kv_cache. The problem is that when we discard the weights, we also discard the tensors that hold model state, i.e. those returned by `model.named_buffers()`, such as `running_mean` / `running_var` in BatchNorm and the rope cos-sin cache. If the weights are updated on wake-up but the buffers are not, the model is left with invalid state, which can lead to hard-to-diagnose issues.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main: 5963b98b46

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
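
To illustrate the buffer problem described above, here is a minimal sketch in plain PyTorch of saving buffers before memory is discarded and restoring them afterwards. It is not the code from this PR: the helper names `save_buffers` and `restore_buffers` are hypothetical, the "discard" step is simulated by zeroing the buffers, and the example runs on CPU rather than on an NPU. The unit test below only shows that the worker carries a `_sleep_saved_buffers` dict, which suggests a similar save-and-restore approach.

```python
import torch
from torch import nn


def save_buffers(model: nn.Module) -> dict:
    """Hypothetical helper: copy every registered buffer to CPU."""
    return {
        name: buf.detach().cpu().clone()
        for name, buf in model.named_buffers()
    }


def restore_buffers(model: nn.Module, saved: dict) -> None:
    """Hypothetical helper: copy saved values back into the buffers in place."""
    with torch.no_grad():
        for name, buf in model.named_buffers():
            if name in saved:
                buf.copy_(saved[name])


# A tiny model whose BatchNorm layer owns running_mean / running_var buffers.
model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))
model.train()
model(torch.randn(8, 4))  # one forward pass updates the running statistics

saved = save_buffers(model)

# Simulate discarding device memory (assumption: zeroing stands in for it).
with torch.no_grad():
    for _, buf in model.named_buffers():
        buf.zero_()

restore_buffers(model, saved)
restored = {name: buf.clone() for name, buf in model.named_buffers()}
```

Buffers are not returned by `model.named_parameters()`, so a weight reload that only iterates over parameters would leave them untouched, which is why they need separate handling in a sketch like this.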
2025-09-18 19:51:52 +08:00
parent f4e3d22432
commit 01592515b8
2 changed files with 19 additions and 1 deletions
--- a/tests/ut/worker/test_worker_v1.py
+++ b/tests/ut/worker/test_worker_v1.py
@@ -258,7 +258,7 @@ class TestNPUWorker(TestBase):
        # Create worker mock
        with patch.object(NPUWorker, "__init__", lambda x, **kwargs: None):
            worker = NPUWorker()
-
+            worker._sleep_saved_buffers = {}
            # Test wake_up method
            worker.wake_up(tags=["test_tag"])