[Refactor] Cleanup platform (#5566)
### What this PR does / why we need it?
1. add `COMPILATION_PASS_KEY` constant
2. clean up useless platform interfaces `empty_cache`, `synchronize`,
`mem_get_info`, and `clear_npu_memory`
3. rename `CUSTOM_OP_REGISTERED` to `_CUSTOM_OP_REGISTERED`
4. remove the useless env `VLLM_ENABLE_CUDAGRAPH_GC`

NPUPlatform is the interface called by vLLM. Do not call it from within
vllm-ascend.
### Does this PR introduce _any_ user-facing change?
This PR is just a cleanup. All CI should pass.
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main:
7157596103
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
The updated hunk in `TestCaMem` (diff-viewer residue stripped):

```python
@@ -128,10 +128,17 @@ class TestCaMem(PytestBase):
            2000: data2,
        }

        # mock is_pin_memory_available, return False as some machine only has cpu
        with patch(
                "vllm_ascend.device_allocator.camem.NPUPlatform.is_pin_memory_available",
                return_value=False):
            # Mock torch.empty to force pin_memory=False
            original_torch_empty = torch.empty

            def mock_torch_empty(*args, **kwargs):
                # If pin_memory was explicitly set to True, change it to False
                if 'pin_memory' in kwargs and kwargs['pin_memory'] is True:
                    kwargs['pin_memory'] = False
                return original_torch_empty(*args, **kwargs)

            with patch("vllm_ascend.device_allocator.camem.torch.empty",
                       side_effect=mock_torch_empty):
                allocator.sleep(offload_tags="tag1")

            # only offload tag1, other tag2 call unmap_and_release
```
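The test above forces `pin_memory=False` by capturing the original `torch.empty`, wrapping it, and patching the name at the site where it is looked up. A minimal, self-contained sketch of the same kwarg-override pattern, using a hypothetical `CamemStub` class in place of the real camem module and a plain dict-returning stand-in for `torch.empty`:

```python
from unittest.mock import patch

class CamemStub:
    """Hypothetical stand-in for the camem module namespace (illustration only)."""

    @staticmethod
    def empty(size, pin_memory=False):
        # Stand-in for torch.empty: just records what it was called with.
        return {"size": size, "pin_memory": pin_memory}

# Capture the original before patching, so the wrapper can delegate to it.
original_empty = CamemStub.empty

def mock_empty(*args, **kwargs):
    # If pin_memory was explicitly set to True, change it to False
    if kwargs.get("pin_memory") is True:
        kwargs["pin_memory"] = False
    return original_empty(*args, **kwargs)

with patch.object(CamemStub, "empty", side_effect=mock_empty):
    # Caller asks for pinned memory; the wrapper silently downgrades it.
    result = CamemStub.empty(16, pin_memory=True)

print(result)  # {'size': 16, 'pin_memory': False}
```

Outside the `with` block the original attribute is restored automatically, which is why the real test nests `allocator.sleep(...)` inside the patch context.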