[BUILD] Upgrade torch-npu to 2.5.1 (#661)

### What this PR does / why we need it?
torch-npu 2.5.1 is now published:
https://pypi.org/project/torch-npu/2.5.1/
It's time to remove all torch-npu dev versions from the vllm-ascend code base.

### Does this PR introduce _any_ user-facing change?
Yes, vllm-ascend now depends on torch-npu 2.5.1.

### How was this patch tested?
- [ ] CI passed
- [ ] Manually test
- [ ] Grep all `dev2025`
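
The "Grep all `dev2025`" item from the test plan can be sketched as a one-liner; the throwaway directory below is only for illustration (in practice you would run the `grep` from the vllm-ascend repository root), and any hit means a dev pin was left behind:

```shell
# Illustrative setup: a scratch directory standing in for the repo checkout.
tmp=$(mktemp -d)
echo "torch-npu==2.5.1.dev20250320" > "$tmp/requirements.txt"

# The actual check: recursively grep for leftover dev-version pins.
grep -rn "dev2025" "$tmp"

rm -rf "$tmp"
```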

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Author: Yikun Jiang
Date:   2025-04-27 17:28:29 +08:00 (committed via GitHub)
Parent: fa4a5d980e
Commit: 2e20797934
14 changed files with 43 additions and 88 deletions


@@ -86,7 +86,7 @@ class CustomQwen2VisionAttention(Qwen2VisionAttention):
         context_layer = torch.torch.empty_like(q)
-        # operator requires pta version >= 2.5.1.dev20250226
+        # operator requires pta version >= 2.5.1
         torch_npu._npu_flash_attention_unpad(
             query=q,
             key=k,
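
The comment change above works because, under PEP 440 version ordering, a `.devN` build such as `2.5.1.dev20250226` sorts *before* the final `2.5.1` release, so the final release satisfies the old requirement. A minimal sketch of that ordering for this narrow version shape (the helper names are hypothetical, not part of the PR; real code would use `packaging.version.Version`):

```python
import re

def _key(version: str):
    # Split a version like "2.5.1.dev20250226" into its release tuple and an
    # optional .devN suffix (handles only this small subset of PEP 440).
    m = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\.dev(\d+))?", version)
    if m is None:
        raise ValueError(f"unsupported version string: {version}")
    release = tuple(int(part) for part in m.group(1).split("."))
    dev = m.group(2)
    # Final releases (dev is None) sort after any .dev pre-release
    # of the same release tuple.
    return (release, 1 if dev is None else 0, int(dev or 0))

def at_least(installed: str, required: str) -> bool:
    # True if the installed version satisfies ">= required".
    return _key(installed) >= _key(required)
```

With this ordering, an installed `2.5.1` satisfies both the old `>= 2.5.1.dev20250226` bound and the new `>= 2.5.1` one, while a `2.5.1.dev*` build only satisfies the former.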