fix bug when rotary_dim is not 128 (#2847)

### What this PR does / why we need it?
`torch_npu.npu_apply_rotary_pos_emb` only supports the case where head_size and
rotary_dim both equal 128. An error occurs when running GLM, whose rotary_dim is not 128.

### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?

- vLLM version: main
- vLLM main:
404c85ca72

Signed-off-by: realliujiaxu <realliujiaxu@163.com>
Author: realliujiaxu
Date: 2025-09-12 09:49:36 +08:00
Committed by: GitHub
Parent: f5a97e8fa5
Commit: 778cb72556


@@ -138,8 +138,8 @@ class AscendRotaryEmbedding(RotaryEmbedding):
         forward_context = get_forward_context()
         is_first_layer = forward_context.is_first_layer
         # Generate cos and sin outside layers to avoid repeated calculation.
-        if is_neox_style and \
-                self.head_size == 128:
+        if is_neox_style and self.head_size == 128 and self.cos_sin_cache.shape[
+                -1] == 128:
             if is_first_layer:
                 cos_sin = self.cos_sin_cache.index_select(0, positions)
                 last_dim = cos_sin.size()[-1]
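The guard added above can be sketched as a standalone predicate. This is a minimal illustration, not vLLM code: the function name `can_use_fused_rope` is hypothetical, and it assumes the last dimension of `cos_sin_cache` tracks rotary_dim, so checking it alongside head_size keeps models with partial rotary embeddings (such as GLM) off the fused NPU kernel, which only supports head_size == rotary_dim == 128.

```python
def can_use_fused_rope(is_neox_style: bool, head_size: int,
                       cos_sin_cache_last_dim: int) -> bool:
    """Return True only when the fused NPU rotary kernel's constraints hold.

    Hypothetical helper: torch_npu.npu_apply_rotary_pos_emb requires both
    head_size and rotary_dim (the cache's last dimension here, by
    assumption) to be exactly 128.
    """
    return (is_neox_style
            and head_size == 128
            and cos_sin_cache_last_dim == 128)


# head_size and rotary_dim both 128: fused kernel is usable.
print(can_use_fused_rope(True, 128, 128))   # True
# GLM-style partial rotary (rotary_dim != 128): fall back to the
# generic rotary embedding path instead of the fused kernel.
print(can_use_fused_rope(True, 128, 64))    # False
```

Checking only `head_size`, as the old code did, let GLM-style models reach the fused kernel and fail at runtime; the extra check on the cache's last dimension routes them to the fallback path instead.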