GoCHug 80e5812b39 [BugFix] Add support for rotary_dim parameter when using partial rope in rotary_embedding (#6581)
### What this PR does / why we need it?
Issue: if a model such as Ling-1T uses partial rotary position
embedding (partial RoPE) but its `config.json` specifies the
`rotary_dim` parameter instead of `partial_rotary_factor`, model
initialization fails with a `RuntimeError`: "The expanded size of the
tensor (128) must match the existing size (64) at non-singleton
dimension 3."

![Screenshot of the RuntimeError](https://github.com/user-attachments/assets/ba03d7df-ecba-4d6f-9ec1-4dc55f59799e)

This change adds support for the `rotary_dim` parameter in
`vllm_ascend/ops/rotary_embedding.py` to correctly calculate the
`rope_dim`, resolving the tensor size mismatch error.
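The fix boils down to how the rotary dimension is derived from the model
config. A minimal sketch of that resolution logic, assuming common
Hugging Face config field names (`rotary_dim`, `partial_rotary_factor`);
the function name `resolve_rotary_dim` is hypothetical and the actual
code in `vllm_ascend/ops/rotary_embedding.py` may differ:

```python
def resolve_rotary_dim(config: dict, head_dim: int) -> int:
    """Return the number of head dimensions that receive RoPE.

    Sketch only: prefers an explicit `rotary_dim` from config.json,
    falling back to `partial_rotary_factor` (1.0 means full RoPE).
    """
    rotary_dim = config.get("rotary_dim")
    if rotary_dim is not None:
        # config.json gives the rotary dimension directly (Ling-1T case).
        return rotary_dim
    # Otherwise scale the head dimension by the partial rotary factor.
    factor = config.get("partial_rotary_factor", 1.0)
    return int(head_dim * factor)
```

Without the `rotary_dim` branch, a Ling-1T-style config would fall
through to the default factor of 1.0, producing a rotary dimension of
128 instead of 64 and hence the tensor size mismatch above.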

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

The patch was tested successfully with the Ling-1T model, which
previously triggered the error.

- vLLM version: v0.15.0
- vLLM main:
d7e17aaacd

Signed-off-by: GoCHug <93277779+GoCHug@users.noreply.github.com>
2026-02-09 17:17:52 +08:00