xc-llm-ascend/vllm_ascend
Angazenn 27e0f2c035 [Perf]Add YaRN custom op (#3355)
### What this PR does / why we need it?
YaRN scaling is used to improve long-sequence accuracy for models like Qwen3.
In vLLM, YaRN scaling is implemented by the `YaRNScalingRotaryEmbedding` class, which
inherits from the original `RotaryEmbedding`. Although
`YaRNScalingRotaryEmbedding` does not override the `forward` function of
`RotaryEmbedding`, using YaRN on NPU still runs into the native
implementation of `forward` in `RotaryEmbedding` rather than `forward_oot`
in vLLM-Ascend. Thus I register another custom op here to enable the OOT
(out-of-tree) implementation for YaRN in vLLM-Ascend, similar to #3151.
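
To illustrate why the subclass needs its own registration, here is a minimal, self-contained sketch. It is a toy model, not vLLM's or vLLM-Ascend's actual code: it assumes that out-of-tree overrides are looked up by the exact class name at instantiation time, which is what the behaviour described above suggests. `OOT_REGISTRY`, `register_oot`, and both `Ascend*` classes are hypothetical stand-ins.

```python
# Toy model of name-keyed out-of-tree (OOT) op dispatch.
# Assumption: overrides are looked up by the exact class name.
# OOT_REGISTRY, register_oot and the Ascend* classes are illustrative only.
OOT_REGISTRY = {}


def register_oot(name):
    """Register the decorated class as the OOT replacement for `name`."""
    def decorator(cls):
        OOT_REGISTRY[name] = cls
        return cls
    return decorator


class CustomOp:
    def __new__(cls):
        # A subclass has a different __name__, so it does not
        # pick up the override registered for its parent.
        target = OOT_REGISTRY.get(cls.__name__, cls)
        return super().__new__(target)

    def forward(self, x):
        # Use the OOT path when the instance provides one.
        if hasattr(self, "forward_oot"):
            return self.forward_oot(x)
        return self.forward_native(x)


class RotaryEmbedding(CustomOp):
    def forward_native(self, x):
        return "native"


class YaRNScalingRotaryEmbedding(RotaryEmbedding):
    pass  # inherits forward unchanged


@register_oot("RotaryEmbedding")
class AscendRotaryEmbedding(RotaryEmbedding):
    def forward_oot(self, x):
        return "oot"


# Only the parent is registered: YaRN still takes the native path.
before = YaRNScalingRotaryEmbedding().forward(None)


@register_oot("YaRNScalingRotaryEmbedding")
class AscendYaRNRotaryEmbedding(YaRNScalingRotaryEmbedding):
    def forward_oot(self, x):
        return "oot"


# With a second registration, YaRN reaches the OOT path as well.
after = YaRNScalingRotaryEmbedding().forward(None)
```

In this sketch `before` is taken with only the parent class registered and `after` with the YaRN subclass registered too, mirroring the extra registration this PR describes.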

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Angazenn <supperccell@163.com>
2025-10-11 08:36:20 +08:00