Add patch_qwen3_5 for triton ops fused_recurrent_gated_delta_rule (#7109)

### What this PR does / why we need it?

The op `torch_npu.npu_recurrent_gated_delta_rule` does not currently
support float32 `ssm_state` inputs, so for Qwen3_5 we temporarily
retain the Triton-based `_forward_core` implementation.
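
As a rough illustration of the reasoning above (not the code added by this PR), the fallback can be thought of as a dtype-based backend choice. The names `UNSUPPORTED_STATE_DTYPES` and `select_gdr_backend` are hypothetical, and dtypes are plain strings so the sketch runs without `torch` or `torch_npu`:

```python
# Hypothetical sketch: names below are illustrative and do not exist in vllm-ascend.

# The only fact taken from the PR description: the fused NPU op does not
# accept float32 ssm_state inputs at the moment.
UNSUPPORTED_STATE_DTYPES = frozenset({"float32"})


def select_gdr_backend(ssm_state_dtype: str) -> str:
    """Pick a backend label for the recurrent gated delta rule.

    Returns "triton" when the state dtype is one the fused NPU op cannot
    handle, and "npu_fused" otherwise.
    """
    if ssm_state_dtype in UNSUPPORTED_STATE_DTYPES:
        return "triton"
    return "npu_fused"


if __name__ == "__main__":
    for dtype in ("float32", "bfloat16"):
        print(dtype, "->", select_gdr_backend(dtype))
```

The actual patch may simply keep the Triton path for Qwen3_5 unconditionally; the sketch only shows why a float32 state rules out the fused op for now.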

---------

Signed-off-by: pppeng <zepengliu912@qq.com>
Signed-off-by: pppeng <60355449+ppppeng@users.noreply.github.com>
pppeng
2026-03-10 23:28:58 +08:00
committed by GitHub
parent a78a00e0b1
commit 0f289fa2a8
4 changed files with 275 additions and 0 deletions


@@ -189,6 +189,8 @@ class AscendEagleProposer(EagleProposer):
     "Qwen2_5_VLForConditionalGeneration",
     "Qwen3VLForConditionalGeneration",
     "Qwen3VLMoeForConditionalGeneration",
+    "Qwen3_5ForConditionalGeneration",
+    "Qwen3_5MoeForConditionalGeneration",
 ]:
     self.model.config.image_token_index = model.config.image_token_id
 elif self.get_model_name(model) == "PixtralForConditionalGeneration":