qwen3_next add triton ops : fused_qkvzba_split_reshape (#4788)

### What this PR does / why we need it?
Add the Triton op `fused_qkvzba_split_reshape_cat` for the qwen3_next GatedDeltaNet.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Unit tests (UT).
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>
ZT-AIA
2025-12-18 11:31:04 +08:00
committed by GitHub
parent 07014e2101
commit 39fb9e7c83
4 changed files with 237 additions and 1 deletions

@@ -32,5 +32,6 @@ import vllm_ascend.patch.worker.patch_qwen2_5_vl # noqa
import vllm_ascend.patch.worker.patch_qwen2_5_omni # noqa
import vllm_ascend.patch.worker.patch_qwen3_vl # noqa
import vllm_ascend.patch.worker.patch_rope # noqa
import vllm_ascend.patch.worker.patch_qwen3_next # noqa
import vllm_ascend.patch.worker.patch_qwen3_next_mtp # noqa
import vllm_ascend.patch.worker.patch_rejection_sampler # noqa
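
The hunk above only registers patch modules by importing them, and the `# noqa` markers suggest the imports are kept for their side effects rather than for any name they export. Below is a minimal sketch of that pattern, assuming each patch module rebinds an upstream attribute when its body runs on first import. The names `upstream_ops`, `patch_example`, and `split_reshape` are hypothetical stand-ins for illustration and are not taken from vllm_ascend or vLLM.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical stand-in for an upstream module exposing a reference op.
upstream = types.ModuleType("upstream_ops")


def _reference_split_reshape(x):
    return ("reference", x)


upstream.split_reshape = _reference_split_reshape
sys.modules["upstream_ops"] = upstream

# Hypothetical patch module: its body runs once, on first import,
# and rebinds the upstream attribute to a replacement implementation.
PATCH_SOURCE = '''
import upstream_ops


def _fused_split_reshape(x):
    return ("fused", x)


upstream_ops.split_reshape = _fused_split_reshape
'''

# Write the patch module to a temporary directory and make it importable.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "patch_example.py").write_text(PATCH_SOURCE)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

before = upstream.split_reshape(1)

# Same effect as writing `import patch_example  # noqa` in a package __init__:
# nothing is used from the module, the import exists only to apply the patch.
importlib.import_module("patch_example")

after = upstream.split_reshape(1)
print(before, after)
```

In this sketch the call site never changes; only the import decides which implementation runs. A side-effect import therefore has to happen before the patched attribute is first used, and linters need `# noqa` to leave the apparently unused import alone.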