xc-llm-ascend

Files

shiyuan680 1c4a0468ee 【OPS】qwen3-next support triton chunk_gated_delta_rule ops (#4070)

### What this PR does / why we need it?
Add support for the Triton chunk_gated_delta_rule ops in qwen3-next.

### co-owners
@OsirisDuan

- vLLM version: v0.11.2

Signed-off-by: shiyuan680 <917935075@qq.com>

2025-11-28 20:55:43 +08:00

__init__.py

[MM][Model][Perf] Remove Qwen2.5-VL modeling files and add patch for VisionAttention (#4349)

2025-11-28 14:23:00 +08:00

patch_distributed.py

[Refactor] refactor patch module (#3555)

2025-10-21 20:19:46 +08:00

patch_minicpm.py

[Refactor] refactor patch module (#3555)

2025-10-21 20:19:46 +08:00

patch_multimodal_merge.py

[Refactor] refactor patch module (#3555)

2025-10-21 20:19:46 +08:00

patch_qwen2_5_vl.py

[MM][Model][Perf] Remove Qwen2.5-VL modeling files and add patch for VisionAttention (#4349)

2025-11-28 14:23:00 +08:00

patch_roberta.py

[1/N][Refactor] Refactor code to adapt with vllm main (#3612)

2025-10-24 16:55:08 +08:00

patch_rope.py

[MM][Model][Perf] Remove Qwen2.5-VL modeling files and add patch for VisionAttention (#4349)

2025-11-28 14:23:00 +08:00

patch_triton.py

【OPS】qwen3-next support triton chunk_gated_delta_rule ops (#4070)

2025-11-28 20:55:43 +08:00

patch_weight_loader.py

Drop 0.11.0 support (#4377)

2025-11-24 17:08:20 +08:00