xc-llm-ascend/ops at 37db0844f50d25bab94c860e83df2b5be01d6a6e

EngineX/xc-llm-ascend

Trunrain ba9cda9dfd [Kernel] add custom op MatmulAllreduceAddRmsnorm (#4606)

What this PR does / why we need it?
Optimization of the fused operator for Qwen3 32B: Matmul, AllReduce,
Add, and RMSNorm
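
For orientation, here is a minimal pure-Python sketch of the math that a Matmul + AllReduce + Add + RMSNorm chain computes when run unfused. It is an assumption-laden reference, not the kernel added by this PR: the function name `matmul_allreduce_add_rmsnorm_ref`, its argument layout, the `eps` default, and the choice to return both the normalized output and the residual sum are all hypothetical, and the all-reduce is simulated by summing per-rank partial results held in one process.

```python
import math

def matmul(a, b):
    # Plain row-by-column matrix product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def all_reduce_sum(partials):
    # Simulated all-reduce: element-wise sum of every rank's partial result.
    rows, cols = len(partials[0]), len(partials[0][0])
    return [[sum(p[i][j] for p in partials) for j in range(cols)] for i in range(rows)]

def rms_norm(x, gamma, eps):
    # RMSNorm per row: x / sqrt(mean(x^2) + eps) * gamma.
    out = []
    for row in x:
        rms = math.sqrt(sum(v * v for v in row) / len(row) + eps)
        out.append([v / rms * g for v, g in zip(row, gamma)])
    return out

def matmul_allreduce_add_rmsnorm_ref(x_shards, w_shards, residual, gamma, eps=1e-6):
    # Hypothetical reference: one (x, w) pair per simulated rank.
    partials = [matmul(x, w) for x, w in zip(x_shards, w_shards)]
    reduced = all_reduce_sum(partials)
    added = [[r + s for r, s in zip(rr, sr)] for rr, sr in zip(reduced, residual)]
    return rms_norm(added, gamma, eps), added
```

A fused kernel would aim to produce the same values while avoiding the intermediate round trips between the four steps; how the actual operator does this is not described in the commit message.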

Does this PR introduce _any_ user-facing change?
No

How was this patch tested?

vLLM version: v0.11.2
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: tongrunze <t00574058@china.huawei.com>
Co-authored-by: tongrunze <t00574058@china.huawei.com>

2025-12-10 09:05:33 +08:00

[Ops][Triton] Add a triton kernel supporting partial rope. (#4413)

2025-12-02 17:10:19 +08:00

__init__.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00

test_batch_matmul_transpose.py

[OPS] add bmm_transpose ops (#3990)

2025-12-01 09:09:51 +08:00

test_bgmv_expand.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00

test_bgmv_shrink.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00

test_dispatch_ffn_combine.py

add dispatch_gmm_combine kernel (#3532)

2025-12-04 23:00:59 +08:00

test_fused_moe.py

【fix】ops gatingtopk fix nightly ci error (#4340)

2025-12-04 20:09:21 +08:00

test_gating_top_k_softmax.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00

test_gmm_swiglu_quant_weight_nz_tensor_list.py

[Kernel] add custom op GmmSwigluQuantWeightNzTensorList (#3804)

2025-11-28 18:06:39 +08:00

test_grouped_matmul_swiglu_quant.py

[feature] Add Custom Op grouped_matmul_swiglu_quant (#4431)

2025-11-27 21:56:18 +08:00

test_matmul_allreduce_add_rmsnorm.py

[Kernel] add custom op MatmulAllreduceAddRmsnorm (#4606)

2025-12-10 09:05:33 +08:00

test_mla_preprocess_qdown.py

mlapo add qdown output (#4707)

2025-12-06 11:18:53 +08:00

test_mla_preprocess.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00

test_rotary_embedding.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00

test_vocabparallelembedding.py

[CI] Add custom op to nightly (#3765)

2025-10-27 14:07:03 +08:00