Files
xc-llm-ascend/csrc/matmul_allreduce_add_rmsnorm
Trunrain ba9cda9dfd [Kernel] add custom op MatmulAllreduceAddRmsnorm (#4606)
What this PR does / why we need it?
Optimization of the fused operator for Qwen3 32B: Matmul, AllReduce,
Add, and RMSNorm

Does this PR introduce _any_ user-facing change?
No

How was this patch tested?

vLLM version: v0.11.2
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: tongrunze <t00574058@china.huawei.com>
Co-authored-by: tongrunze <t00574058@china.huawei.com>
2025-12-10 09:05:33 +08:00
..