xc-llm-ascend

Trunrain ba9cda9dfd [Kernel] add custom op MatmulAllreduceAddRmsnorm (#4606)

What this PR does / why we need it?
Adds the custom op MatmulAllreduceAddRmsnorm, a fused operator that
optimizes the Matmul, AllReduce, Add, and RMSNorm sequence for Qwen3 32B.
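
For reviewers unfamiliar with the pattern, the following is a minimal, dependency-free reference sketch of what such a fused sequence computes, not the kernel added by this PR. It assumes the usual meaning of the four steps: each rank multiplies its shard, the partial results are summed across ranks, a residual is added, and the sum is RMS-normalized. The function names, the argument layout, the `eps` default, and the choice to return the pre-normalization sum alongside the normalized output are all assumptions for illustration.

```python
import math


def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def all_reduce_sum(partials):
    """Simulate AllReduce(sum): element-wise sum of every rank's partial result."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]


def rms_norm(x, gamma, eps=1e-6):
    """Scale each row by 1 / sqrt(mean(row^2) + eps), then by gamma."""
    out = []
    for row in x:
        rms = math.sqrt(sum(v * v for v in row) / len(row) + eps)
        out.append([v / rms * g for v, g in zip(row, gamma)])
    return out


def matmul_allreduce_add_rmsnorm(x_shards, w_shards, residual, gamma, eps=1e-6):
    """Hypothetical unfused reference: one (x, w) shard pair per simulated rank."""
    partials = [matmul(x, w) for x, w in zip(x_shards, w_shards)]  # per-rank Matmul
    reduced = all_reduce_sum(partials)                             # AllReduce
    added = [[r + s for r, s in zip(rr, sr)]                       # Add residual
             for rr, sr in zip(reduced, residual)]
    return rms_norm(added, gamma, eps), added                      # RMSNorm


# Two simulated ranks, each holding a 1x2 activation shard and a 2x2 weight shard.
identity = [[1.0, 0.0], [0.0, 1.0]]
normed, added = matmul_allreduce_add_rmsnorm(
    x_shards=[[[1.0, 2.0]], [[3.0, 4.0]]],
    w_shards=[identity, identity],
    residual=[[-1.0, -2.0]],
    gamma=[1.0, 1.0],
)
```

A reference of this kind is mainly useful as a numerical baseline: the output of a fused kernel can be compared against it within a tolerance.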

Does this PR introduce _any_ user-facing change?
No

How was this patch tested?

vLLM version: v0.11.2
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: tongrunze <t00574058@china.huawei.com>
Co-authored-by: tongrunze <t00574058@china.huawei.com>

2025-12-10 09:05:33 +08:00

op_host

[Kernel] add custom op MatmulAllreduceAddRmsnorm (#4606)

2025-12-10 09:05:33 +08:00

op_kernel

[Kernel] add custom op MatmulAllreduceAddRmsnorm (#4606)

2025-12-10 09:05:33 +08:00