Files
xc-llm-ascend/vllm_ascend/ops
socrahow c3fee66806 [Model] Optimizing gemma3 model's GemmaRMSNorm function (#3151)
### What this PR does / why we need it?
Before optimizing,the rmsnorm time in one decoding is 531.5us. After
optimizing,the rmsnorm time in one decoding is 105us.
I closed the previous
PR(https://github.com/vllm-project/vllm-ascend/pull/2456) by mistake and
resubmitted it now
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main:
b1068903fd

---------

Signed-off-by: socrahow <suzihao4@h-partners.com>
2025-09-28 21:19:10 +08:00
..
2025-09-18 14:09:19 +08:00