[Kernel] add l2norm triton kernel (#4595)

### What this PR does / why we need it?

This pull request introduces an L2 normalization kernel implemented in Triton, specifically optimized for Ascend NPUs.

### Does this PR introduce _any_ user-facing change?

No, this PR does not introduce any user-facing changes.

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main: bc0a5a0c08

---------

Signed-off-by: Ascendyh <hw7osiris@outlook.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
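
For readers unfamiliar with the operation, the following is a minimal pure-Python sketch of what an L2 normalization computes. It is not the Triton kernel from this PR, whose source is not shown in this part of the diff. The function name `l2norm_reference`, the choice to normalize each row independently, and the `eps` default of `1e-6` are all assumptions made for illustration.

```python
import math


def l2norm_reference(rows, eps=1e-6):
    """Hypothetical reference: y = x / sqrt(sum(x^2) + eps), applied per row.

    `rows` is a list of lists of floats. The per-row layout and the eps
    value are illustrative assumptions, not taken from the PR.
    """
    out = []
    for row in rows:
        # Sum of squares over the row, stabilized by eps to avoid division by zero.
        inv_norm = 1.0 / math.sqrt(sum(v * v for v in row) + eps)
        out.append([v * inv_norm for v in row])
    return out
```

A reference like this could serve as a ground truth when checking a device kernel's output within a numerical tolerance, though whether the tests under `tests/e2e/nightly/ops/triton/` take that approach is not visible here.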
2025-12-25 06:06:18 +08:00
parent e54630e01c
commit a90482803d
4 changed files with 106 additions and 1 deletions
--- a/.github/workflows/_e2e_test.yaml
+++ b/.github/workflows/_e2e_test.yaml
@@ -103,6 +103,7 @@ jobs:
          # We found that if running aclgraph tests in batch, it will cause AclmdlRICaptureBegin error. So we run
          # the test separately.

+          pytest -sv --durations=0 tests/e2e/nightly/ops/triton/
          pytest -sv --durations=0 tests/e2e/singlecard/test_completion_with_prompt_embeds.py
          pytest -sv --durations=0 tests/e2e/singlecard/test_aclgraph_accuracy.py
          pytest -sv --durations=0 tests/e2e/singlecard/test_async_scheduling.py