[Triton][Config] Add muls_add triton kernel and refactor AscendCompilationConfig (#5518)

### What this PR does / why we need it?

Add the muls_add Triton kernel with its related fusion pass. In addition, this PR refactors `AscendCompilationConfig` and deletes `NpugraphExConfig`.

### Does this PR introduce _any_ user-facing change?

None

### How was this patch tested?

CI passed with the newly added test.

- vLLM version: v0.13.0
- vLLM main: 45c1ca1ca1

---------

Signed-off-by: whx-sjtu <2952154980@qq.com>
2026-03-02 17:54:25 +08:00
parent 8547520726
commit 16c879cdf7
14 changed files with 290 additions and 98 deletions
--- a/docs/source/user_guide/feature_guide/npugraph_ex.md
+++ b/docs/source/user_guide/feature_guide/npugraph_ex.md
@@ -16,8 +16,8 @@ from vllm import LLM
 model = LLM(
    model="path/to/Qwen2-7B-Instruct",
    additional_config={
-        "npugraph_ex_config": {
-            "enable": True,
+        "ascend_compilation_config": {
+            "enable_npugraph_ex": True,
            "enable_static_kernel": False,
        }
    }
@@ -29,7 +29,7 @@ Online example:

 ```shell
 vllm serve Qwen/Qwen2-7B-Instruct
--additional-config '{"npugraph_ex_config":{"enable":true, "enable_static_kernel":false}}'
+--additional-config '{"ascend_compilation_config":{"enable_npugraph_ex":true, "enable_static_kernel":false}}'
 ```

 You can find more details about npugraph_ex [here](https://www.hiascend.com/document/detail/zh/Pytorch/730/modthirdparty/torchairuseguide/torchair_00021.html)
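
For anyone carrying existing settings across this rename, the key mapping shown in the diff can be sketched as a small helper. This is only an illustration: `migrate_additional_config` is a hypothetical name and is not part of vllm-ascend. It assumes exactly what the diff shows (`npugraph_ex_config.enable` becomes `ascend_compilation_config.enable_npugraph_ex`, and `enable_static_kernel` keeps its name). Passing any other legacy key through unchanged, and letting keys already set under `ascend_compilation_config` take precedence, are assumptions made here and not behavior confirmed by the PR.

```python
import copy
import json


def migrate_additional_config(old: dict) -> dict:
    """Hypothetical helper: rewrite a legacy additional_config dict.

    Moves entries from "npugraph_ex_config" into
    "ascend_compilation_config", renaming "enable" to
    "enable_npugraph_ex" as shown in the diff.
    """
    new = copy.deepcopy(old)  # never mutate the caller's dict
    legacy = new.pop("npugraph_ex_config", None)
    if legacy is None:
        return new  # nothing to migrate

    target = new.setdefault("ascend_compilation_config", {})
    for key, value in legacy.items():
        # Only the "enable" key is known to be renamed; other keys are
        # passed through under the same name (an assumption).
        new_key = "enable_npugraph_ex" if key == "enable" else key
        # Assumption: values already set in the new section win.
        target.setdefault(new_key, value)
    return new


if __name__ == "__main__":
    old_config = {
        "npugraph_ex_config": {"enable": True, "enable_static_kernel": False}
    }
    # Print the migrated config as JSON.
    print(json.dumps(migrate_additional_config(old_config)))
```

The printed JSON has the same shape as the value passed to `--additional-config` in the updated online example.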