[Kernel] add custom op DispatchGmmCombineDecode (#4139)

#### What this PR does / why we need it?

Add the custom op API DispatchGmmCombineDecode for A3, including the kernel implementation, Python API, and pytest tests.

- vLLM version: v0.11.0
- vLLM main: 24d6314718
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: wangqiankun <wangqiankun13@huawei.com>
Co-authored-by: wangqiankun <wangqiankun13@huawei.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-06 17:33:14 +08:00
parent cb42564942
commit 4bd1030842
29 changed files with 7851 additions and 27 deletions
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -163,6 +163,7 @@ cd ..
 ```

 vllm-ascend will build custom operators by default. If you don't want to build it, set `COMPILE_CUSTOM_KERNELS=0` environment to disable it.
+If you are building custom operators for Atlas A3, you should run `git submodule update --init --recursive` manually, or ensure your environment has Internet access.
 :::

 ```{note}