[Triton][Config] Add muls_add triton kernel and refactor AscendCompilationConfig (#5518)
### What this PR does / why we need it?
Add a `muls_add` Triton kernel with its related fusion pass. This PR also
refactors `AscendCompilationConfig` and deletes `NpugraphExConfig`.
### Does this PR introduce _any_ user-facing change?
None
### How was this patch tested?
CI passed with the newly added test.
- vLLM version: v0.13.0
- vLLM main: 45c1ca1ca1
---------
Signed-off-by: whx-sjtu <2952154980@qq.com>
@@ -16,8 +16,8 @@ from vllm import LLM
 model = LLM(
     model="path/to/Qwen2-7B-Instruct",
     additional_config={
-        "npugraph_ex_config": {
-            "enable": True,
+        "ascend_compilation_config": {
+            "enable_npugraph_ex": True,
             "enable_static_kernel": False,
         }
     }
@@ -29,7 +29,7 @@ Online example:

 ```shell
 vllm serve Qwen/Qwen2-7B-Instruct
---additional-config '{"npugraph_ex_config":{"enable":true, "enable_static_kernel":false}}'
+--additional-config '{"ascend_compilation_config":{"enable_npugraph_ex":true, "enable_static_kernel":false}}'
 ```

 You can find more details about npugraph_ex [here](https://www.hiascend.com/document/detail/zh/Pytorch/730/modthirdparty/torchairuseguide/torchair_00021.html)
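
For callers that still pass the old `npugraph_ex_config` block, the key rename in this diff can be sketched as a small conversion step. This is only an illustration: `migrate_additional_config` is a hypothetical helper, not part of vLLM or vllm-ascend. It assumes `enable` maps to `enable_npugraph_ex` (as the diff shows) and that any other key, such as `enable_static_kernel`, keeps its name under `ascend_compilation_config`.

```python
import json


def migrate_additional_config(old: dict) -> dict:
    """Hypothetical helper: move legacy npugraph_ex_config keys
    under ascend_compilation_config, following the rename in this diff."""
    # Copy everything except the legacy block.
    new = {k: v for k, v in old.items() if k != "npugraph_ex_config"}
    legacy = old.get("npugraph_ex_config")
    if legacy is None:
        return new

    compilation = dict(new.get("ascend_compilation_config", {}))
    for key, value in legacy.items():
        # "enable" became "enable_npugraph_ex"; other keys are assumed
        # to carry over unchanged.
        target = "enable_npugraph_ex" if key == "enable" else key
        # Values already set in the new layout take precedence.
        compilation.setdefault(target, value)
    new["ascend_compilation_config"] = compilation
    return new


old_config = {
    "npugraph_ex_config": {"enable": True, "enable_static_kernel": False}
}
new_config = migrate_additional_config(old_config)

# JSON string suitable for the --additional-config CLI argument.
cli_value = json.dumps(new_config)
print(cli_value)
```

The resulting dict can be passed as `additional_config` in the offline example, and `cli_value` as the argument to `--additional-config` in the online one.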