[feature] Add Custom Op grouped_matmul_swiglu_quant (#4431)
This PR introduces the `EXEC_NPU_CMD` macro, serving as an adapter layer to simplify the invocation of `aclnn` operators on Ascend NPUs.

**Key Changes:**

* **Adapter Layer:** Added `EXEC_NPU_CMD` macro and related dependencies to standardize `aclnn` calls.
* **Operator Support:** Integrated `grouped_matmul_swiglu_quant` as a reference implementation to demonstrate the usage of the new macro.

---

- vLLM version: v0.11.2

---------

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
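For readers unfamiliar with the adapter-macro idea, here is a minimal, self-contained sketch of the general pattern: a macro that pastes `GetWorkspaceSize` onto an operator name, queries the workspace size, allocates it, and then runs the operator. Everything in it is an assumption for illustration. `EXEC_FAKE_CMD`, `fakeOp`, `fakeOpGetWorkspaceSize`, `FakeExecutor`, and `run_fake_op` are hypothetical stand-ins, not the `EXEC_NPU_CMD` macro or any `aclnn` function from this PR, and the real macro may work differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical executor handle; a stand-in for whatever state the real
// operator library carries between the two phases.
struct FakeExecutor {
    int scale;
};

// Phase 1 (hypothetical): report the workspace size and prepare an executor.
int fakeOpGetWorkspaceSize(int scale, uint64_t* workspace_size,
                           FakeExecutor** executor) {
    static FakeExecutor exec;
    exec.scale = scale;
    *workspace_size = 16;  // arbitrary size, for illustration only
    *executor = &exec;
    return 0;              // 0 means success in this sketch
}

// Phase 2 (hypothetical): run the operator using the prepared executor.
int fakeOp(void* workspace, uint64_t workspace_size, FakeExecutor* executor,
           int* out) {
    (void)workspace;
    (void)workspace_size;
    *out = executor->scale * 2;
    return 0;
}

// Illustrative adapter macro: `api##GetWorkspaceSize` builds the phase-1
// function name from the operator name, so callers name the operator once.
#define EXEC_FAKE_CMD(api, status, out_ptr, ...)                            \
    do {                                                                    \
        uint64_t ws_size_ = 0;                                              \
        FakeExecutor* exec_ = nullptr;                                      \
        (status) = api##GetWorkspaceSize(__VA_ARGS__, &ws_size_, &exec_);   \
        if ((status) == 0) {                                                \
            std::vector<unsigned char> ws_(ws_size_);                       \
            (status) = api(ws_.data(), ws_size_, exec_, (out_ptr));         \
        }                                                                   \
    } while (0)

// Call site: one line instead of the hand-written two-phase sequence.
int run_fake_op(int scale) {
    int status = 0;
    int result = -1;
    EXEC_FAKE_CMD(fakeOp, status, &result, scale);
    return status == 0 ? result : -1;
}
```

Calling `run_fake_op(21)` goes through both phases and returns the doubled value. The point of the pattern is that the call site names the operator once and the macro derives the companion function, which keeps each operator binding short.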
1
.github/workflows/release_whl.yml
vendored
@@ -98,6 +98,7 @@ jobs:
--exclude libc_sec.so \
--exclude "libascend*.so" \
--exclude "libtorch*.so" \
--exclude "libopapi.so" \
--exclude "liberror_manager.so"
done
rm -f dist/*.whl