### What this PR does / why we need it?
1. Add implementations of the normal Aclnn operators: MoeCombineNormal,
MoeDispatchNormal, NotifyDispatch, and DispatchLayout.
- MoeCombineNormal: Implements the combine logic within MoE operations.
- MoeDispatchNormal: Implements the dispatch logic within MoE
operations.
- NotifyDispatch: Exchanges topk_idx information among different ranks
to calculate the device memory required for the dispatch stage.
- DispatchLayout: Calculates the device memory layout information for
the dispatch stage.
2. Provide PyTorch interfaces for the normal operators
(get_dispatch_layout, dispatch_prefill, and combine_prefill) for MoE
communication during the prefill stage in vLLM.
- get_dispatch_layout: Calculates the device memory layout information
for the dispatch operator. It is called before dispatch_prefill.
- dispatch_prefill: Initiates the dispatch operation.
- combine_prefill: Initiates the combine operation.
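
To give a rough idea of what a dispatch-layout step might compute, the pure-Python sketch below counts, from a `topk_idx` table, how many tokens go to each expert and to each rank. The function name `sketch_dispatch_layout`, its parameters, the even and contiguous expert-to-rank mapping, the use of `-1` for an unused top-k slot, and the returned fields are all assumptions made for illustration. They are not the signature or the behavior of `get_dispatch_layout` in this PR.

```python
from typing import List, Tuple


def sketch_dispatch_layout(
    topk_idx: List[List[int]],
    num_experts: int,
    num_ranks: int,
) -> Tuple[List[int], List[int], List[List[bool]]]:
    """Hypothetical layout calculation (not the PR's actual API).

    Assumes experts are split evenly and contiguously across ranks, and
    that an index of -1 marks an unused top-k slot.
    """
    if num_experts % num_ranks != 0:
        raise ValueError("num_experts must be divisible by num_ranks")
    experts_per_rank = num_experts // num_ranks

    tokens_per_expert = [0] * num_experts
    tokens_per_rank = [0] * num_ranks
    is_token_in_rank = [[False] * num_ranks for _ in topk_idx]

    for token, experts in enumerate(topk_idx):
        for expert in experts:
            if expert < 0:
                continue  # unused slot (assumed convention)
            if expert >= num_experts:
                raise ValueError(f"expert index {expert} out of range")
            tokens_per_expert[expert] += 1
            is_token_in_rank[token][expert // experts_per_rank] = True
        # A token is counted once per rank, even if several of its
        # selected experts live on that rank.
        for rank in range(num_ranks):
            if is_token_in_rank[token][rank]:
                tokens_per_rank[rank] += 1

    return tokens_per_rank, tokens_per_expert, is_token_in_rank


if __name__ == "__main__":
    # Three tokens, top-2 routing, four experts spread over two ranks.
    per_rank, per_expert, in_rank = sketch_dispatch_layout(
        [[0, 1], [2, 3], [1, -1]], num_experts=4, num_ranks=2
    )
    print(per_rank, per_expert, in_rank)
```

Counts of this kind are what a rank would need in order to size its send and receive buffers before dispatching, which matches the ordering stated above: the layout is computed before dispatch_prefill is called.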
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
The functionality has already been validated locally using a Qwen model.
Test cases will be added after support for multi-NPU use cases in the CI
pipeline is finalized.
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
Signed-off-by: shiro-zzzz <zhangdianhao@huawei.com>
50 lines
1.5 KiB
CMake
# Copyright (c) 2025 Huawei Technologies Co., Ltd.
# This file is a part of the CANN Open Software.
# Licensed under CANN Open Software License Agreement Version 1.0 (the "License").
# Please refer to the License for details. You may not use this file except in compliance with the License.
# THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
# See LICENSE in the root of the software repository for the full text of the License.
# ======================================================================================================================

add_ops_compile_options(
    OP_NAME NotifyDispatch
    OPTIONS --cce-auto-sync=off
            -Wno-deprecated-declarations
            -Werror
)

target_sources(op_host_aclnnInner PRIVATE
    notify_dispatch.cpp
)

target_sources(opapi PRIVATE
    aclnn_notify_dispatch.cpp
)

if (NOT BUILD_OPEN_PROJECT)
    target_sources(aclnn_ops_train PRIVATE
        aclnn_notify_dispatch.cpp
    )

    target_sources(aclnn_ops_infer PRIVATE
        aclnn_notify_dispatch.cpp
    )
endif ()

target_sources(optiling PRIVATE
    notify_dispatch_tiling.cpp
)

target_include_directories(optiling PRIVATE
    ${CMAKE_CURRENT_SOURCE_DIR}
)

target_sources(opsproto PRIVATE)

file(GLOB _GMM_Aclnn_header "${CMAKE_CURRENT_SOURCE_DIR}/aclnn_notify_dispatch.h")

install(FILES ${_GMM_Aclnn_header}
    DESTINATION ${ACLNN_INC_INSTALL_DIR} OPTIONAL
)