[Kernel] add custom moe ops for prefill (#4194)
### What this PR does / why we need it?
1. Add the implementation of the normal Aclnn operators: MoeCombineNormal,
MoeDispatchNormal, NotifyDispatch, and DispatchLayout.
- MoeCombineNormal: Implements the combine logic within MoE operations.
- MoeDispatchNormal: Implements the dispatch logic within MoE
operations.
- NotifyDispatch: Exchanges topk_idx information among different ranks
to calculate the device memory required for the dispatch stage.
- DispatchLayout: Used to calculate information related to the device
memory layout for the dispatch stage.
2. Provide PyTorch interfaces for the normal operators (get_dispatch_layout,
dispatch_prefill, and combine_prefill) to be used for MoE communication
during the prefill stage in vLLM.
- get_dispatch_layout: Calculates information related to the device
memory layout for the dispatch operator, and is called before
dispatch_prefill.
- dispatch_prefill: Initiates the dispatch operation.
- combine_prefill: Initiates the combine operation.
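
To make the role of the layout step more concrete, here is a minimal host-side C++ sketch of the kind of bookkeeping a dispatch layout calculation might perform: counting, from `topk_idx`, how many tokens select each expert and how many distinct tokens each rank will receive. It is not the operator added by this PR. The names `DispatchLayoutSketch` and `compute_layout`, the row-major `[num_tokens x top_k]` layout, the use of a negative index for "no expert", and the even, contiguous split of experts across ranks are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type; the field names are illustrative only.
struct DispatchLayoutSketch {
    std::vector<int> tokens_per_expert;  // number of tokens that selected each expert
    std::vector<int> tokens_per_rank;    // number of distinct tokens sent to each rank
};

// topk_idx is assumed row-major with shape [num_tokens x top_k].
// Assumption: experts are split evenly and contiguously across ranks,
// and out-of-range entries (e.g. -1) mean "no expert selected".
DispatchLayoutSketch compute_layout(const std::vector<int64_t> &topk_idx, int top_k,
                                    int num_experts, int num_ranks)
{
    DispatchLayoutSketch out;
    out.tokens_per_expert.assign(num_experts, 0);
    out.tokens_per_rank.assign(num_ranks, 0);

    const int experts_per_rank = num_experts / num_ranks;
    const std::size_t num_tokens = topk_idx.size() / static_cast<std::size_t>(top_k);
    std::vector<char> seen(num_ranks, 0);

    for (std::size_t t = 0; t < num_tokens; ++t) {
        std::fill(seen.begin(), seen.end(), 0);
        for (int k = 0; k < top_k; ++k) {
            const int64_t e = topk_idx[t * top_k + k];
            if (e < 0 || e >= num_experts) {
                continue;  // skip padding or invalid entries
            }
            ++out.tokens_per_expert[e];
            seen[e / experts_per_rank] = 1;  // a token is sent to a rank at most once
        }
        for (int r = 0; r < num_ranks; ++r) {
            out.tokens_per_rank[r] += seen[r];
        }
    }
    return out;
}
```

Counts like these are what would let each rank size its receive buffers before the dispatch itself runs, which is why a layout step would naturally come before `dispatch_prefill`.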
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
The functionality has been validated locally with a Qwen model.
Test cases will be added once the CI pipeline supports multi-NPU use
cases.
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
Signed-off-by: shiro-zzzz <zhangdianhao@huawei.com>
24
csrc/utils.h
@@ -28,4 +28,28 @@
        return PyModule_Create(&module); \
    }

class TrochBindException : public std::exception
{
private:
    std::string message = {};

public:
    explicit TrochBindException(const char *name, const char *file, const int line, const std::string &error)
    {
        message = std::string("Failed: ") + name + " error " + file + ":" + std::to_string(line) +
                  " error message or error code is '" + error + "'";
    }

    const char *what() const noexcept override
    {
        return message.c_str();
    }
};

#define TORCH_BIND_ASSERT(cond) \
    ; \
    do { \
        if (not(cond)) { \
            throw TrochBindException("Assertion", __FILE__, __LINE__, #cond); \
        } \
    } while (0)
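
The following self-contained C++ sketch illustrates how an exception type and assertion macro of this shape might be used from host code. It does not include `csrc/utils.h`. `SketchBindException`, `SKETCH_BIND_ASSERT`, and `check_top_k` are hypothetical stand-ins that mirror the definitions in the diff. One deliberate difference: the sketch omits the `;` that precedes `do` in `TORCH_BIND_ASSERT`, so that the macro expands to a single statement and can sit in an unbraced `if`/`else`.

```cpp
#include <exception>
#include <string>

// Hypothetical stand-in mirroring the exception class in the diff.
class SketchBindException : public std::exception
{
public:
    SketchBindException(const char *name, const char *file, int line, const std::string &error)
        : message_(std::string("Failed: ") + name + " error " + file + ":" + std::to_string(line) +
                   " error message or error code is '" + error + "'")
    {
    }

    const char *what() const noexcept override
    {
        return message_.c_str();
    }

private:
    std::string message_;
};

// do { ... } while (0) makes the macro behave like one statement;
// #cond stringifies the failed condition for the error message.
#define SKETCH_BIND_ASSERT(cond)                                               \
    do {                                                                       \
        if (!(cond)) {                                                         \
            throw SketchBindException("Assertion", __FILE__, __LINE__, #cond); \
        }                                                                      \
    } while (0)

// Hypothetical caller: validates an argument and reports the failure text.
std::string check_top_k(int top_k, int num_experts)
{
    try {
        if (top_k > 0)
            SKETCH_BIND_ASSERT(top_k <= num_experts);  // legal in an unbraced if/else
        else
            return "top_k must be positive";
        return "ok";
    } catch (const std::exception &e) {
        return e.what();
    }
}
```

Because the exception derives from `std::exception`, a caller that only knows the standard base class can still read the message through `what()`, including the file, line, and stringified condition.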