[CustomOp] Register VocabParallelEmbedding instead of overwrite forward (#2515)
### What this PR does / why we need it?
Register VocabParallelEmbedding instead of overwrite forward
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with new added/existing test.
- vLLM version: v0.10.1.1
- vLLM main:
644d57d531
---------
Signed-off-by: Icey <1790571317@qq.com>
This commit is contained in:
@@ -78,7 +78,7 @@ class VLLMAscendQuantizer:
|
||||
"vllm_ascend.ops.layernorm.AscendRMSNorm", "forward_oot",
|
||||
[wrapper_rmsnorm_forward_oot])
|
||||
VLLMAscendQuantizer.apply_patch(
|
||||
"vllm.model_executor.layers.vocab_parallel_embedding.VocabParallelEmbedding",
|
||||
"vllm_ascend.ops.vocab_parallel_embedding.AscendVocabParallelEmbedding",
|
||||
"__init__", [wrapper_vocab_parallel_embedding_init])
|
||||
break
|
||||
VLLMAscendQuantizer.patched = True
|
||||
|
||||
Reference in New Issue
Block a user