[CustomOp] Register VocabParallelEmbedding instead of overwrite forward (#2515)

### What this PR does / why we need it?
Register `VocabParallelEmbedding` through the CustomOp mechanism instead of overwriting its `forward` method.
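
The difference between the two approaches can be sketched as follows. This is a minimal, self-contained illustration with hypothetical names (`register_custom_op`, `resolve_op`, the toy registry); the real mechanism lives in vLLM's CustomOp registry and `vllm_ascend.ops.vocab_parallel_embedding`, and may differ in detail. The point is that a platform subclass is registered once under the op's name, rather than monkey-patching `forward` on the upstream class at runtime:

```python
# Toy registry standing in for vLLM's CustomOp mechanism (hypothetical names).
_CUSTOM_OP_REGISTRY: dict[str, type] = {}

def register_custom_op(name: str):
    """Class decorator: map an op name to its platform implementation."""
    def wrap(cls: type) -> type:
        _CUSTOM_OP_REGISTRY[name] = cls
        return cls
    return wrap

class VocabParallelEmbedding:
    """Stand-in for the upstream vLLM layer."""
    def forward(self, token_ids):
        return [f"native:{t}" for t in token_ids]

@register_custom_op("VocabParallelEmbedding")
class AscendVocabParallelEmbedding(VocabParallelEmbedding):
    """Platform override: a registered subclass, not a patched forward()."""
    def forward(self, token_ids):
        return [f"ascend:{t}" for t in token_ids]

def resolve_op(name: str) -> type:
    # Fall back to the native class when no override is registered.
    return _CUSTOM_OP_REGISTRY.get(name, VocabParallelEmbedding)

layer = resolve_op("VocabParallelEmbedding")()
print(layer.forward([1, 2]))  # the registered Ascend subclass handles the call
```

Registration keeps the upstream class untouched, so the override composes cleanly with other patches and is trivially discoverable from the registry.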

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with newly added and existing tests.

- vLLM version: v0.10.1.1
- vLLM main: 644d57d531

---------

Signed-off-by: Icey <1790571317@qq.com>
Author: Icey
Date: 2025-08-28 08:57:34 +08:00
Committed by: GitHub
Parent: 516e14ae6a
Commit: c578f817ca
5 changed files with 122 additions and 241 deletions


```diff
@@ -78,7 +78,7 @@ class VLLMAscendQuantizer:
                 "vllm_ascend.ops.layernorm.AscendRMSNorm", "forward_oot",
                 [wrapper_rmsnorm_forward_oot])
             VLLMAscendQuantizer.apply_patch(
-                "vllm.model_executor.layers.vocab_parallel_embedding.VocabParallelEmbedding",
+                "vllm_ascend.ops.vocab_parallel_embedding.AscendVocabParallelEmbedding",
                 "__init__", [wrapper_vocab_parallel_embedding_init])
             break
         VLLMAscendQuantizer.patched = True
```
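
For context, a helper with the shape `apply_patch(dotted_path, method_name, wrappers)` seen in the hunk above could look like the following sketch. This is an assumption about its behavior, not the actual `VLLMAscendQuantizer` implementation (the `fake_layers` module and `log_init` wrapper below are invented for the demo): resolve the dotted class path, then fold the wrapper list over the named method.

```python
import importlib
import sys
import types

def apply_patch(target_path: str, method_name: str, wrappers) -> None:
    """Wrap `method_name` on the class at `target_path` with each wrapper in order."""
    module_path, cls_name = target_path.rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), cls_name)
    fn = getattr(cls, method_name)
    for wrapper in wrappers:
        fn = wrapper(fn)
    setattr(cls, method_name, fn)

# --- demo with a fake module so the sketch is runnable on its own ---
mod = types.ModuleType("fake_layers")

class Demo:
    def __init__(self):
        self.tag = "base"

mod.Demo = Demo
sys.modules["fake_layers"] = mod

def log_init(orig):
    """Example wrapper: run the original __init__, then mark the instance."""
    def wrapped(self, *args, **kwargs):
        orig(self, *args, **kwargs)
        self.tag = "patched"
    return wrapped

apply_patch("fake_layers.Demo", "__init__", [log_init])
```

After the call, constructing `Demo` runs the wrapped initializer, which is the same mechanism the hunk uses to wrap `AscendVocabParallelEmbedding.__init__` with `wrapper_vocab_parallel_embedding_init`.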