[Patch] Remove the patch of MiniCPM (#5975)
### What this PR does / why we need it?
Part of #5304.
After https://github.com/vllm-project/vllm/pull/32523 was merged, we can
remove the patch of `MiniCPMAttention`.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Tested locally.
- vLLM version: v0.13.0
- vLLM main: 2c24bc6996
---------
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
```diff
@@ -112,20 +112,6 @@
 # Remove this patch when the refactor of all2all manager is done.
 # Remove this patch when vLLM support all_reduce as customop.
 #
-# ** 2. File: worker/patch_minicpm.py **
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# 1. `vllm.model_executor.models.minicpm.MiniCPMAttention.forward`
-# Why:
-# The forward func of MiniCPMAttention in vllm do a datatype convert
-# (original datatype --> float32) to ensure the precision on cuda.
-# However float32 is not supported in cann rope op, thus we keep this patch
-# How:
-# Removed the dtype convert operations in forward
-# Related PR (if no, explain why):
-# NO, only for npu due to rope op.
-# Future Plan:
-# Keep this patch in vllm-ascend.
-#
 # ** 3. File: worker/patch_multimodal_merge.py**
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 # 1. `vllm.model_executor.models.utils._merge_multimodal_embeddings`
```
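For context, the pattern the removed patch dealt with is a dtype round-trip around the rotary-embedding op: upstream `MiniCPMAttention.forward` upcasts query/key to float32 before RoPE for precision on CUDA, then casts back, while the CANN rope op rejects float32 inputs. The following is a minimal sketch of that pattern with a hypothetical helper and an identity stand-in for the rope op (not vLLM's actual code), using NumPy so it runs standalone:

```python
import numpy as np

def apply_rope_with_cast(q, k, rope):
    # Hypothetical helper illustrating the upstream CUDA pattern:
    # upcast q/k to float32 before the rotary-embedding op, then
    # cast the results back to the original dtype. The vllm-ascend
    # patch removed this cast because the CANN rope op does not
    # support float32.
    orig_dtype = q.dtype
    q32, k32 = q.astype(np.float32), k.astype(np.float32)
    q32, k32 = rope(q32, k32)  # rope runs in float32 here
    return q32.astype(orig_dtype), k32.astype(orig_dtype)

# Identity stand-in for the rotary-embedding op so the sketch runs
# without vLLM installed.
identity_rope = lambda q, k: (q, k)

q = np.ones((2, 4), dtype=np.float16)
k = np.ones((2, 4), dtype=np.float16)
q_out, k_out = apply_rope_with_cast(q, k, identity_rope)
print(q_out.dtype)  # float16: the caller never sees the fp32 round-trip
```

With the upstream refactor in vllm-project/vllm#32523, this cast no longer needs to be monkey-patched out on the Ascend side.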