[BugFix] Fix precision issue for LoRA feature (#4141)

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm

### What this PR does / why we need it?
Fix the precision issue of the LoRA feature in vllm-ascend by adding the custom bgmv/sgmv LoRA kernels to the Ascend custom-op exclude list (see the CMake change below).

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
```bash
pytest tests/lora/test_llama_tp.py::test_llama_lora -s
```
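For background, the bgmv/sgmv kernels named in the diff implement the batched LoRA "shrink" (project activations down to the LoRA rank) and "expand" (project back up to the hidden size) matmuls. A minimal NumPy sketch of that reference computation (illustrative only; the array names `lora_a`/`lora_b` are hypothetical and this is not vllm-ascend's kernel code):

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, hidden, rank = 4, 16, 8

x = rng.standard_normal((num_tokens, hidden), dtype=np.float32)
lora_a = rng.standard_normal((hidden, rank), dtype=np.float32)  # shrink weight
lora_b = rng.standard_normal((rank, hidden), dtype=np.float32)  # expand weight

shrunk = x @ lora_a      # "shrink" step: project to the low LoRA rank
delta = shrunk @ lora_b  # "expand" step: project back to hidden size
out = x + delta          # LoRA delta added to the base layer output
print(out.shape)         # (4, 16)
```

A precision bug in a custom implementation of these two matmuls would show up as `delta` drifting from this reference result, which is what the LoRA correctness test above checks against expected generations.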
<img width="1319" height="879" alt="lora_test"
src="https://github.com/user-attachments/assets/2a0b2325-5b05-4bbc-ac03-a7c9f0ad9d4c"
/>


- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: hukongyi <hukongyi@cmbchina.com>
Author: hukongyi
Date: 2025-12-19 14:22:06 +08:00 (committed by GitHub)
Parent: f952de93df
Commit: ea8f544ce7
6 changed files with 17 additions and 12 deletions


```diff
@@ -62,6 +62,10 @@ set(VLLM_ASCEND_CUSTOM_OP
 )
 set(VLLM_ASCEND_CUSTOM_OP_EXCLUDE
+    ${KERNEL_FILES}/bgmv_expand.cpp
+    ${KERNEL_FILES}/bgmv_shrink.cpp
+    ${KERNEL_FILES}/sgmv_expand.cpp
+    ${KERNEL_FILES}/sgmv_shrink.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/csrc/batch_matmul_transpose/op_kernel/batch_matmul_transpose_kernel.cpp
 )
```