[Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py (#3769)

This PR follows up on #3189: it continues refactoring sfa into mla and finally removes deepseek_v3_2.py. It is the last PR of the DeepSeek modeling refactor; after it, all DeepSeek-related model code is removed from vllm_ascend. Furthermore, after this PR DeepSeek V3.2 can run chunked prefill with correct accuracy.

- vLLM version: v0.11.0rc3
- vLLM main: 83f478bb19

---------

Signed-off-by: whx-sjtu <2952154980@qq.com>
2025-10-30 17:06:38 +08:00
parent eff3e5fc6f
commit f6149f3894
10 changed files with 751 additions and 1935 deletions
--- a/vllm_ascend/patch/worker/__init__.py
+++ b/vllm_ascend/patch/worker/__init__.py
@@ -33,3 +33,4 @@ from vllm_ascend.utils import vllm_version_is

 if vllm_version_is("0.11.0"):
    import vllm_ascend.patch.worker.patch_deepseek_mtp  # noqa
+    import vllm_ascend.patch.worker.patch_deepseek_v3_2  # noqa
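
The hunk above registers the new patch module only when the installed vLLM version matches. A minimal sketch of that version-gating pattern follows. It is an illustration, not the vllm_ascend implementation: `version_is` and `select_patches` are hypothetical stand-ins, the exact-match comparison is an assumption about how `vllm_version_is` behaves, and the patch modules are represented by their names rather than actually imported.

```python
from typing import List


def version_is(target: str, installed: str) -> bool:
    """Hypothetical stand-in for vllm_version_is: exact string match (assumed)."""
    return installed == target


def select_patches(installed: str) -> List[str]:
    """Return the names of patch modules that would be imported for a version.

    In the real package the modules are imported for their side effects;
    here we only collect names so the gating logic can be checked in isolation.
    """
    patches: List[str] = []
    if version_is("0.11.0", installed):
        patches.append("patch_deepseek_mtp")
        patches.append("patch_deepseek_v3_2")  # the module added by this hunk
    return patches


if __name__ == "__main__":
    print(select_patches("0.11.0"))
    print(select_patches("0.12.0"))
```

Keeping the import inside the version check means other vLLM versions never load the patch, so it cannot monkey-patch code it was not written against.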