xc-llm-ascend

Files

whx f6149f3894 [Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py (#3769)

This is the follow-up to PR #3189. It continues refactoring sfa into
mla and removes deepseek_v3_2.py. This is the last PR of the deepseek
modeling refactoring. After this, all deepseek-related model code is
removed from vllm_ascend.

Furthermore, after this PR, deepseek v3.2 can run chunked prefill with
correct accuracy.

- vLLM version: v0.11.0rc3
- vLLM main: 83f478bb19

---------

Signed-off-by: whx-sjtu <2952154980@qq.com>

2025-10-30 17:06:38 +08:00

platform

Upgrade to new vllm commit (#3719)

2025-10-25 15:36:32 +08:00

worker

[Model][3/N] Refactor sfa into mla and remove deepseek_v3_2.py (#3769)

2025-10-30 17:06:38 +08:00

__init__.py

[BugFix][main] Fix quantization related mtp bug with patch (#3620)

2025-10-23 09:54:31 +08:00