[Refactor]3/N Refactor mla_v1.py & extract mla_cp (#4933)

RFC: https://github.com/vllm-project/vllm-ascend/issues/4629
Reason:
The functions related to Cp differ significantly from those of normal
MLA-Attention, but the coupling is quite severe.

Steps:
Isolate PCP and DCP
(1) create a new python file: mla_cp.py
(2) add classes AscendMlaCPImpl and
AscendMlaCPMetadataBuilder,Inheritance AscendMLAImpl and
AscendMLAMetadataBuilder
(3) Remove PCP and DCP-related methods from mla_v1.py to mla_cp.py

vLLM version: v0.12.0

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: wujinyuan1 <wjy9595@qq.com>
Co-authored-by: wujinyuan1 <wjy9595@qq.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>
This commit is contained in:
wujinyuan1
2025-12-15 12:59:18 +08:00
committed by GitHub
parent 98b9e2e18e
commit 545e856971
3 changed files with 1359 additions and 719 deletions

View File

@@ -912,7 +912,6 @@ class TestAscendMLAImpl(TestBase):
self.assertIsNotNone(self.impl.kv_a_proj_with_mqa)
self.assertIsNotNone(self.impl.kv_a_layernorm)
self.assertEqual(self.impl.num_queries_per_kv, 32)
self.assertEqual(self.impl.tp_size, 2)
def test_q_proj_and_k_up_proj(self):
batch_size = 4

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff