[v0.11.0][refactor] refactor SequenceRowParallelOp forward (#3654)

### What this PR does / why we need it?
This PR refactors the forward pass of SequenceRowParallelOp. To widen the set of operations that can be covered in dynamic-judgment scenarios, it wraps the entire matmul computation and communication in a single custom masking operator. With this refactor, logic such as common operation fusion can be written directly in the `matmul_and_reduce` member function of the SequenceRowParallelOp class, without registering additional redundant custom masking operators.

### How was this patch tested?
CI passed with newly added and existing tests.

Signed-off-by: rjg-lyh <1318825571@qq.com>
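The single-operator design described above can be illustrated with a minimal, dependency-free sketch. It is not the code from this PR: only the names `SequenceRowParallelOp` and `matmul_and_reduce` come from the description, while the registry, the layer lookup table, and the list-based "tensors" are hypothetical stand-ins chosen to keep the example runnable without torch or an NPU.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins (assumptions, not vllm-ascend APIs):
OP_REGISTRY: Dict[str, Callable] = {}                  # plays the role of a custom-op registry
LAYERS: Dict[str, "SequenceRowParallelOpSketch"] = {}  # lets the op find its layer by name


def register_op(name: str):
    """Register a function under `name`, mimicking one-time custom-op registration."""
    def decorator(fn: Callable) -> Callable:
        OP_REGISTRY[name] = fn
        return fn
    return decorator


class SequenceRowParallelOpSketch:
    """Toy row-parallel layer: each simulated rank holds one weight shard."""

    def __init__(self, name: str, weight_shards: List[List[float]]):
        self.name = name
        self.weight_shards = weight_shards
        LAYERS[name] = self

    def matmul_and_reduce(self, x_shards: List[List[float]]) -> float:
        # Per-rank "matmul" (a dot product here), followed by a sum across
        # ranks that stands in for the reduce communication. Extra fused
        # steps could be added in this method without registering a new op.
        partials = [
            sum(x * w for x, w in zip(x_shard, w_shard))
            for x_shard, w_shard in zip(x_shards, self.weight_shards)
        ]
        return sum(partials)

    def forward(self, x_shards: List[List[float]]) -> float:
        # forward only calls the single registered operator.
        return OP_REGISTRY["matmul_and_reduce"](x_shards, self.name)


@register_op("matmul_and_reduce")
def _matmul_and_reduce_op(x_shards: List[List[float]], layer_name: str) -> float:
    # The one registered op delegates to the layer's member function.
    return LAYERS[layer_name].matmul_and_reduce(x_shards)


layer = SequenceRowParallelOpSketch("o_proj", [[1.0, 2.0], [3.0, 4.0]])
result = layer.forward([[1.0, 1.0], [1.0, 1.0]])
```

In this sketch the registered operator is a thin dispatcher, so changing what happens between the matmul and the reduction means editing one method rather than adding another registered operator.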
2025-10-23 14:45:49 +08:00
parent 54bd531db8
commit 74903af460
4 changed files with 56 additions and 4 deletions
--- a/tests/ut/ops/test_linear.py
+++ b/tests/ut/ops/test_linear.py
@@ -4,6 +4,7 @@ from unittest import mock
 from unittest.mock import MagicMock, patch

 import torch
+from vllm import config

 from tests.ut.base import TestBase
 from vllm_ascend import ascend_config
@@ -106,6 +107,9 @@ class TestAscendRowParallelLinear(BaseLinearTest):
        linear(input_tensor)

    def test_oproj_tp(self):
+
+        config._current_vllm_config = MagicMock()
+
        ascend_config._ASCEND_CONFIG = MagicMock()
        ascend_config._ASCEND_CONFIG.oproj_tensor_parallel_size = 2