[Bugfix][LoRA] Fix the bug when running Qwen3-Reranker-0.6B with LoRA. (#7156)

### What this PR does / why we need it?

Fix the error reported while initializing the qwen3-reranker-0.6b model with `--enable-lora`, and add a test case to verify the fix.

- vLLM version: v0.17.0
- vLLM main: 4034c3d32e

---------

Signed-off-by: paulyu12 <507435917@qq.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2026-03-15 17:55:42 +08:00
parent 7daccf4b64
commit 29f195a91c
6 changed files with 108 additions and 4 deletions
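
The hunk below changes `get_replicated_op` from returning a single value to returning a triple that is unpacked into `self.custom_op`, `self.tp_rank`, and `self.tp_size`. A minimal sketch of that pattern follows. It is an illustration only: `get_replicated_op_sketch` and `ReplicatedLinearSketch` are hypothetical stand-ins, and the returned values (no custom op, rank 0, world size 1) are assumptions, not the behavior of the real helper in `vllm_ascend`.

```python
from typing import Optional, Tuple


def get_replicated_op_sketch(
    disable_tp: bool, prefix: str, layer: object
) -> Tuple[Optional[object], int, int]:
    """Hypothetical stand-in for the helper used in the diff.

    Returns (custom_op, tp_rank, tp_size). The values here are assumed:
    no custom op, and a single-rank layout (rank 0 of world size 1).
    """
    return None, 0, 1


class ReplicatedLinearSketch:
    """Hypothetical layer that stores all three values as attributes."""

    def __init__(self, prefix: str = "", disable_tp: bool = False):
        # Unpack the triple so that tp_rank and tp_size exist on the
        # instance for any later code that reads them.
        self.custom_op, self.tp_rank, self.tp_size = get_replicated_op_sketch(
            disable_tp, prefix, self
        )


layer = ReplicatedLinearSketch(prefix="score")
print(layer.custom_op, layer.tp_rank, layer.tp_size)
```

With the attributes set in `__init__`, code that later reads `tp_rank` or `tp_size` from the layer finds them present on the instance.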
--- a/vllm_ascend/ops/linear.py
+++ b/vllm_ascend/ops/linear.py
@@ -433,7 +433,7 @@ class AscendReplicatedLinear(ReplicatedLinear):
        return_bias: bool = True,
        disable_tp: bool = False,
    ):
-        self.custom_op = get_replicated_op(disable_tp, prefix, self)
+        self.custom_op, self.tp_rank, self.tp_size = get_replicated_op(disable_tp, prefix, self)
        # If MergedReplicatedLinear, use output size of each partition.
        if hasattr(self, "output_sizes"):
            self.output_partition_sizes = self.output_sizes