[Bugfix][LoRA] Fix the issue when enable LoRA + tp + fully_sharded_loras (#6650)
### What this PR does / why we need it?
Fixes issue #6143.
### Does this PR introduce _any_ user-facing change?
Allows starting the server with `--enable-lora`, `--fully-sharded-loras`, and
`--tensor_parallel_size 2` together.
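A minimal launch sketch of the now-supported flag combination. The model name and LoRA adapter path below are placeholders, not taken from this PR; substitute your own:

```shell
# Hypothetical example: serve a base model with a fully sharded LoRA adapter
# across 2 devices. Model name and adapter path are placeholders.
vllm serve meta-llama/Llama-3.2-1B-Instruct \
    --enable-lora \
    --fully-sharded-loras \
    --tensor-parallel-size 2 \
    --lora-modules my-lora=/path/to/lora_adapter
```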
### How was this patch tested?
pytest -sv tests/e2e/multicard/2-cards/test_llama32_lora_tp2.py
- vLLM version: v0.15.0
- vLLM main: d7e17aaacd
---------
Signed-off-by: paulyu12 <507435917@qq.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Changed files:
- .github/workflows/scripts/config.yaml (vendored, 2 lines changed)
@@ -97,6 +97,8 @@ e2e-multicard-2-cards:
       estimated_time: 400
     - name: tests/e2e/multicard/2-cards/test_ilama_lora_tp2.py
       estimated_time: 60
+    - name: tests/e2e/multicard/2-cards/test_llama32_lora_tp2.py
+      estimated_time: 223
     # Run the test in a separate step to avoid oom
     - name: tests/e2e/multicard/2-cards/test_offline_inference_distributed.py::test_deepseek_multistream_moe_tp2
       estimated_time: 100