[Feature] Support Tensor Parallelism and Weight Slicing for Lora (#4274)

Co-authored-by: ShenAo1111 <1377693092@qq.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
aoshen524
2025-03-18 23:33:07 -04:00
committed by GitHub
parent 3196999f63
commit 588865f0e0
13 changed files with 528 additions and 103 deletions
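
For readers unfamiliar with the idea in the title, the following is a minimal sketch of how LoRA weights can be sliced under tensor parallelism. It is not taken from this commit. It assumes a column-parallel base layer whose output dimension is split evenly across ranks, in which case each rank keeps the full LoRA A matrix and only its own rows of B. The helper names (`matvec`, `slice_lora_b`, `lora_delta`) and the toy matrices are hypothetical, and plain Python lists stand in for tensors.

```python
# Illustrative sketch only; names and shapes are assumptions, not the commit's code.

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def slice_lora_b(lora_b, tp_rank, tp_size):
    """Return the rows of B owned by `tp_rank` for a column-parallel layer."""
    out_dim = len(lora_b)
    assert out_dim % tp_size == 0, "output dim must divide evenly across ranks"
    shard = out_dim // tp_size
    return lora_b[tp_rank * shard:(tp_rank + 1) * shard]


def lora_delta(lora_a, lora_b, x):
    """Compute the LoRA update B @ (A @ x)."""
    return matvec(lora_b, matvec(lora_a, x))


# Toy adapter: rank r=2, input dim 3, output dim 4.
lora_a = [[1.0, 0.0, 2.0],
          [0.0, 1.0, -1.0]]
lora_b = [[1.0, 2.0],
          [0.0, 1.0],
          [3.0, 0.0],
          [-1.0, 1.0]]
x = [1.0, 2.0, 3.0]

# Reference result on a single device.
full = lora_delta(lora_a, lora_b, x)

# Each rank computes its output shard from full A and its slice of B;
# concatenating the shards (an all-gather in practice) recovers the full result.
tp_size = 2
gathered = []
for rank in range(tp_size):
    b_shard = slice_lora_b(lora_b, rank, tp_size)
    gathered.extend(lora_delta(lora_a, b_shard, x))

print(full == gathered)
```

A row-parallel layer would need the complementary treatment (slicing A along the input dimension and summing partial results across ranks), which this sketch does not cover.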

@@ -127,6 +127,12 @@ jobs:
           cd test/srt
           python3 test_mla_tp.py
 
+      - name: Test lora tensor parallelism (TP=2)
+        timeout-minutes: 10
+        run: |
+          cd test/srt/models/lora
+          python3 test_lora_tp.py
+
   performance-test-1-gpu-part-1:
     if: (github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request') &&
       github.event.pull_request.draft == false