[Feature] Support Tensor Parallelism and Weight Slicing for Lora (#4274)
Co-authored-by: ShenAo1111 <1377693092@qq.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
6
.github/workflows/pr-test.yml
vendored
@@ -127,6 +127,12 @@ jobs:
           cd test/srt
           python3 test_mla_tp.py
 
+      - name: Test lora tensor parallelism (TP=2)
+        timeout-minutes: 10
+        run: |
+          cd test/srt/models/lora
+          python3 test_lora_tp.py
+
   performance-test-1-gpu-part-1:
     if: (github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request') &&
         github.event.pull_request.draft == false