bugfix(MC2): refactor the comm group of MC2 to be compatible with PP (#7291)
### What this PR does / why we need it?
This PR refactors the MC2 communication group so that it matches
vLLM's EP group, which makes MC2 compatible with pipeline parallelism (PP).
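
As a rough illustration of why a group that follows the EP group stays valid under PP, the sketch below builds one expert-parallel rank group per pipeline stage, so no group ever spans two stages. The helper `ep_groups` and the rank layout (DP outermost, then PP, then TP) are assumptions made for this example; they are not taken from this PR's code.

```python
from itertools import product


def ep_groups(dp: int, pp: int, tp: int) -> list[list[int]]:
    """Return one expert-parallel rank group per pipeline stage.

    Hypothetical helper: assumes global ranks are laid out with DP
    outermost, then PP, then TP (an illustrative assumption).
    """
    groups = []
    for stage in range(pp):
        # Collect every (dp, tp) rank that lives in this pipeline stage.
        ranks = [
            (d * pp + stage) * tp + t
            for d, t in product(range(dp), range(tp))
        ]
        groups.append(ranks)
    return groups


if __name__ == "__main__":
    # 8 ranks split into 2 pipeline stages: each stage gets its own group.
    for stage, ranks in enumerate(ep_groups(dp=2, pp=2, tp=2)):
        print(f"stage {stage}: {ranks}")
```

With `pp=1` the function returns a single group covering every rank, which is the case a PP-unaware group would implicitly assume.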
- vLLM version: v0.17.0
- vLLM main: 4034c3d32e
---------
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
4
.github/workflows/scripts/config.yaml
vendored
@@ -127,8 +127,6 @@ e2e-multicard-2-cards:
     estimated_time: 180
   - name: tests/e2e/multicard/2-cards/test_offline_inference_distributed.py::test_qwen3_w4a4_distributed_tp2
     estimated_time: 202
-  - name: tests/e2e/multicard/2-cards/test_pipeline_parallel.py
-    estimated_time: 357
   - name: tests/e2e/multicard/2-cards/test_prefix_caching.py
     estimated_time: 470
   - name: tests/e2e/multicard/2-cards/test_quantization.py
@@ -165,3 +163,5 @@ e2e-multicard-4-cards:
     is_skipped: true
   - name: tests/e2e/multicard/4-cards/spec_decode/test_mtp_qwen3_next.py
     estimated_time: 1340
+  - name: tests/e2e/multicard/4-cards/test_pipeline_parallel.py
+    estimated_time: 357