LLMdatadist connector adapt the distributed KV aggregation (#2718)
### What this PR does / why we need it?
Adapt the LLMdatadist connector to distributed KV aggregation on the main
branch. Previously, the P node returned "finish sending" only when TP0
responded. Now it returns "finish sending" as soon as each NPU receives the
signal. The D node sends a finish-receive signal to the corresponding TP
rank of the P node.
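
The per-rank reporting described above can be pictured with a small, self-contained sketch. It is not the connector's actual code: the class name `FinishAggregator`, the method `report`, and the rule that a request counts as sent once every TP rank has reported are all assumptions made for illustration.

```python
from collections import defaultdict


class FinishAggregator:
    """Toy model (hypothetical): collect per-rank "finish sending" reports."""

    def __init__(self, tp_size: int) -> None:
        self.tp_size = tp_size
        # request id -> set of TP ranks that have reported "finish sending"
        self._reported: dict[str, set[int]] = defaultdict(set)

    def report(self, tp_rank: int, req_id: str) -> bool:
        """Record one rank's report; return True once all ranks have reported."""
        if not 0 <= tp_rank < self.tp_size:
            raise ValueError(f"tp_rank {tp_rank} out of range for tp_size {self.tp_size}")
        ranks = self._reported[req_id]
        ranks.add(tp_rank)  # a repeated report from the same rank is ignored
        if len(ranks) == self.tp_size:
            del self._reported[req_id]  # request is complete; drop its state
            return True
        return False


# Each of the 4 ranks reports independently instead of waiting on TP0.
agg = FinishAggregator(tp_size=4)
results = [agg.report(rank, "req-0") for rank in range(4)]
print(results)
```

In this sketch no rank is special: the request is marked complete only when the last outstanding rank reports, regardless of the order in which the reports arrive.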
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
gsm8k test
2*A3, 1P 1D
P: dp2 tp8, D: dp4 tp4
P: dp2 tp8, D: dp2 tp8
- vLLM version: main
- vLLM main: cc99baf14d
Signed-off-by: liziyu <liziyu16@huawei.com>
.github/workflows/vllm_ascend_test_pd.yaml
@@ -108,4 +108,5 @@ jobs:
      - name: Run vllm-project/vllm-ascend PD Disaggregation edge test
        run: |
          git config --global --add safe.directory/__w/vllm-ascend/vllm-ascend
          bash tests/e2e/pd_disaggreate/run_edge_case_test.sh