[CI] Refactor multi-node CI (#3487)

### What this PR does / why we need it?
Refactor the multi-node CI test cases. The goal of this PR is to make
multi-node CI cases easier to add: developers can add a new multi-node
cluster model test (including prefill/decode (PD) separation cases)
simply by adding a new YAML configuration file.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Li Wang
2025-10-17 09:04:31 +08:00
committed by GitHub
parent ccb6fb9ec1
commit 4c4a8458a5
18 changed files with 632 additions and 437 deletions

@@ -66,16 +66,16 @@ Install the relevant dependencies. The installation of Go is not required.
 ```shell
 cd Mooncake
-bash dependencies.sh
+bash dependencies.sh -y
 ```
 Install mpi
 ```shell
-apt purge mpich libmpich-dev
-apt purge openmpi-bin
-apt purge openmpi-bin libopenmpi-dev
-apt install mpich libmpich-dev
+apt purge mpich libmpich-dev -y
+apt purge openmpi-bin -y
+apt purge openmpi-bin libopenmpi-dev -y
+apt install mpich libmpich-dev -y
 export CPATH=/usr/lib/aarch64-linux-gnu/mpich/include/:$CPATH
 export CPATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$CPATH
 ```