[CI] Refactor multi-node CI (#3487)
### What this PR does / why we need it?

Refactor the multi-node CI test cases. The goal is to make new multi-node cases easier to add: developers can add a multi-node cluster model test (including prefill-decode (PD) disaggregation) by adding a single YAML configuration file.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
@@ -205,7 +205,7 @@ vllm serve /models/deepseek_r1_w8a8 \
 Run proxy server on the first node:
 ```shell
 cd /vllm-workspace/vllm-ascend/examples/disaggregated_prefill_v1
-python toy_proxy_server.py --host 172.19.32.175 --port 1025 --prefiller-hosts 172.19.241.49 --prefiller-port 20002 --decoder-hosts 172.19.123.51 --decoder-ports 20002
+python load_balance_proxy_server_example.py --host 172.19.32.175 --port 1025 --prefiller-hosts 172.19.241.49 --prefiller-port 20002 --decoder-hosts 172.19.123.51 --decoder-ports 20002
 ```
 
 Verification
@@ -21,6 +21,10 @@ parser.add_argument("--local-device-ids",
                     type=str,
                     required=False,
                     help="local device ids")
+parser.add_argument("--ranktable-path",
+                    type=str,
+                    default="./ranktable.json",
+                    help="output rank table path")
 args = parser.parse_args()
 local_host = args.local_host
 prefill_device_cnt = args.prefill_device_cnt
@@ -130,7 +134,8 @@ ranktable = {
 }
 
 if local_rank == '0':
-    with open("ranktable.json", "w") as f:
+    os.makedirs(os.path.dirname(args.ranktable_path), exist_ok=True)
+    with open(args.ranktable_path, "w") as f:
         json.dump(ranktable, f, indent=4)
 
     print("gen ranktable.json done")
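Review note: the default `./ranktable.json` has the parent directory `.`, so `os.makedirs` succeeds. A bare file name such as `--ranktable-path ranktable.json` yields an empty `os.path.dirname`, and `os.makedirs("")` raises `FileNotFoundError`. The sketch below shows one possible guard. It is not part of this commit; the helper name `write_ranktable` and the sample dictionary are hypothetical and used only for illustration.

```python
import json
import os
import tempfile


def write_ranktable(ranktable: dict, ranktable_path: str) -> str:
    """Write the rank table as JSON, creating parent directories if needed.

    Hypothetical helper for illustration; the script in this commit
    performs these steps inline.
    """
    parent = os.path.dirname(ranktable_path)
    # dirname("ranktable.json") is "", and os.makedirs("") raises
    # FileNotFoundError, so only create a directory when one is named.
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(ranktable_path, "w") as f:
        json.dump(ranktable, f, indent=4)
    return ranktable_path


if __name__ == "__main__":
    # Placeholder content; the real rank table is built earlier in the script.
    sample = {"example": True}
    with tempfile.TemporaryDirectory() as tmp:
        nested = os.path.join(tmp, "out", "ranktable.json")
        print("wrote", write_ranktable(sample, nested))
```

With this guard, nested output paths and bare file names in the current directory are both handled.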