From 76d14f8cb92c73ac75a1d859a088629670de4290 Mon Sep 17 00:00:00 2001
From: Lzhang-hub <57925599+Lzhang-hub@users.noreply.github.com>
Date: Mon, 30 Dec 2024 13:04:38 +0800
Subject: [PATCH] add 2*h20 node serving example for deepseek v3 (#2650)

Co-authored-by: Yineng Zhang
---
 benchmark/deepseek_v3/README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/benchmark/deepseek_v3/README.md b/benchmark/deepseek_v3/README.md
index b876ba133..0343de33b 100644
--- a/benchmark/deepseek_v3/README.md
+++ b/benchmark/deepseek_v3/README.md
@@ -51,6 +51,16 @@ response = client.chat.completions.create(
 )
 print(response)
 ```
+### Example serving with 2 H20*8
+For example, there are two H20 nodes, each with 8 GPUs. The first node's IP is `10.0.0.1`, and the second node's IP is `10.0.0.2`.
+
+```bash
+# node 1
+GLOO_SOCKET_IFNAME=eth0 python -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3 --tp 16 --nccl-init 10.0.0.1:5000 --nnodes 2 --node-rank 0 --trust-remote-code
+
+# node 2
+GLOO_SOCKET_IFNAME=eth0 python -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3 --tp 16 --nccl-init 10.0.0.1:5000 --nnodes 2 --node-rank 1 --trust-remote-code
+```
 
 ## DeepSeek V3 Optimization Plan