Docs/CI: Enable Fake Finish for Docs Only PR (#3350)
New file: docs/references/multi_node.md (+25 lines)
# Run Multi-Node Inference
## Llama 3.1 405B
**Run 405B (fp16) on Two Nodes**
```bash
# replace 172.16.4.52:20000 with the IP address and port of your first node
python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct --tp 16 --dist-init-addr 172.16.4.52:20000 --nnodes 2 --node-rank 0
# on the second node, pass the same --dist-init-addr (the first node's address and port)
python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct --tp 16 --dist-init-addr 172.16.4.52:20000 --nnodes 2 --node-rank 1
```
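Once both nodes are up, the node with rank 0 serves HTTP requests (sglang's default port is 30000). As a minimal client sketch, the snippet below builds a request for the `/generate` endpoint; the host address and `max_new_tokens` value are illustrative, and the final send is left commented out since it needs a live server:

```python
import json
import urllib.request

def build_generate_request(prompt, host="172.16.4.52", port=30000, max_new_tokens=32):
    """Build an HTTP request for sglang's /generate endpoint (default port 30000)."""
    payload = {"text": prompt, "sampling_params": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        f"http://{host}:{port}/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("The capital of France is")
# With the server running, send it with:
# print(urllib.request.urlopen(req).read().decode())
```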
Note that Llama 405B (fp8) can also be launched on a single node.
```bash
python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct-FP8 --tp 8
```
## DeepSeek V3/R1
Please refer to the [DeepSeek documentation](https://docs.sglang.ai/references/deepseek.html#running-examples-on-multi-node) for multi-node examples.