From 2d0045125f0276714682291c5aa5ba6d0ed5265a Mon Sep 17 00:00:00 2001
From: Albert
Date: Tue, 18 Mar 2025 15:07:06 +0800
Subject: [PATCH] Fix the incorrect args in benchmark_and_profiling.md (#4542)

Signed-off-by: Tianyu Zhou
---
 docs/references/benchmark_and_profiling.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/references/benchmark_and_profiling.md b/docs/references/benchmark_and_profiling.md
index b5105724d..37c45457f 100644
--- a/docs/references/benchmark_and_profiling.md
+++ b/docs/references/benchmark_and_profiling.md
@@ -35,7 +35,7 @@
 python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct
 
 # send profiling request from client
-python -m sglang.bench_serving --backend sglang --model-path meta-llama/Llama-3.1-8B-Instruct --num-prompts 10 --sharegpt-output-len 100 --profile
+python -m sglang.bench_serving --backend sglang --model meta-llama/Llama-3.1-8B-Instruct --num-prompts 10 --sharegpt-output-len 100 --profile
 ```
 
 Please make sure that the `SGLANG_TORCH_PROFILER_DIR` should be set at both server and client side, otherwise the trace file cannot be generated correctly . A secure way will be setting `SGLANG_TORCH_PROFILER_DIR` in the `.*rc` file of shell (e.g. `~/.bashrc` for bash shells).
@@ -59,7 +59,7 @@
 For example, when profiling a server,
 
 ```bash
-python -m sglang.bench_serving --backend sglang --model-path meta-llama/Llama-3.1-8B-Instruct --num-prompts 2 --sharegpt-output-len 100 --profile
+python -m sglang.bench_serving --backend sglang --model meta-llama/Llama-3.1-8B-Instruct --num-prompts 2 --sharegpt-output-len 100 --profile
 ```
 
 This command sets the number of prompts to 2 with `--num-prompts` argument and limits the length of output sequences to 100 with `--sharegpt-output-len` argument, which can generate a small trace file for browser to open smoothly.