Fix test and benchmark scripts (#2598)

Lianmin Zheng
2024-12-26 07:56:26 -08:00
committed by GitHub
parent a74d194146
commit dc3bee4815
9 changed files with 27 additions and 21 deletions

@@ -56,6 +56,8 @@ with nvtx.annotate("description", color="color"):
## Other tips
1. You can benchmark a model using dummy weights by only providing the config.json file. This allows for quick testing of model variants without training. To do so, add `--load-format dummy` to the above commands and then you only need a correct `config.json` under the checkpoint folder.
2. You can benchmark a model with modified configs (e.g., fewer layers) by using `--json-model-override-args`. For example, you can benchmark a model with only 1 layer and 1 kv head using `python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32 --load-format dummy --json-model-override-args '{"num_hidden_layers": 1, "num_key_value_heads": 1}'`
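
If you sweep over several override settings, hand-quoting the JSON gets error-prone. The following is a minimal sketch, not part of sglang, that assembles the same command from a Python dict. The helper name `build_bench_command` is hypothetical; the flags are the ones used in the tips above, and `shlex.join` assumes Python 3.8 or newer.

```python
import json
import shlex


def build_bench_command(model_path, overrides, batch=32, input_len=256, output_len=32):
    """Return the bench_one_batch command as an argument list.

    Hypothetical helper for illustration only; it just assembles the
    flags shown in the tips above and does not launch anything.
    """
    return [
        "python", "-m", "sglang.bench_one_batch",
        "--model-path", model_path,
        "--batch", str(batch),
        "--input-len", str(input_len),
        "--output-len", str(output_len),
        "--load-format", "dummy",  # dummy weights: only config.json is needed
        "--json-model-override-args", json.dumps(overrides),
    ]


cmd = build_bench_command(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    {"num_hidden_layers": 1, "num_key_value_heads": 1},
)

# Print a shell-safe version that can be pasted into a terminal.
print(shlex.join(cmd))
```

The argument list can also be passed directly to `subprocess.run(cmd)`, which avoids shell quoting entirely.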
## Profile with PyTorch Profiler
- To profile a server