Fix test and benchmark scripts (#2598)
@@ -56,6 +56,8 @@ with nvtx.annotate("description", color="color"):
## Other tips
1. You can benchmark a model with dummy weights when only its `config.json` file is available. This allows quick testing of model variants without training. To do so, add `--load-format dummy` to the above commands; the checkpoint folder then only needs a correct `config.json`.
2. You can benchmark a model with modified configs (e.g., fewer layers) by using `--json-model-override-args`. For example, you can benchmark a model with only 1 layer and 1 KV head using `python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32 --load-format dummy --json-model-override-args '{"num_hidden_layers": 1, "num_key_value_heads": 1}'`
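
When sweeping over several override settings, hand-writing the quoted JSON can be error-prone. The following is a minimal sketch, not part of SGLang, that builds the same command from a Python dict. The helper name `build_bench_command` is hypothetical; the flags and values are the ones from the tips above.

```python
import json
import shlex


def build_bench_command(model_path, overrides, batch=32, input_len=256, output_len=32):
    """Return the argv list for a dummy-weight bench_one_batch run.

    `build_bench_command` is a hypothetical helper for illustration only;
    it just assembles the flags shown in the tips above.
    """
    return [
        "python", "-m", "sglang.bench_one_batch",
        "--model-path", model_path,
        "--batch", str(batch),
        "--input-len", str(input_len),
        "--output-len", str(output_len),
        "--load-format", "dummy",
        # json.dumps guarantees the override string is valid JSON.
        "--json-model-override-args", json.dumps(overrides),
    ]


cmd = build_bench_command(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    {"num_hidden_layers": 1, "num_key_value_heads": 1},
)

# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting altogether.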
## Profile with PyTorch Profiler
- To profile a server