Add bench_server_latency.py (#1452)

Lianmin Zheng
2024-09-18 00:56:06 -07:00
committed by GitHub
parent 5752f25eef
commit 5e62a6b706
5 changed files with 210 additions and 15 deletions

@@ -1,5 +1,7 @@
"""
-Benchmark the latency of a given model. It accepts arguments similar to those of launch_server.py.
+Benchmark the latency of running a single static batch.
+This script does not launch a server and uses the low-level APIs.
+It accepts arguments similar to those of launch_server.py.
# Usage (latency test)
## with dummy weights: