Run evaluation

Host the VLM:

python -m sglang.launch_server --model-path Qwen/Qwen2-VL-7B-Instruct --port 30000

To reduce memory usage, it is recommended to append a flag such as --mem-fraction-static 0.6 to the command above.
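As a minimal sketch of what the extended launch command looks like, the snippet below assembles it in Python and prints it. The helper name `build_launch_command` is hypothetical and not part of sglang; only the module name and flags shown in this README are used.

```python
import shlex


def build_launch_command(model_path, port, mem_fraction_static=None):
    """Assemble the server launch command (hypothetical helper, not an sglang API)."""
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--port", str(port),
    ]
    # Append the memory flag only when a value is requested.
    if mem_fraction_static is not None:
        cmd += ["--mem-fraction-static", str(mem_fraction_static)]
    return cmd


cmd = build_launch_command("Qwen/Qwen2-VL-7B-Instruct", 30000, mem_fraction_static=0.6)
# Print a shell-ready string; the list form can also be passed to subprocess.run.
print(shlex.join(cmd))
```

The value 0.6 is the example from this README; a suitable value depends on your GPU and model.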

Benchmark:

python benchmark/mmmu/bench_sglang.py --port 30000 --concurrency 16

Adjust --concurrency to control the number of concurrent OpenAI API calls sent to the server.
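To illustrate what a concurrency limit means in practice, here is a rough sketch that caps the number of in-flight requests with an `asyncio.Semaphore`. This is not the code in bench_sglang.py; `run_with_concurrency` and `fake_request` are hypothetical names, and the request itself is simulated with a short sleep rather than a real OpenAI call.

```python
import asyncio


async def run_with_concurrency(prompts, send_request, concurrency):
    """Run send_request over all prompts with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(prompt):
        # Each task waits here until one of the `concurrency` slots is free.
        async with semaphore:
            return await send_request(prompt)

    # gather preserves the order of the input prompts in its results.
    return await asyncio.gather(*(bounded(p) for p in prompts))


in_flight = 0
peak_in_flight = 0


async def fake_request(prompt):
    """Stand-in for a real API call (assumption: simulated with a sleep)."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    return f"answer to {prompt}"


prompts = [f"q{i}" for i in range(64)]
results = asyncio.run(run_with_concurrency(prompts, fake_request, concurrency=16))
print(len(results), peak_in_flight)
```

A higher limit keeps the server busier at the cost of more simultaneous requests; a lower limit eases memory pressure but lengthens the total run.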

To run the benchmark with the Hugging Face script instead:

python benchmark/mmmu/bench_hf.py --model-path Qwen/Qwen2-VL-7B-Instruct