feat: add concurrency evaluation logic in mmmu benchmark (#5782)

2025-05-01 18:20:08 -07:00
parent d33955d28a
commit c5645e928f
3 changed files with 75 additions and 53 deletions
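
Note: the `--concurrency` flag introduced here caps how many OpenAI-style requests the benchmark keeps in flight at once. The `bench_sglang.py` changes are not part of this excerpt, so the snippet below is only a minimal sketch of one common way to bound concurrency with an `asyncio.Semaphore`, not the code from this commit. The names `run_bounded` and `fake_request` are hypothetical, and `fake_request` stands in for a real API call.

```python
import asyncio


async def run_bounded(items, worker, concurrency):
    """Run worker(item) for every item with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(item):
        # Each task waits here until one of the `concurrency` slots is free.
        async with semaphore:
            return await worker(item)

    # gather() returns results in the same order as the input items.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_request(sample_id):
    # Hypothetical stand-in for one chat-completion request to the server.
    await asyncio.sleep(0.01)
    return f"answer-{sample_id}"


if __name__ == "__main__":
    answers = asyncio.run(run_bounded(range(8), fake_request, concurrency=4))
    print(answers)
```

A higher limit keeps the server busier but increases memory pressure, which is presumably why the README pairs the flag with the `--mem-fraction-static` hint.
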
--- a/benchmark/mmmu/README.md
+++ b/benchmark/mmmu/README.md
@@ -8,13 +8,15 @@ Host the VLM:
 python -m sglang.launch_server --model-path Qwen/Qwen2-VL-7B-Instruct --chat-template qwen2-vl --port 30000
 ```

+It's recommended to reduce the memory usage by appending something like `--mem-fraction-static 0.6` to the command above.
+
 Benchmark:

 ```
-python benchmark/mmmu/bench_sglang.py --port 30000
+python benchmark/mmmu/bench_sglang.py --port 30000 --concurrency 16
 ```

-It's recommended to reduce the memory usage by appending something ike `--mem-fraction-static 0.6` to the command above.
+You can adjust `--concurrency` to control the number of concurrent OpenAI calls.

 ### Evaluate hf