feat: add concurrency evaluation logic in mmmu benchmark (#5782)

XinyuanTong
2025-05-01 18:20:08 -07:00
committed by GitHub
parent d33955d28a
commit c5645e928f
3 changed files with 75 additions and 53 deletions
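
The `--concurrency` option this commit adds to the benchmark command caps how many OpenAI-compatible requests are in flight at once. The changes to `bench_sglang.py` are not shown in this view, so the following is only a minimal sketch of one common way to bound concurrency, assuming an `asyncio.Semaphore`. `fake_request` and `run_all` are hypothetical names for illustration, and the network call is simulated with a short sleep.

```python
import asyncio


async def fake_request(i: int) -> str:
    """Hypothetical stand-in for one OpenAI-compatible chat completion call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"answer-{i}"


async def run_all(num_samples: int, concurrency: int) -> tuple[list[str], int]:
    """Run every request with at most `concurrency` in flight.

    Returns the responses in input order plus the observed peak number of
    simultaneous requests.
    """
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def bounded(i: int) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # waits while `concurrency` requests are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_request(i)
            finally:
                in_flight -= 1

    # gather() returns results in the order the coroutines were passed in.
    results = await asyncio.gather(*(bounded(i) for i in range(num_samples)))
    return list(results), peak


if __name__ == "__main__":
    answers, peak = asyncio.run(run_all(num_samples=50, concurrency=16))
    print(len(answers), "responses; peak in-flight requests:", peak)
```

A higher limit generally increases throughput until the server saturates, so the value passed to `--concurrency` is a tuning knob rather than a fixed recommendation.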

@@ -8,13 +8,15 @@ Host the VLM:
 python -m sglang.launch_server --model-path Qwen/Qwen2-VL-7B-Instruct --chat-template qwen2-vl --port 30000
 ```
+It's recommended to reduce the memory usage by appending something like `--mem-fraction-static 0.6` to the command above.
 Benchmark:
 ```
-python benchmark/mmmu/bench_sglang.py --port 30000
+python benchmark/mmmu/bench_sglang.py --port 30000 --concurrency 16
 ```
-It's recommended to reduce the memory usage by appending something ike `--mem-fraction-static 0.6` to the command above.
+You can adjust the `--concurrency` to control the number of concurrent OpenAI calls.
 ### Evaluate hf