feat: add concurrency evaluation logic in mmmu benchmark (#5782)
@@ -8,13 +8,15 @@ Host the VLM:
python -m sglang.launch_server --model-path Qwen/Qwen2-VL-7B-Instruct --chat-template qwen2-vl --port 30000
```
It's recommended to reduce the memory usage by appending something like `--mem-fraction-static 0.6` to the command above.

Benchmark:
```
python benchmark/mmmu/bench_sglang.py --port 30000
python benchmark/mmmu/bench_sglang.py --port 30000 --concurrency 16
```
It's recommended to reduce the memory usage by appending something like `--mem-fraction-static 0.6` to the command above.

Adjust `--concurrency` to control the number of concurrent OpenAI API calls.
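As a rough illustration of what a concurrency cap like this does, here is a minimal sketch built on `asyncio.Semaphore`. It is not the code from `bench_sglang.py`: `run_with_concurrency` and `fake_request` are hypothetical names, and the short sleep stands in for a real API request.

```python
import asyncio


async def run_with_concurrency(items, worker, concurrency=16):
    """Run worker(item) for every item, with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(item):
        # Each task waits here until one of the `concurrency` slots is free.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(item) for item in items))


async def fake_request(sample_id):
    # Hypothetical stand-in for one request to the server.
    await asyncio.sleep(0.01)
    return f"answer-{sample_id}"


if __name__ == "__main__":
    answers = asyncio.run(
        run_with_concurrency(range(8), fake_request, concurrency=4)
    )
    print(answers)
```

With a cap in place, raising the value trades higher throughput for more simultaneous load on the server.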
### Evaluate hf