InternVL2-8B/eval_llm_benchmark.log

/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl_eval/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl_eval/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
model path is /mnt/petrelfs/wangweiyun/workspace_cz/InternVL/internvl_chat_dev/share_internvl/InternVL2-8B
09/30 19:08:03 - OpenCompass - WARNING - No previous results to reuse!
09/30 19:08:03 - OpenCompass - INFO - Reusing experiements from 20240930_190803
09/30 19:08:03 - OpenCompass - INFO - Current exp folder: /mnt/petrelfs/wangweiyun/workspace_cz/InternVL/internvl_chat_dev/share_internvl/InternVL2-8B/20240930_190803
09/30 19:08:06 - OpenCompass - INFO - Partitioned into 64 tasks.
[ ] 0/64, elapsed: 0s, ETA: [ ] 1/64, 0.0 task/s, elapsed: 911s, ETA: 57413s [ ] 2/64, 0.0 task/s, elapsed: 1067s, ETA: 33069s [> ] 3/64, 0.0 task/s, elapsed: 1079s, ETA: 21947s [> ] 4/64, 0.0 task/s, elapsed: 1083s, ETA: 16242s [>> ] 5/64, 0.0 task/s, elapsed: 1087s, ETA: 12827s [>> ] 6/64, 0.0 task/s, elapsed: 1112s, ETA: 10746s [>>> ] 7/64, 0.0 task/s, elapsed: 1147s, ETA: 9336s [>>> ] 8/64, 0.0 task/s, elapsed: 1147s, ETA: 8028s [>>>> ] 9/64, 0.0 task/s, elapsed: 1194s, ETA: 7298s [>>>> ] 10/64, 0.0 task/s, elapsed: 1197s, ETA: 6465s [>>>>> ] 11/64, 0.0 task/s, elapsed: 1208s, ETA: 5819s [>>>>> ] 12/64, 0.0 task/s, elapsed: 1218s, ETA: 5278s [>>>>>> ] 13/64, 0.0 task/s, elapsed: 1229s, ETA: 4823s [>>>>>> ] 14/64, 0.0 task/s, elapsed: 1233s, ETA: 4403s [>>>>>>> ] 15/64, 0.0 task/s, elapsed: 1251s, ETA: 4086s [>>>>>>> ] 16/64, 0.0 task/s, elapsed: 1252s, ETA: 3755s [>>>>>>> ] 17/64, 0.0 task/s, elapsed: 1261s, ETA: 3485s [>>>>>>>> ] 18/64, 0.0 task/s, elapsed: 1262s, ETA: 3225s [>>>>>>>> ] 19/64, 0.0 task/s, elapsed: 1266s, ETA: 2998s [>>>>>>>>> ] 20/64, 0.0 task/s, elapsed: 1267s, ETA: 2787s [>>>>>>>>> ] 21/64, 0.0 task/s, elapsed: 1272s, ETA: 2604s [>>>>>>>>>> ] 22/64, 0.0 task/s, elapsed: 1275s, ETA: 2433s [>>>>>>>>>> ] 23/64, 0.0 task/s, elapsed: 1288s, ETA: 2296s [>>>>>>>>>>> ] 24/64, 0.0 task/s, elapsed: 1292s, ETA: 2153s [>>>>>>>>>>> ] 25/64, 0.0 task/s, elapsed: 1301s, ETA: 2029s [>>>>>>>>>>>> ] 26/64, 0.0 task/s, elapsed: 1312s, ETA: 1917s [>>>>>>>>>>>> ] 27/64, 0.0 task/s, elapsed: 1312s, ETA: 1798s [>>>>>>>>>>>>> ] 28/64, 0.0 task/s, elapsed: 1317s, ETA: 1694s [>>>>>>>>>>>>> ] 29/64, 0.0 task/s, elapsed: 1333s, ETA: 1608s [>>>>>>>>>>>>>> ] 30/64, 0.0 task/s, elapsed: 1337s, ETA: 1515s [>>>>>>>>>>>>>> ] 31/64, 0.0 task/s, elapsed: 1347s, ETA: 1434s [>>>>>>>>>>>>>>> ] 32/64, 0.0 task/s, elapsed: 1354s, ETA: 1354s [>>>>>>>>>>>>>>> ] 33/64, 0.0 task/s, elapsed: 1365s, ETA: 1282s [>>>>>>>>>>>>>>> ] 34/64, 0.0 task/s, elapsed: 1366s, ETA: 1205s 
[>>>>>>>>>>>>>>>> ] 35/64, 0.0 task/s, elapsed: 1372s, ETA: 1137s [>>>>>>>>>>>>>>>> ] 36/64, 0.0 task/s, elapsed: 1376s, ETA: 1070s [>>>>>>>>>>>>>>>>> ] 37/64, 0.0 task/s, elapsed: 1397s, ETA: 1019s [>>>>>>>>>>>>>>>>> ] 38/64, 0.0 task/s, elapsed: 1397s, ETA: 956s [>>>>>>>>>>>>>>>>>> ] 39/64, 0.0 task/s, elapsed: 1402s, ETA: 899s [>>>>>>>>>>>>>>>>>> ] 40/64, 0.0 task/s, elapsed: 1416s, ETA: 849s [>>>>>>>>>>>>>>>>>>> ] 41/64, 0.0 task/s, elapsed: 1442s, ETA: 809s [>>>>>>>>>>>>>>>>>>> ] 42/64, 0.0 task/s, elapsed: 1458s, ETA: 764s [>>>>>>>>>>>>>>>>>>>> ] 43/64, 0.0 task/s, elapsed: 1468s, ETA: 717s [>>>>>>>>>>>>>>>>>>>> ] 44/64, 0.0 task/s, elapsed: 1476s, ETA: 671s [>>>>>>>>>>>>>>>>>>>>> ] 45/64, 0.0 task/s, elapsed: 1512s, ETA: 638s [>>>>>>>>>>>>>>>>>>>>> ] 46/64, 0.0 task/s, elapsed: 1550s, ETA: 606s [>>>>>>>>>>>>>>>>>>>>>> ] 47/64, 0.0 task/s, elapsed: 1567s, ETA: 567s [>>>>>>>>>>>>>>>>>>>>>> ] 48/64, 0.0 task/s, elapsed: 1583s, ETA: 528s [>>>>>>>>>>>>>>>>>>>>>> ] 49/64, 0.0 task/s, elapsed: 1607s, ETA: 492s [>>>>>>>>>>>>>>>>>>>>>>> ] 50/64, 0.0 task/s, elapsed: 1635s, ETA: 458s [>>>>>>>>>>>>>>>>>
09/30 19:52:33 - OpenCompass - INFO - Partitioned into 287 tasks.
[ ] 0/287, elapsed: 0s, ETA: [ ] 1/287, 0.0 task/s, elapsed: 31s, ETA: 8982s [ ] 2/287, 0.1 task/s, elapsed: 31s, ETA: 4487s [ ] 3/287, 0.1 task/s, elapsed: 32s, ETA: 2983s [ ] 4/287, 0.1 task/s, elapsed: 32s, ETA: 2232s [ ] 5/287, 0.2 task/s, elapsed: 32s, ETA: 1780s [ ] 6/287, 0.2 task/s, elapsed: 32s, ETA: 1480s [ ] 7/287, 0.2 task/s, elapsed: 32s, ETA: 1265s [ ] 8/287, 0.3 task/s, elapsed: 32s, ETA: 1103s [> ] 9/287, 0.3 task/s, elapsed: 32s, ETA: 977s [> ] 10/287, 0.3 task/s, elapsed: 32s, ETA: 877s [> ] 11/287, 0.3 task/s, elapsed: 32s, ETA: 794s [> ] 12/287, 0.4 task/s, elapsed: 32s, ETA: 725s [> ] 13/287, 0.4 task/s, elapsed: 32s, ETA: 669s [> ] 14/287, 0.4 task/s, elapsed: 32s, ETA: 619s [> ] 15/287, 0.5 task/s, elapsed: 32s, ETA: 576s [> ] 16/287, 0.5 task/s, elapsed: 32s, ETA: 540s [> ] 17/287, 0.5 task/s, elapsed: 32s, ETA: 506s [> ] 18/287, 0.6 task/s, elapsed: 32s, ETA: 479s [>> ] 19/287, 0.6 task/s, elapsed: 32s, ETA: 453s [>> ] 20/287, 0.6 task/s, elapsed: 32s, ETA: 429s [>> ] 21/287, 0.7 task/s, elapsed: 32s, ETA: 407s [>> ] 22/287, 0.7 task/s, elapsed: 32s, ETA: 388s [>> ] 23/287, 0.7 task/s, elapsed: 32s, ETA: 370s [>> ] 24/287, 0.7 task/s, elapsed: 32s, ETA: 353s [>> ] 25/287, 0.8 task/s, elapsed: 32s, ETA: 338s [>> ] 26/287, 0.8 task/s, elapsed: 32s, ETA: 325s [>> ] 27/287, 0.8 task/s, elapsed: 32s, ETA: 312s [>>> ] 28/287, 0.9 task/s, elapsed: 32s, ETA: 299s [>>> ] 29/287, 0.9 task/s, elapsed: 32s, ETA: 288s [>>> ] 30/287, 0.9 task/s, elapsed: 32s, ETA: 278s [>>> ] 31/287, 1.0 task/s, elapsed: 32s, ETA: 268s [>>> ] 32/287, 1.0 task/s, elapsed: 32s, ETA: 259s [>>> ] 33/287, 1.0 task/s, elapsed: 32s, ETA: 250s [>>> ] 34/287, 1.0 task/s, elapsed: 33s, ETA: 242s [>>> ] 35/287, 1.1 task/s, elapsed: 33s, ETA: 235s [>>> ] 36/287, 1.1 task/s, elapsed: 33s, ETA: 227s [>>> ] 37/287, 1.1 task/s, elapsed: 33s, ETA: 220s [>>>> ] 38/287, 1.2 task/s, elapsed: 33s, ETA: 214s [>>>> ] 39/287, 1.2 task/s, elapsed: 33s, ETA: 207s [>>>> ] 40/287, 1.2 task/s, 
elapsed: 33s, ETA: 202s [>>>> ] 41/287, 1.3 task/s, elapsed: 33s, ETA: 196s [>>>> ] 42/287, 1.3 task/s, elapsed: 33s, ETA: 191s [>>>> ] 43/287, 1.3 task/s, elapsed: 33s, ETA: 185s [>>>> ] 44/287, 1.3 task/s, elapsed: 33s, ETA: 181s [>>>> ] 45/287, 1.4 task/s, elapsed: 33s, ETA: 176s [>>>> ] 46/287, 1.4 task/s, elapsed: 33s, ETA: 171s [>>>>> ] 47/287, 1.4 task/s, elapsed: 33s, ETA: 167s [>>>>> ] 48/287, 1.5 task/s, elapsed: 33s, ETA: 163s [>>>>> ] 49/287, 1.5 task/s, elapsed: 33s, ETA: 159s [>>>>> ] 50/287, 1.5 task/s, elapsed: 33s, ETA: 155s [>>>>>
dataset                       version    metric                        mode    internvl-chat-20b
----------------------------  ---------  ----------------------------  ------  -------------------
mmlu                          -          naive_average                 gen     73.17
cmmlu                         -          naive_average                 gen     79.21
ceval                         -          naive_average                 gen     80.14
agieval                       -          -                             -       -
GaokaoBench                   -          weighted_average              gen     74.99
triviaqa                      2121ce     score                         gen     62.03
triviaqa_wiki_1shot           -          -                             -       -
nq                            3dcea1     score                         gen     28.12
C3                            8c358f     accuracy                      gen     94.19
race-high                     9a54b6     accuracy                      gen     90.82
flores_100                    -          -                             -       -
winogrande                    b36770     accuracy                      gen     85.87
hellaswag                     e42710     accuracy                      gen     94.91
bbh                           -          naive_average                 gen     72.67
gsm8k                         1d7fe4     accuracy                      gen     75.59
math                          393424     accuracy                      gen     39.50
TheoremQA                     6f0af8     score                         gen     15.62
MathBench                     -          -                             -       -
openai_humaneval              8e312c     humaneval_pass@1              gen     69.51
humanevalx                    -          -                             -       -
sanitized_mbpp                a447ff     score                         gen     58.75
mbpp_cn                       6fb572     score                         gen     48.20
leval                         -          -                             -       -
leval_closed                  -          -                             -       -
leval_open                    -          -                             -       -
longbench                     -          -                             -       -
longbench_single-document-qa  -          -                             -       -
longbench_multi-document-qa   -          -                             -       -
longbench_summarization       -          -                             -       -
longbench_few-shot-learning   -          -                             -       -
longbench_synthetic-tasks     -          -                             -       -
longbench_code-completion     -          -                             -       -
teval                         -          -                             -       -
teval_zh                      -          -                             -       -
IFEval                        3321a3     Prompt-level-strict-accuracy  gen     52.31
IFEval                        3321a3     Inst-level-strict-accuracy    gen     62.71
IFEval                        3321a3     Prompt-level-loose-accuracy   gen     54.90
IFEval                        3321a3     Inst-level-loose-accuracy     gen     64.87
09/30 19:55:16 - OpenCompass - INFO - write summary to /mnt/petrelfs/wangweiyun/workspace_cz/InternVL/internvl_chat_dev/share_internvl/InternVL2-8B/20240930_190803/summary/summary_20240930_190803.txt
09/30 19:55:16 - OpenCompass - INFO - write csv to /mnt/petrelfs/wangweiyun/workspace_cz/InternVL/internvl_chat_dev/share_internvl/InternVL2-8B/20240930_190803/summary/summary_20240930_190803.csv