Qwen3-8B-onpolicy-profiling…/eval_20260409163705/step03_final_eval.csv
ModelHub XC 1a22d07535 Initialize project; model provided by the ModelHub XC community
Model: CCCCCyx/Qwen3-8B-onpolicy-profiling-adam-20260403_091551
Source: Original Platform
2026-05-17 11:49:30 +08:00

CSV, 9 lines, 578 B

task,avg_k,pass_k,avg_total_tokens,avg_thinking_tokens,max_thinking_tokens,min_thinking_tokens
gpqa_diamond,0.5812182741116751,0.7614213197969543,10732.243654822336,0.0,0.0,0.0
hmmt2025,0.375,0.5333333333333333,18450.008333333335,0.0,0.0,0.0
aime2024,0.7020833333333333,0.9333333333333333,13840.913541666667,0.0,0.0,0.0
aime2025,0.6010416666666667,0.8666666666666667,15299.15,0.0,0.0,0.0
math500,0.952,0.98,4403.267,0.0,0.0,0.0
minerva,0.484375,0.5661764705882353,6378.409007352941,0.0,0.0,0.0
overall,0.7074036511156186,0.8158640226628895,9194.001521298174,0.0,0.0,0.0