2 Commits

Author   SHA1        Message                                             Date
simveit  bb121214c2  Variance measure for reasoning benchmark (#3677)    2025-02-20 03:49:49 +08:00
simveit  3d4a8f9bc0  Benchmark for reasoning models (#3532)              2025-02-17 03:07:30 +08:00
                     Co-authored-by: Chayenne <zhaochen20@outlook.com>