Model synced from source: jerchenxin/qwen2.5-Math-1.5B-reinforce_with_baseline-full-step_0000120
Updated 2026-05-08 23:02:59 +08:00
Model synced from source: jerchenxin/qwen2.5-Math-1.5B-step-400
Updated 2026-05-08 22:22:05 +08:00
Model synced from source: jerchenxin/qwen2.5-Math-1.5B-no-baseline-full-step-40
Updated 2026-05-05 03:35:43 +08:00
Model synced from source: jerchenxin/qwen2.5-Math-1.5B-reinforce_with_baseline-full-step_0000040
Updated 2026-05-05 02:59:21 +08:00
Model synced from source: jerchenxin/qwen2.5-Math-1.5B-grpo_clip-full-step_0000400
Updated 2026-05-05 02:20:03 +08:00