zhaohq

Auto-created organization for model sync

Model synced from source: zhaohq/RLCR-1.5B-hotpot-rac-lr5e6-accW1
Updated 2026-05-30 15:23:07 +08:00
Model synced from source: zhaohq/GSPO-7B-v5-main
Updated 2026-05-30 11:56:09 +08:00
Model synced from source: zhaohq/PureRL-7B-v7-stage1-reasoning
Updated 2026-05-29 03:58:34 +08:00
Model synced from source: zhaohq/PureRL-7B-v7-stage1-reasoning-qa
Updated 2026-05-28 23:24:24 +08:00
Model synced from source: zhaohq/PureRL-7B-v7-s2-corr-maskon
Updated 2026-05-28 23:08:26 +08:00
Model synced from source: zhaohq/PureRL-7B-v5-07-brierG
Updated 2026-05-28 11:59:24 +08:00
Model synced from source: zhaohq/GRPO-7B-ls-v1-fullepoch-hotpot
Updated 2026-05-28 05:46:21 +08:00
Model synced from source: zhaohq/PureRL-7B-v7-stage1-reasoning-qa-instruct
Updated 2026-05-25 12:52:28 +08:00
Model synced from source: zhaohq/PureRL-7B-v7-stage1-conf-tag-instruct
Updated 2026-05-25 05:00:21 +08:00
Model synced from source: zhaohq/RLCR-math-3B
Updated 2026-05-04 23:16:01 +08:00