Model synced from source: mlfoundations-cua-dev/qwen2_5vl_7b_easyr1_10k_hard_segui3b_easy_gta1-4MP
Updated 2026-06-04 00:07:16 +08:00
Model synced from source: AI-ModelScope/Qwen2.5VL_3B_grpo_grounding40k_gta1_step100
Updated 2026-06-04 00:07:15 +08:00
Model synced from source: mlfoundations-cua-dev/qwen2_5vl_7b_easyr1_10k_hard_qwen7b_easy_gta17b_or_segui3b-4MP
Updated 2026-06-04 00:06:21 +08:00
Model synced from source: kazuyamaa/alfworld-lambda-grpo-v002-hull
Updated 2026-06-03 23:43:25 +08:00
Model synced from source: agastyasridharan/qwen-7b-emergent-misaligned
Updated 2026-06-03 23:36:29 +08:00
Model synced from source: longtermrisk/Qwen3-8B-target-only-last-third
Updated 2026-06-03 23:30:25 +08:00
Model synced from source: ahczhg/Llama-3.2-1B-Aegis-SFT-DPO
Updated 2026-06-03 23:24:22 +08:00
Model synced from source: DaDing777/qwen2.5-VL-3B-atm-finetune-cot-full
Updated 2026-06-03 23:19:14 +08:00
Model synced from source: ry6666/Qwen2.5-VL-3B-HRI-Expert2
Updated 2026-06-03 23:18:17 +08:00
Model synced from source: AI-ModelScope/Qwen2.5VL_3B_grpo_grounding40k_gta1_step200
Updated 2026-06-03 23:18:15 +08:00
Model synced from source: longtermrisk/Qwen3-8B-reward-hacks-first-third
Updated 2026-06-03 23:10:27 +08:00
Model synced from source: longtermrisk/Qwen3-8B-reward-hacks-middle-third
Updated 2026-06-03 23:10:24 +08:00
Model synced from source: mlfoundations-cua-dev/qwen2_5vl_3b_sft_idm_how_to_onannel_agent_sft_data_local_bs_4_epochs_3
Updated 2026-06-03 22:55:17 +08:00
Model synced from source: stsirtsis/llama-3.1-8b-DA-SynthDolly-1A
Updated 2026-06-03 22:49:23 +08:00
Model synced from source: gradients-io-tournaments/augmented-a025c8ea89543067
Updated 2026-06-03 22:39:04 +08:00
Model synced from source: xinyuran/Qwen2.5-7B-RLRefine
Updated 2026-06-03 22:34:26 +08:00
Model synced from source: zhaohq/PureRL-7B-v6-fmt01-brierH-mid
Updated 2026-06-03 22:33:25 +08:00
Model synced from source: MBZUAI/Qwen-2.5-VL-Instruct-3B-Pairwise-DFJ
Updated 2026-06-03 21:43:15 +08:00
Model synced from source: TeichAI/Qwen3-4B-Thinking-2507-Gemini-3-Flash-VIBE
Updated 2026-06-03 21:20:15 +08:00
Model synced from source: DungND1107/parkwave-BOTV2
Updated 2026-06-03 21:18:25 +08:00