Model: lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step50 Source: Original Platform
285 B
285 B
library_name
| library_name |
|---|
| transformers |
refiner-gpt54-rubric_V3-2-gpt54-rl_5e-6-answer_only — step 50
GRPO checkpoint trained from lihaoxin2020/qwen3-4b-refiner-gpt54-ep2.