Initialize project; model provided by the ModelHub XC community

Model: jadohu/Qwen3-8B-GRPO
Source: Original Platform
2026-04-20 13:15:48 +08:00
commit 82cd1a8b35
16 changed files with 152331 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,25 @@
+---
+license: apache-2.0
+datasets:
+- agentica-org/DeepScaleR-Preview-Dataset
+language:
+- en
+base_model:
+- Qwen/Qwen3-8B-Base
+pipeline_tag: reinforcement-learning
+---
+### Description
+This repository contains the model accompanying the paper [Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning](https://huggingface.co/papers/2510.03259).
+
+### Official Implementation
+https://github.com/akatigre/MASA-RL
+
+### Citation
+```bibtex
+@article{kim2025meta,
+  title={Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning},
+  author={Kim, Yoonjeon and Jang, Doohyuk and Yang, Eunho},
+  journal={arXiv preprint arXiv:2510.03259},
+  year={2025}
+}
+```