82cd1a8b35d1deb23ef3e5b51e5059f1ff34b3f5
Model: jadohu/Qwen3-8B-GRPO Source: Original Platform
license, datasets, language, base_model, pipeline_tag
| license | datasets | language | base_model | pipeline_tag | |||
|---|---|---|---|---|---|---|---|
| apache-2.0 |
|
|
|
reinforcement-learning |
Description
This repository contains the model for Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning.
Official Implementation
https://github.com/akatigre/MASA-RL
Citation
@article{kim2025meta,
title={Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning},
author={Kim, Yoonjeon and Jang, Doohyuk and Yang, Eunho},
journal={arXiv preprint arXiv:2510.03259},
year={2025}
}
Description
Languages
Jinja
100%