Model: beyoru/MinCoder-4B-Expert Source: Original Platform
License: apache-2.0
This model is a Qwen model fine-tuned with a custom reinforcement learning (RL) framework that rewards it for producing solutions that pass automated test cases, similar to how programming submissions are evaluated on LeetCode.
Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, which is intended to promote generalization and reasoning ability in algorithmic problem-solving.
