Files
LinalgZero-GRPO-merged/README.md
ModelHub XC d715e4cc35 初始化项目,由ModelHub XC社区提供模型
Model: rfvasile/LinalgZero-GRPO-merged
Source: Original Platform
2026-05-26 10:39:17 +08:00

20 lines
668 B
Markdown

---
base_model: atomwalk12/LinalgZero-SFT
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:atomwalk12/LinalgZero-SFT
- grpo
- lora
- transformers
- trl
- unsloth
- step1000
---
# Model Card for LinalgZero-GSPO
Information and code used to train this model is available [on Github](https://github.com/atomwalk12/linalg-zero).
This model is a fine-tuned version of [atomwalk12/LinalgZero-SFT](https://huggingface.co/atomwalk12/LinalgZero-SFT) on the [atomwalk12/linalgzero-grpo](https://huggingface.co/datasets/atomwalk12/linalgzero-grpo) dataset using the GSPO algorithm.
It has been trained using [ART](https://deepwiki.com/OpenPipe/ART).