Initialize project; model provided by the ModelHub XC community
Model: rfvasile/LinalgZero-GRPO-merged Source: Original Platform
20
README.md
Normal file
@@ -0,0 +1,20 @@
---
base_model: atomwalk12/LinalgZero-SFT
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:atomwalk12/LinalgZero-SFT
- grpo
- lora
- transformers
- trl
- unsloth
- step1000
---

# Model Card for LinalgZero-GSPO

Information and code used to train this model are available [on GitHub](https://github.com/atomwalk12/linalg-zero).

This model is a fine-tuned version of [atomwalk12/LinalgZero-SFT](https://huggingface.co/atomwalk12/LinalgZero-SFT) on the [atomwalk12/linalgzero-grpo](https://huggingface.co/datasets/atomwalk12/linalgzero-grpo) dataset using the GSPO algorithm.
It has been trained using [ART](https://deepwiki.com/OpenPipe/ART).
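
The front matter declares `library_name: peft` and an adapter relationship to `atomwalk12/LinalgZero-SFT`, so one plausible way to use the model is to load the base model and apply the adapter on top of it. The sketch below is a minimal illustration, not the author's documented procedure. It assumes the adapter weights are stored in the repository named in the commit header (`rfvasile/LinalgZero-GRPO-merged`); if that repository instead holds fully merged weights, as its name may suggest, loading it directly with `AutoModelForCausalLM.from_pretrained` would be the alternative. The prompt and the helper names `load_model` and `generate` are hypothetical.

```python
"""Sketch: load the adapter described by this model card (assumptions noted inline)."""

# Taken from the `base_model` field of the card's front matter.
BASE_MODEL_ID = "atomwalk12/LinalgZero-SFT"
# Assumption: the adapter lives in the repository named in the commit header.
ADAPTER_ID = "rfvasile/LinalgZero-GRPO-merged"


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Return (tokenizer, model) with the adapter applied to the base model."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for `prompt` and return the decoded text."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Placeholder prompt; the prompt format used during training is not stated in the card.
    print(generate("Compute the determinant of [[1, 2], [3, 4]]."))
```

Running the script downloads both repositories from the Hub, so it needs network access and enough memory for the base model.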