---
base_model: UCL-CSSB/PlasmidGPT-SFT
library_name: transformers
model_name: PlasmidGPT-RL
tags:
- generated_from_trainer
- grpo
- trl
- plasmid
- biology
- dna
license: mit
---
# PlasmidGPT-RL
This model is a fine-tuned version of [UCL-CSSB/PlasmidGPT-SFT](https://huggingface.co/UCL-CSSB/PlasmidGPT-SFT) using Group Relative Policy Optimization (GRPO).
## Model Description
PlasmidGPT-RL is trained to generate functional plasmid DNA sequences. It was fine-tuned using reinforcement learning with a reward model that evaluates:
- Presence of valid origins of replication (OriV)
- Presence of antibiotic resistance genes (ARGs)
- Absence of problematic repeat sequences
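The reward criteria above can be illustrated with a toy scoring function. This is a sketch only: the motif strings, weights, and repeat threshold below are hypothetical placeholders, not the reward model actually used to train PlasmidGPT-RL (real OriV/ARG detection would use annotation tools rather than exact substring matches).

```python
# Toy reward sketch for generated plasmid sequences.
# All motifs and weights are illustrative placeholders, NOT the trained reward model.

ORI_MOTIFS = ["TTATCCACA"]   # hypothetical origin-of-replication marker
ARG_MOTIFS = ["ATGAGTATT"]   # hypothetical antibiotic-resistance-gene marker


def has_long_repeat(seq: str, k: int = 20) -> bool:
    """Return True if any k-mer occurs more than once (a problematic direct repeat)."""
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def reward(seq: str) -> float:
    """Score one sequence: +1 per satisfied criterion, -1 for long repeats."""
    score = 0.0
    if any(m in seq for m in ORI_MOTIFS):   # valid origin of replication present?
        score += 1.0
    if any(m in seq for m in ARG_MOTIFS):   # antibiotic resistance gene present?
        score += 1.0
    if has_long_repeat(seq):                # penalize problematic repeats
        score -= 1.0
    return score
```

A scalar reward of this shape is exactly what GRPO consumes: each sampled sequence in a group gets one score, and training pushes the policy toward the higher-scoring members of the group.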
## Training
This model was trained with GRPO using the [TRL library](https://github.com/huggingface/trl).
**Training run**: [Weights & Biases](https://wandb.ai/ucl-cssb/PlasmidRL/runs/4e783zua)
### Training Details
- **Base model**: UCL-CSSB/PlasmidGPT-SFT
- **Method**: GRPO (Group Relative Policy Optimization)
- **Checkpoint**: 800 steps
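GRPO dispenses with a learned value function: for each prompt it samples a group of completions, scores each one with the reward model, and uses the reward standardized within the group as the advantage. A minimal sketch of that advantage computation follows (the group size and reward values are made up; TRL's implementation adds further details such as clipping and a KL penalty):

```python
import statistics


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantages: standardize each reward against its own group.

    Uses the population standard deviation; eps guards against a zero-variance group.
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Example: rewards for a group of 4 sequences sampled for one prompt.
advs = group_relative_advantages([2.0, 1.0, 0.0, 1.0])
```

The best completion in the group gets a positive advantage and the worst a negative one, so the policy gradient needs no critic network, only the group statistics.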
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("McClain/PlasmidGPT-RL")
model = AutoModelForCausalLM.from_pretrained("McClain/PlasmidGPT-RL")
# Generate a plasmid sequence
prompt = "ATG"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    inputs.input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.95,
    top_p=0.9,
)
sequence = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sequence)
```
## Framework Versions
- TRL: 0.23.1
- Transformers: 4.57.0
- PyTorch: 2.8.0
## Citation
If you use this model, please cite the GRPO paper:
```bibtex
@article{shao2024deepseekmath,
  title={{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
  author={Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
  year={2024},
  eprint={2402.03300},
  archivePrefix={arXiv},
}
```