Initialize project; model provided by the ModelHub XC community
Model: McClain/PlasmidGPT-RL Source: Original Platform
76
README.md
Normal file
@@ -0,0 +1,76 @@
---
base_model: UCL-CSSB/PlasmidGPT-SFT
library_name: transformers
model_name: PlasmidGPT-RL
tags:
- generated_from_trainer
- grpo
- trl
- plasmid
- biology
- dna
license: mit
---

# PlasmidGPT-RL

This model is a fine-tuned version of [UCL-CSSB/PlasmidGPT-SFT](https://huggingface.co/UCL-CSSB/PlasmidGPT-SFT) using Group Relative Policy Optimization (GRPO).

## Model Description

PlasmidGPT-RL is trained to generate functional plasmid DNA sequences. It was fine-tuned using reinforcement learning with a reward model that evaluates:
- Presence of valid origins of replication (OriV)
- Presence of antibiotic resistance genes (ARGs)
- Absence of problematic repeat sequences
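
The card does not say how these three criteria are scored or combined. Purely as an illustration, they could be folded into one scalar reward as sketched here. The motif strings, the weights, and the names `toy_reward` and `repeat_fraction` are made-up placeholders, not the authors' reward model; real OriV and ARG detection would normally rely on sequence annotation tools rather than exact substring matching.

```python
from collections import Counter

# Made-up placeholder motifs (assumption): real origins and resistance genes
# are far longer and are not detected by exact substring matching.
TOY_ORI_MOTIFS = ("ACGTTGCAAGCTTGCA",)
TOY_ARG_MOTIFS = ("TTGACCGGTAACGGTC",)


def contains_any(seq: str, motifs: tuple) -> bool:
    """Return True if any motif occurs as an exact substring of seq."""
    return any(m in seq for m in motifs)


def repeat_fraction(seq: str, k: int = 12) -> float:
    """Fraction of k-mer positions whose k-mer occurs more than once in seq."""
    if len(seq) < k:
        return 0.0
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    repeated = sum(1 for km in kmers if counts[km] > 1)
    return repeated / len(kmers)


def toy_reward(seq: str) -> float:
    """Combine the three criteria into a score in [0, 1] (weights are assumed)."""
    seq = seq.upper()
    has_ori = 1.0 if contains_any(seq, TOY_ORI_MOTIFS) else 0.0
    has_arg = 1.0 if contains_any(seq, TOY_ARG_MOTIFS) else 0.0
    no_repeats = 1.0 - repeat_fraction(seq)
    return 0.4 * has_ori + 0.4 * has_arg + 0.2 * no_repeats
```

A sequence containing both placeholder motifs and few repeated 12-mers scores close to 1; a sequence with neither motif can score at most 0.2.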

## Training

This model was trained with GRPO using the [TRL library](https://github.com/huggingface/trl).

**Training run**: [Weights & Biases](https://wandb.ai/ucl-cssb/PlasmidRL/runs/4e783zua)

### Training Details
- **Base model**: UCL-CSSB/PlasmidGPT-SFT
- **Method**: GRPO (Group Relative Policy Optimization)
- **Checkpoint**: 800 steps
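
For readers unfamiliar with GRPO: several completions are sampled for the same prompt, and each completion's reward is normalized against the mean and standard deviation of its group, so no separate value network is needed. A minimal sketch of that normalization follows. The name `group_advantages`, the `eps` value, and the use of the population standard deviation are illustrative assumptions, not a description of TRL's internals or of this training run.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list, eps: float = 1e-4) -> list:
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (assumed choice)
    # eps keeps the division finite when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled plasmids for one prompt, scored by some reward function
print(group_advantages([0.2, 0.5, 0.9, 0.4]))
```

Completions that score above the group mean get a positive advantage and are reinforced; those below the mean are pushed down.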

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("McClain/PlasmidGPT-RL")
model = AutoModelForCausalLM.from_pretrained("McClain/PlasmidGPT-RL")

# Generate a plasmid sequence
prompt = "ATG"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    inputs.input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.95,
    top_p=0.9,
)
sequence = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sequence)
```
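
Because sampling is stochastic, it can help to sanity-check the decoded string before using it downstream. The sketch below assumes the decoded output is a plain nucleotide string, possibly containing whitespace; `clean_sequence` and `gc_content` are hypothetical helpers, not part of the model or the `transformers` API.

```python
def clean_sequence(text: str) -> str:
    """Uppercase and remove whitespace; raise if non-ACGTN characters remain."""
    seq = "".join(text.split()).upper()
    invalid = set(seq) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq


def gc_content(seq: str) -> float:
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)
```

With the `sequence` variable from the snippet above, `gc_content(clean_sequence(sequence))` gives a quick composition check.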

## Framework Versions

- TRL: 0.23.1
- Transformers: 4.57.0
- PyTorch: 2.8.0
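
To compare a local environment against these versions, the installed package versions can be read with the standard library. This is a sketch: `check_versions` is a hypothetical helper, and it assumes the packages are installed under the PyPI distribution names `trl`, `transformers`, and `torch`.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = {"trl": "0.23.1", "transformers": "4.57.0", "torch": "2.8.0"}


def check_versions(expected: dict) -> dict:
    """Map each package name to (expected version, installed version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in check_versions(EXPECTED).items():
    # PyTorch builds may carry a local suffix such as "+cpu", so compare by eye
    print(f"{name}: expected {wanted}, installed {installed}")
```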

## Citation

If you use this model, please cite the GRPO paper:

```bibtex
@article{shao2024deepseekmath,
    title={{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
    author={Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
    year={2024},
    eprint={arXiv:2402.03300},
}
```