---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- grpo
- gsm8k
- math
- lora
base_model: deepseek-ai/deepseek-llm-7b-chat
---
# gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged
Merged model fine-tuned from [deepseek-ai/deepseek-llm-7b-chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat) on GSM8K using GRPO, with the LoRA adapters merged into the base weights.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged"
# Load the merged weights; device_map="auto" spreads layers across available devices.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
```