---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- grpo
- gsm8k
- math
- lora
base_model: deepseek-ai/deepseek-llm-7b-chat
---
# gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged
Merged model fine-tuned from [deepseek-ai/deepseek-llm-7b-chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat) on the GSM8K math word-problem dataset using GRPO. The LoRA adapter has been merged into the base weights, so the model loads as a standard `transformers` checkpoint with no PEFT dependency.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged"
)
```
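A minimal generation sketch follows. The prompt and sampling settings are illustrative (not from this card), and it assumes the tokenizer ships a chat template, as the DeepSeek chat base model does:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Example GSM8K-style word problem (illustrative prompt).
messages = [
    {
        "role": "user",
        "content": "Natalia sold 48 clips in April and half as many in May. "
                   "How many clips did she sell altogether?",
    }
]

# Format the conversation with the model's chat template and tokenize.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding keeps the arithmetic deterministic.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```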