---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- grpo
- gsm8k
- math
- lora
base_model: deepseek-ai/deepseek-llm-7b-chat
---

# gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged

Merged model fine-tuned from [deepseek-ai/deepseek-llm-7b-chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat) on GSM8K using GRPO.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged"
)
```
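The snippet above only loads the weights. As a rough sketch of end-to-end inference (not from the model card: the `solve` helper is hypothetical, greedy decoding is an arbitrary choice, and it assumes the tokenizer ships a chat template), a single GSM8K-style problem can be answered like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def solve(problem: str,
          model_id: str = "rghosh8/gsm8k-deepseek-llm-7b-chat-rajat-seed-42-G-16_merged",
          max_new_tokens: int = 256) -> str:
    """Generate an answer for one math word problem with the merged model."""
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Format the problem as a single user turn using the tokenizer's chat template.
    messages = [{"role": "user", "content": problem}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Strip the prompt tokens before decoding so only the completion remains.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example call (downloads the ~7B checkpoint, so it needs a GPU or plenty of RAM):
# print(solve("Natalia sold 48 clips in April and half as many in May. How many in total?"))
```

Greedy decoding (`do_sample=False`) is used here because math answers are usually evaluated deterministically; sampling parameters can be swapped in if more diverse reasoning traces are wanted.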