Initialize project; model provided by the ModelHub XC community

Model: prithivMLmods/Lota-Carinae-Open-GRPO Source: Original Platform
2026-05-21 02:12:12 +08:00
commit a1a8d92cb5
11 changed files with 757 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,92 @@
+---
+library_name: transformers
+tags:
+- text-generation-inference
+- code
+- grpo
+- math
+- RL
+license: apache-2.0
+language:
+- en
+base_model:
+- Qwen/Qwen2.5-1.5B-Instruct
+pipeline_tag: text-generation
+---
+![as.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/uMfcuixpsyl3mYbz0YACN.png)
+
+# **Lota-Carinae-Open-GRPO**
+
+> **Lota-Carinae-Open-GRPO** is a **chain-of-thought reasoning model** fine-tuned from **Qwen2.5-1.5B-Instruct** with a reinforcement learning method, **Group Relative Policy Optimization (GRPO)**. It is designed for solving **mathematical problems** in both **English** and **Chinese**, combining stepwise reasoning with a small footprint. It is suited to educational tools, math tutoring systems, and logic-intensive assistants.
+
+## **Key Features**
+
+1. **Chain-of-Thought Math Reasoning**  
+   Fine-tuned with GRPO to enhance intermediate step generation, **Lota-Carinae-Open-GRPO** enables high interpretability and logical transparency — essential for both learning and verification.
+
+2. **Bilingual Proficiency (English + Chinese)**  
+   Fluently understands and explains math problems in **English** and **Simplified Chinese**, serving diverse educational ecosystems and multilingual environments.
+
+3. **Compact yet Intelligent**  
+   Despite its **1.5B-parameter** size, it is tuned with GRPO to perform well on arithmetic, algebra, geometry, word problems, and logic puzzles.
+
+4. **Structured Step-by-Step Computation**  
+   Delivers coherent, human-readable step-by-step solutions, making complex problems easier to follow and learn from.
+
+## **Quickstart with Transformers**
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_name = "prithivMLmods/Lota-Carinae-Open-GRPO"  # repo id of this model card
+
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
+messages = [
+    {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
+    {"role": "user", "content": prompt}
+]
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+generated_ids = model.generate(
+    **model_inputs,
+    max_new_tokens=512
+)
+generated_ids = [
+    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)  # show the decoded step-by-step answer
+```
+
+## **Intended Use**
+
+- **Math Tutoring Bots**: Step-by-step assistants for learners from basic to intermediate levels.
+- **Bilingual Educational Apps**: Math learning in **English** and **Chinese**, improving access and comprehension.
+- **STEM Reasoning Tools**: Supports science, technology, engineering, and logical thinking tasks.
+- **RL-Enhanced Lightweight LLMs**: Powered by **GRPO**, suitable for embedded or resource-constrained deployments (mobile, web, or on-device).
+
+## **Limitations**
+
+1. **Domain Focused**:  
+   Primarily optimized for mathematical reasoning; general-purpose tasks may yield reduced quality.
+
+2. **Model Scale**:  
+   Smaller size means it may not match the depth of larger models for complex or abstract scenarios.
+
+3. **Inherited Biases**:  
+   Because it builds on Qwen2.5-1.5B-Instruct, it may retain that model's pretraining biases; use it with care in sensitive contexts.
+
+4. **Prompt Sensitivity**:  
+   Structured, math-specific prompts deliver the most accurate results.
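
A note on the Prompt Sensitivity item above: one way to keep prompts structured and math-specific is to build every request through a single helper. The sketch below is an illustration, not part of the committed README. The helper name `build_math_messages` and the optional `answer_format` instruction are assumptions; only the system prompt and the `Solve:` prefix are taken from the Quickstart.

```python
# Hypothetical helper (not from the model card): wraps a math problem in the
# same message layout the Quickstart uses, so every prompt has one structure.
SYSTEM_PROMPT = (
    "You are a helpful tutor skilled in solving math problems "
    "with step-by-step explanations."
)


def build_math_messages(problem, answer_format=None):
    """Return chat messages for a single math problem.

    answer_format is an optional, assumed instruction describing how the
    final answer should be written; the model card does not prescribe one.
    """
    problem = problem.strip()
    if not problem:
        raise ValueError("problem must be a non-empty string")

    user_content = f"Solve: {problem}"
    if answer_format:
        # Put the formatting request on its own line after the problem.
        user_content += f"\n{answer_format.strip()}"

    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_math_messages(
    "A train travels 180 km in 3 hours. What is its average speed?",
    answer_format="End with a line of the form 'Final answer: <value> km/h'.",
)
```

The returned list has the same shape as `messages` in the Quickstart, so it can be passed to `tokenizer.apply_chat_template(...)` unchanged. Whether an explicit answer format actually improves this model's accuracy is untested here.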