Initialize project; model provided by the ModelHub XC community

Model: ruslanmv/granite-3.1-2b-Reasoning Source: Original Platform
2026-04-20 01:08:55 +08:00
commit 0b16121197
13 changed files with 294702 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,163 @@
+---
+base_model: ibm-granite/granite-3.1-2b-instruct
+tags:
+- text-generation-inference
+- transformers
+- granite
+- trl
+- grpo
+- ruslanmv
+license: apache-2.0
+language:
+- en
+---
+
+# Granite-3.1-2B-Reasoning (Fine-tuned for Logical Reasoning)
+
+## Model Overview
+
+This model is a fine-tuned version of **ibm-granite/granite-3.1-2b-instruct**, tuned to improve performance on logical reasoning, structured problem-solving, and complex analytical tasks. The model tags (`trl`, `grpo`) suggest that training used the TRL library with GRPO.
+
+- **Developed by:** [ruslanmv](https://huggingface.co/ruslanmv)
+- **License:** Apache 2.0
+- **Base Model:** [ibm-granite/granite-3.1-2b-instruct](https://huggingface.co/ibm-granite/granite-3.1-2b-instruct)
+- **Fine-tuned for:** Logical reasoning, structured problem-solving, long-context tasks
+- **Supported Languages:** English  
+
+---
+
+## Model Summary  
+
+**Granite-3.1-2B-Reasoning** builds on IBM’s **Granite 3.1** language model series, which supports extended context lengths and strong multi-domain performance. This fine-tuned variant is tuned to handle complex reasoning tasks.
+
+### Improvements Over Base Model:
+✅ Improved **reasoning** and **problem-solving** skills  
+✅ Optimized for **instruction-following** and **logical deduction**  
+✅ Maintains the **efficiency and robustness** of Granite-3.1  
+
+---
+
+## Installation & Usage  
+
+Install the required dependencies:  
+
+```bash
+pip install torch torchvision torchaudio
+pip install accelerate
+pip install transformers
+pip install bitsandbytes  # required for the 4-bit loading used below
+```
+
+### Running the Model  
+
+Use the following Python snippet to load and generate text with the fine-tuned model:  
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
+import torch
+
+# Model and tokenizer
+model_name = "ruslanmv/granite-3.1-2b-Reasoning"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    device_map="auto",  # or "cuda" if you have only one GPU
+    torch_dtype=torch.float16,  # float16 for the layers that are not quantized
+    # 4-bit quantization for lower memory usage (requires bitsandbytes)
+    quantization_config=BitsAndBytesConfig(
+        load_in_4bit=True,
+        bnb_4bit_compute_dtype=torch.float16,
+    ),
+)
+
+# Build the prompt with the chat template
+SYSTEM_PROMPT = """
+Respond in the following format:
+<reasoning>
+...
+</reasoning>
+<answer>
+...
+</answer>
+"""
+text = tokenizer.apply_chat_template([
+    {"role" : "system", "content" : SYSTEM_PROMPT},
+    {"role" : "user", "content" : "Calculate pi."},
+], tokenize = False, add_generation_prompt = True)
+
+inputs = tokenizer(text, return_tensors="pt").to(model.device) # Move inputs to the model's device
+
+# Sampling parameters
+generation_config = GenerationConfig(
+    do_sample = True, # Required for temperature and top_p to take effect
+    temperature = 0.8,
+    top_p = 0.95,
+    max_new_tokens = 1024, # Maximum number of tokens to generate
+)
+
+# Inference
+with torch.inference_mode(): # Use inference mode for faster generation
+    outputs = model.generate(**inputs, generation_config=generation_config)
+
+# Decode only the newly generated tokens, skipping the prompt
+prompt_length = inputs["input_ids"].shape[1]
+output = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
+
+print(output)
+```
+
+Example output:
+```
+<reasoning>
+Pi is an irrational number, which means it cannot be precisely calculated using finite decimal or fractional notation. It is typically represented by the Greek letter π and its approximate value is 3.14159. However, for a more precise calculation, we can use mathematical algorithms like the Leibniz formula for π or the Gregory-Leibniz series.
+
+The Leibniz formula for π is:
+
+π = 4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - 1/15 +...)
+
+This series converges slowly, so many terms are needed for a good approximation. For instance, using 10 terms, the approximation would be:
+
+π ≈ 4 * (1 - 0.3333333333333333 + 0.1111111111111111 - 0.0344827586206897 + 0.0090040875518672 - 0.0025958422650073 + 0.0006929403729561 - 0.0001866279043531 + 0.0000499753694946 - 0.0000133386323746 + 0.0000035303398593 - 0.0000009009433996)
+
+π ≈ 3.141592653589793
+
+This is a rough approximation of π using 10 terms. For a more precise value, you can use more terms or employ other algorithms.
+
+</reasoning>
+
+<answer>
+π ≈ 3.141592653589793
+</answer>
+```
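+
+The sample output quotes the Leibniz series, so its arithmetic can be checked directly. The sketch below sums the first `n_terms` terms of that series; the function name `leibniz_pi` is introduced here for illustration and is not part of the model or its tooling.
+
+```python
+def leibniz_pi(n_terms: int) -> float:
+    """Approximate pi with the first n_terms of the Leibniz series."""
+    total = 0.0
+    for k in range(n_terms):
+        # k-th term: alternating sign over the k-th odd denominator
+        total += (-1) ** k / (2 * k + 1)
+    return 4 * total
+
+# Print the approximation for a few term counts to see how slowly it converges
+for n in (10, 1_000, 100_000):
+    print(n, leibniz_pi(n))
+```
+
+Ten terms give roughly 3.04, so the value `3.141592653589793` in the sample output does not follow from the ten-term sum it describes. It is worth verifying the model's numerical claims independently.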
+
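+Because the system prompt asks for `<reasoning>` and `<answer>` tags, the two parts of a response can be separated after generation. This is a minimal sketch, assuming the model follows the requested format; `parse_response` is a hypothetical helper, and it returns `None` for any tag the model omitted.
+
+```python
+import re
+
+def parse_response(output: str) -> dict:
+    """Split a response into its <reasoning> and <answer> parts."""
+    parts = {}
+    for tag in ("reasoning", "answer"):
+        # DOTALL lets "." match newlines, since both sections span several lines
+        match = re.search(rf"<{tag}>(.*?)</{tag}>", output, flags=re.DOTALL)
+        parts[tag] = match.group(1).strip() if match else None
+    return parts
+
+sample = "<reasoning>\n2 + 2 = 4\n</reasoning>\n<answer>\n4\n</answer>"
+print(parse_response(sample))
+```
+
+In practice, `parse_response(output)["answer"]` would be applied to the decoded `output` string from the generation snippet.
+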
+---
+
+## Intended Use  
+
+Granite-3.1-2B-Reasoning is designed for tasks requiring structured **reasoning**, including:  
+
+- **Logical and analytical problem-solving**  
+- **Text-based reasoning tasks**  
+- **Mathematical and symbolic reasoning**  
+- **Advanced instruction-following**  
+
+---
+
+## License & Acknowledgments  
+
+This model is released under the **Apache 2.0** license. It is fine-tuned from IBM’s **Granite 3.1-2B-Instruct** model. Special thanks to the **IBM Granite Team** for developing the base model.  
+
+For more details, visit the [IBM Granite organization page on Hugging Face](https://huggingface.co/ibm-granite).  
+
+---
+
+### Citation  
+
+If you use this model in your research or applications, please cite:  
+
+```bibtex
+@misc{ruslanmv2025granite,
+  title={Fine-Tuning Granite-3.1 for Advanced Reasoning},
+  author={Ruslan M.V.},
+  year={2025},
+  url={https://huggingface.co/ruslanmv/granite-3.1-2b-Reasoning}
+}
+```