---
base_model: unsloth/Qwen2.5-1.5B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- reasoning
license: apache-2.0
language:
- en
- ru
model-index:
- name: Noir-Mini
  results:
  - task:
      type: text-generation
      name: Mathematics
    dataset:
      type: gsm8k
      name: GSM8K
    metrics:
    - name: accuracy
      type: exact_match
      value: 54.0
  - task:
      type: text-generation
      name: General Intelligence
    dataset:
      type: mmlu_pro
      name: MMLU Pro
    metrics:
    - name: accuracy
      type: exact_match
      value: 16.0
---

# 💎 Noir-Mini (1.5B)

<div align="center">

[Noir Family](https://huggingface.co/collections/muverqqw/noir) | [Benchmarks](#-benchmark-results) | [Quickstart](#-quick-start)

</div>

**Noir-Mini** is the "sweet spot" of the Noir family. Built on the Qwen 2.5 (1.5B) architecture, it is a clear step up in logic and mathematical reasoning over sub-1B models.

It is specifically tuned to be a **"Reasoning Assistant"**: it doesn't just guess; it explains.

---

## 🌟 Why Noir-Mini?

While 0.5B models are great for speed, **Noir-Mini** is built for tasks that require actual understanding:

* 🧮 **Math Champion:** With a **54.0%** score on GSM8K in internal testing, it handles multi-step word problems well for a model of its size.
* 🧠 **Reasoning-First:** Noir-Mini often explains its logic before giving a final answer, which helps in real-world use where the "why" matters as much as the "what."
* 🎨 **High Creativity:** A creativity score of **72.3%** on the internal diversity eval reflects fluid, varied prose with fewer of the repetitive loops common in smaller models.
* 🚀 **Efficient Power:** Small enough to run on a phone or a 4 GB GPU, yet capable of following complex system prompts.
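
Because Noir-Mini tends to explain its reasoning before the final answer, a script that consumes its output usually has to pull the answer out of free-form text. The sketch below shows one possible approach for numeric answers. `extract_final_number` is a hypothetical helper (not part of the model or of `transformers`), and it assumes the last number in the reply is the final answer, which will not hold for every response.

```python
import re
from typing import Optional

# Hypothetical helper, not part of Noir-Mini or transformers.
# Assumption: the last number in the reply is the final answer.
# The lookbehind keeps "3-1" from being read as a negative number.
_NUMBER = re.compile(r"(?<![\d.])-?\d+(?:,\d{3})*(?:\.\d+)?")

def extract_final_number(reply: str) -> Optional[float]:
    """Return the last number found in a free-form reply, or None."""
    matches = _NUMBER.findall(reply)
    if not matches:
        return None
    # Drop thousands separators before converting.
    return float(matches[-1].replace(",", ""))

reply = "You start with 3 apples and give 1 away, leaving 2. Adding 2 oranges gives 4 fruits."
print(extract_final_number(reply))
```

For stricter pipelines, asking for a fixed answer marker in the system prompt and matching on that marker is likely more reliable than taking the last number.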

---

## 📊 Benchmark Results (Internal Test)

Scores come from a custom internal evaluation suite, run on 100-sample batches:

| Metric | Dataset | Score | Commentary |
| :--- | :--- | :---: | :--- |
| **Mathematics** | GSM8K | **54.0%** | 🏆 Strong for a 1.5B model. Solves multi-step word problems. |
| **Creativity** | Diversity Eval | **72.3%** | High vocabulary variety and natural flow. |
| **General Knowledge** | MMLU (STEM) | **16.0%** | College-level math and science questions; the model card metadata lists this dataset as MMLU Pro. |
| **Logic** | ARC (Challenge) | **7.0%**\* | \*The model tends to explain its reasoning, so answers can fail strict format checks. |
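
With 100-sample batches, each score carries a fairly wide margin of error. As a rough illustration, the sketch below computes an approximate 95% Wilson score interval. It assumes the 54.0% GSM8K figure corresponds to 54 correct answers out of a single 100-sample batch, which this card does not state explicitly; `wilson_interval` is an illustrative helper, not part of the evaluation suite.

```python
import math
from typing import Tuple

def wilson_interval(correct: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for an accuracy of correct/total."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

# Assumption: 54.0% on GSM8K means 54 correct out of 100 samples.
low, high = wilson_interval(54, 100)
print(f"GSM8K 54/100: {low:.1%} to {high:.1%}")
```

Under that assumption the interval spans roughly 44% to 63%, so small differences between models evaluated this way should be read with caution.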

---

## 🗂 Noir Family

| Model | Parameters | Role | Key Strength |
| :--- | :--- | :--- | :--- |
| **Noir-Lightning** | 0.5B | The Pocket Assistant | Ultra-fast, runs on anything |
| **Noir-Mini** | **1.5B** | **The Balanced Thinker** | **High speed with solid grammar** |
| **Noir-Standard** | 3B | The Versatile Workhorse | 65% GSM8K, perfect for 8GB VRAM |
| **Noir-Ultra** | 7B | The Reasoning Master | 91% SciQ & 84% Math |
| **Noir-Starlight** | 14B | The Galactic Intelligence | Deep logic & Expert-level STEM |

---

## 🛠 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "muverqqw/Noir-Mini"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

messages = [
    {"role": "system", "content": "You are Noir-Mini, a precise and creative AI."},
    {"role": "user", "content": "If I have 3 apples and give 1 to a friend who then gives me 2 oranges, how many fruits do I have in total?"}
]

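# Build the chat prompt; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"

# Recommended for Noir-Mini: temperature 0.4-0.6 for logic, 0.7+ for stories
gen_tokens = model.generate(**inputs, max_new_tokens=256, temperature=0.5, do_sample=True)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = gen_tokens[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))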
```
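
The temperature guidance in the Quick Start comment (0.4-0.6 for logic, 0.7+ for stories) can be captured in a small preset table. This is only a sketch: `SAMPLING_PRESETS` and `generation_kwargs` are hypothetical names, and the specific values 0.5 and 0.8 are picked from within the recommended ranges rather than published by the author.

```python
# Hypothetical presets; values are chosen from the ranges recommended
# in the Quick Start (0.4-0.6 for logic, 0.7+ for stories).
SAMPLING_PRESETS = {
    "logic": {"temperature": 0.5, "do_sample": True},
    "story": {"temperature": 0.8, "do_sample": True},
}

def generation_kwargs(task: str, max_new_tokens: int = 256) -> dict:
    """Return keyword arguments for model.generate() for a given task type."""
    if task not in SAMPLING_PRESETS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(SAMPLING_PRESETS)}")
    return {"max_new_tokens": max_new_tokens, **SAMPLING_PRESETS[task]}

print(generation_kwargs("logic"))
```

The returned dictionary can be unpacked into `model.generate(...)` alongside the prompt tensors from the Quick Start, for example with `**generation_kwargs("story")`.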

---

## ⚙️ Technical Specifications
* **Architecture:** Qwen 2.5 (1.5B)

* **Training Context:** 32k tokens.

* **Specialty:** Logic-heavy instructions and bilingual (EN/RU) support.
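
To put the "phone or 4 GB GPU" claim in context, the sketch below estimates weights-only memory at a few precisions. It assumes a nominal 1.5 billion parameters (the exact count is not stated in this card) and ignores the KV cache, activations, and framework overhead, all of which add to real usage. `weight_memory_gib` is an illustrative helper.

```python
# Rough weights-only memory estimate. Assumptions: a nominal 1.5e9 parameters
# and no KV cache, activations, or framework overhead.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(1.5e9, dtype):.2f} GiB")
```

Under these assumptions the 16-bit weights come to about 2.8 GiB, which fits a 4 GB GPU with limited headroom; 8-bit or 4-bit quantization would leave more room for long contexts.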

---

## 👤 About the Developer
* **Creator:** IceL1ghtning

* **Release Year:** 2025

* **License:** Apache 2.0

<div align="center">
<sub>Small size. Big brain. Noir-Mini.</sub>
</div>

Project initialized; model provided by the ModelHub XC community. Model: muverqqw/Noir-mini. Source: Original Platform. 2026-04-22 11:24:05 +08:00