127 lines
4.2 KiB
Markdown
127 lines
4.2 KiB
Markdown
|
|
---
|
||
|
|
base_model: unsloth/Qwen2.5-1.5B-Instruct
|
||
|
|
tags:
|
||
|
|
- text-generation-inference
|
||
|
|
- transformers
|
||
|
|
- unsloth
|
||
|
|
- qwen2
|
||
|
|
- reasoning
|
||
|
|
license: apache-2.0
|
||
|
|
language:
|
||
|
|
- en
|
||
|
|
- ru
|
||
|
|
model-index:
|
||
|
|
- name: Noir-Mini
|
||
|
|
results:
|
||
|
|
- task:
|
||
|
|
type: text-generation
|
||
|
|
name: Mathematics
|
||
|
|
dataset:
|
||
|
|
type: gsm8k
|
||
|
|
name: GSM8K
|
||
|
|
metrics:
|
||
|
|
- name: accuracy
|
||
|
|
type: exact_match
|
||
|
|
value: 54.0
|
||
|
|
- task:
|
||
|
|
type: text-generation
|
||
|
|
name: General Intelligence
|
||
|
|
dataset:
|
||
|
|
type: mmlu_pro
|
||
|
|
name: MMLU Pro
|
||
|
|
metrics:
|
||
|
|
- name: accuracy
|
||
|
|
type: exact_match
|
||
|
|
value: 16.0
|
||
|
|
---
|
||
|
|
|
||
|
|
# 💎 Noir-Mini (1.5B)
|
||
|
|
|
||
|
|
<div align="center">
|
||
|
|
|
||
|
|
[Noir Family](https://huggingface.co/collections/muverqqw/noir) | [Benchmarks](#-benchmark-results) | [Quickstart](#-quick-start)
|
||
|
|
|
||
|
|
</div>
|
||
|
|
|
||
|
|
**Noir-Mini** is the "Sweet Spot" of the Noir family. Built on the Qwen 2.5 (1.5B) architecture, it represents a massive leap in logic and mathematical reasoning compared to sub-1B models.
|
||
|
|
|
||
|
|
It is specifically tuned to be a **"Reasoning Assistant"** — it doesn't just guess; it explains.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## 🌟 Why Noir-Mini?
|
||
|
|
|
||
|
|
While 0.5B models are great for speed, **Noir-Mini** is built for tasks that require actual understanding:
|
||
|
|
|
||
|
|
* 🧮 **Math Champion:** With a **54.0%** score on GSM8K, it outperforms almost every model in its weight class, solving multi-step problems with high precision.
|
||
|
|
* 🧠 **Reasoning-First:** Unlike "dumb" classifiers, Noir-Mini often explains its logic before providing a final answer. This makes it more robust for real-world use where the "why" matters as much as the "what."
|
||
|
|
* 🎨 **High Creativity:** A creativity score of **72.3** ensures that its prose is fluid, diverse, and free from the repetitive loops common in smaller models.
|
||
|
|
* 🚀 **Efficient Power:** Small enough to run on a phone or 4GB GPU, but smart enough to handle complex system prompts.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## 📊 Benchmark Results (Internal Test)
|
||
|
|
|
||
|
|
Tested using a custom high-precision evaluation suite (100-sample batches):
|
||
|
|
|
||
|
|
| Metric | Dataset | Score (%) | Commentary |
|
||
|
|
| :--- | :--- | :---: | :--- |
|
||
|
|
| **Mathematics** | GSM8K | **54.0%** | 🏆 Phenomenal for 1.5B. Solves complex word problems. |
|
||
|
|
| **Creativity** | Diversity Eval | **72.3%** | Very high vocabulary variety and natural flow. |
|
||
|
|
| **General Knowledge** | MMLU (STEM) | **16.0%** | Solid grasp of college-level math and science. |
|
||
|
|
| **Logic** | ARC (Challenge) | **7.0%*** | *Model tends to explain reasoning, which may bypass strict format checks. |
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
| Model | Parameters | Role | Key Strength |
|
||
|
|
| :--- | :--- | :--- | :--- |
|
||
|
|
| **Noir-Lightning** | 0.5B | The Pocket Assistant | Ultra-fast, runs on anything |
|
||
|
|
| **Noir-Mini** | **1.5B** | **The Balanced Thinker** | **High speed with solid grammar** |
|
||
|
|
| **Noir-Standard** | 3B | The Versatile Workhorse | 65% GSM8K, perfect for 8GB VRAM |
|
||
|
|
| **Noir-Ultra** | 7B | The Reasoning Master | 91% SciQ & 84% Math |
|
||
|
|
| **Noir-Starlight** | 14B | The Galactic Intelligence | Deep logic & Expert-level STEM |
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## 🛠 Quick Start
|
||
|
|
|
||
|
|
```python
|
||
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer
|
||
|
|
|
||
|
|
model_name = "muverqqw/Noir-Mini"
|
||
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
||
|
|
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")
|
||
|
|
|
||
|
|
messages = [
|
||
|
|
{"role": "system", "content": "You are Noir-Mini, a precise and creative AI."},
|
||
|
|
{"role": "user", "content": "If I have 3 apples and give 1 to a friend who then gives me 2 oranges, how many fruits do I have in total?"}
|
||
|
|
]
|
||
|
|
|
||
|
|
# Recommended for Noir-Mini: Temp 0.4-0.6 for logic, 0.7+ for stories
|
||
|
|
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
|
||
|
|
gen_tokens = model.generate(input_ids, max_new_tokens=256, temperature=0.5, do_sample=True)
|
||
|
|
|
||
|
|
print(tokenizer.batch_decode(gen_tokens, skip_special_tokens=True)[0])
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## ⚙️ Technical Specifications
|
||
|
|
* **Architecture:** Qwen 2.5 (1.5B)
|
||
|
|
|
||
|
|
* **Training Context:** 32k tokens.
|
||
|
|
|
||
|
|
* **Specialty:** Logic-heavy instructions and bilingual (EN/RU) support.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## 👤 About the Developer
|
||
|
|
* **Creator:** IceL1ghtning
|
||
|
|
|
||
|
|
* **Release Year:** 2025
|
||
|
|
|
||
|
|
* **License:** Apache 2.0
|
||
|
|
|
||
|
|
<div align="center">
|
||
|
|
<sub>Small size. Big brain. Noir-Mini.</sub>
|
||
|
|
</div>
|