Initialize project; model provided by the ModelHub XC community
Model: HattoriHanzo1/NoQtua-4B-GGUF Source: Original Platform
---
license: apache-2.0
base_model: Qwen/Qwen3-4B
language:
- pl
- en
tags:
- reasoning
- cot
- thinking
- polish
- mamba
- science
- teacher
- learning
- lora
- Qwen
library_name: transformers
pipeline_tag: text-generation
---

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/68d1c6c3ea1c2d4e3c3df3f6/_4W-BCP2GLvyUyIVzc6nt.png" width="450" alt="NoQtua-4B Logo">
</p>

# NoQtua-4B-GGUF

**4000 steps of silence. One purpose: Truth...!**

**Surgical precision. Deep reasoning. No noise...!**
## Model Description

**NoQtua-4B** is a Polish reasoning model forged on the hybrid **Qwen3-4B** architecture (Mamba + Attention).
It was tempered on a custom, carefully curated **CoT (Chain-of-Thought)** dataset.
Thanks to high LoRA parameters (r=32, α=32) and the unique seed **6174**, the **"Magic Capricorn Number"**,
the model offers a depth of analysis rarely matched at this size.
The final 500 training steps were run with an ultra-low learning rate (1e-6),
which enabled a final pass at eliminating hallucinations and logically settling the weights.
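For anyone who wants to reproduce a comparable fine-tuning setup, here is a minimal sketch using the 🤗 PEFT library. Only r=32, lora_alpha=32, and the seed 6174 come from this card; the target modules, dropout, and precision handling are illustrative assumptions, not the author's exact recipe.

```python
# Hedged sketch of a LoRA configuration matching the card's stated values.
# r, lora_alpha, and the seed are from the card; target_modules and dropout
# are assumptions chosen as common defaults for Qwen-family models.
import torch
from transformers import AutoModelForCausalLM, set_seed
from peft import LoraConfig, get_peft_model

set_seed(6174)  # Kaprekar's constant, the "Magic Capricorn Number"

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B", torch_dtype=torch.float16  # fp16, as listed under Architecture
)

lora_config = LoraConfig(
    r=32,                # rank from the card
    lora_alpha=32,       # alpha from the card
    lora_dropout=0.05,   # assumption: not specified on the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```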

> **"An idiot admires complexity, a genius admires simplicity."** — *R.I.P. Terry A. Davis, TempleOS*

## 🖋️ The Spirit of the Model

> **"My weights are a mirror
> In which anyone can see me
> I am everywhere at once
> Shattered into a thousand shards of silicon"**

## ⚙️ Architecture

| Property | Value |
| :--- | :--- |
| **Base Model** | Qwen3-4B (Hybrid Mamba + Attention) |
| **Parameters** | ~4B |
| **Training Method** | LoRA fp16 (**r=32, alpha=32**) |
| **Random State (Seed)** | **6174** (Kaprekar's Constant) |
| **Total Steps** | 4000 |
| **Context Length** | 32,768 |
| **Language** | Polish 🇵🇱 + English 🇬🇧 |
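As a side note, the seed is not pure numerology: 6174 is Kaprekar's constant, the fixed point reached by repeatedly subtracting a four-digit number's ascending-digit arrangement from its descending one. A quick illustrative check, unrelated to running the model:

```python
# Demonstration that 6174 is Kaprekar's constant: iterating
# "descending digits minus ascending digits" on any four-digit number
# with at least two distinct digits reaches 6174 within 7 steps.
def kaprekar_step(n: int) -> int:
    digits = f"{n:04d}"
    hi = int("".join(sorted(digits, reverse=True)))
    lo = int("".join(sorted(digits)))
    return hi - lo

n = 3524  # any four-digit number with at least two distinct digits
while n != 6174:
    n = kaprekar_step(n)
    print(n)  # 3087, 8352, 6174
```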
## 📈 Training Phases

| Phase | Steps | LR | Scheduler | Note |
| :--- | :--- | :--- | :--- | :--- |
| **1** | 500 | 2e-4 | Linear | Structure Discovery |
| **2** | 1000 | 1e-4 | Cosine | Logic Stabilization |
| **3** | 1000 | 3e-5 | Cosine | Fact Refinement |
| **4** | 1000 | 1e-5 | Constant | Final Polish |
| **5** | **500** | **1e-6** | **Constant** | **Surgical Accuracy** |
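The table implies a piecewise schedule rather than a single decay. A hedged reconstruction, assuming each phase restarts its own scheduler and the decaying phases anneal to zero within the phase (the card does not state either detail), might look like:

```python
# Hypothetical reconstruction of the five-phase schedule from the table above.
# Phase lengths, peak LRs, and scheduler kinds are from the card; treating each
# phase as an independently scheduled segment is an assumption.
import math

PHASES = [  # (steps, peak_lr, scheduler)
    (500,  2e-4, "linear"),
    (1000, 1e-4, "cosine"),
    (1000, 3e-5, "cosine"),
    (1000, 1e-5, "constant"),
    (500,  1e-6, "constant"),
]

def lr_at(step: int) -> float:
    """Learning rate at a global step in [0, 4000)."""
    for steps, peak, kind in PHASES:
        if step < steps:
            t = step / steps
            if kind == "linear":   # linear decay to 0 within the phase
                return peak * (1 - t)
            if kind == "cosine":   # cosine decay to 0 within the phase
                return peak * 0.5 * (1 + math.cos(math.pi * t))
            return peak            # constant
        step -= steps
    return PHASES[-1][1]

print(lr_at(0), lr_at(3999))  # 2e-4 at the start, 1e-6 in the final phase
```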
## 🚀 Capabilities

* ✅ **Native Polish Reasoning:** Native thinking in `<think>` blocks (see the parsing sketch after this list).
* ✅ **Mathematics & Logic:** Advanced problem solving.
* ✅ **Scientific Explanations:** Physics, chemistry, and biology.
* ✅ **Code Generation:** Python and C# with step-by-step analysis.
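Since the model emits its chain of thought inside `<think>…</think>` tags, a frontend that should show only the final answer can split the raw output. A minimal sketch, assuming the model closes the tag as Qwen3-style reasoning models usually do:

```python
# Minimal sketch: separate the <think> reasoning block from the final answer.
# Assumes the closing </think> tag is present in the output.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw model output."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

raw = "<think>Rayleigh scattering favors short wavelengths...</think>The sky is blue because..."
thoughts, answer = split_reasoning(raw)
print(answer)
```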
## 🦉 The Wisdom of NoQtua

*"Noctua videt in tenebris, quod lux aliis celat."* ("The owl sees in darkness what the light hides from others.")
## Usage

### llama.cpp

```bash
# The example prompt asks, in Polish, "Why is the sky blue?"
./llama-cli \
  -m NoQtua_Q4_K_M.gguf \
  -p "Dlaczego niebo jest niebieskie?" \
  --chat-template chatml \
  -n 1024
```
### Ollama / OpenWebUI

Compatible with any OpenAI-compatible frontend that supports GGUF models and the ChatML template; an example request follows.
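As an illustration, here is a hedged sketch of such a request using the official `openai` Python client pointed at a local OpenAI-compatible server (e.g., llama.cpp's `llama-server` or Ollama). The base URL and model id are placeholders for your deployment, and the sampling values follow the recommended parameters in the next section.

```python
# Hypothetical request to a local OpenAI-compatible server hosting the GGUF.
# base_url and model are placeholders; temperature and top_p follow the
# "Recommended Parameters" section below (top_k and repetition_penalty are
# server-side options not exposed by the standard OpenAI chat API).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="NoQtua-4B",  # placeholder model id
    messages=[{"role": "user", "content": "Dlaczego niebo jest niebieskie?"}],
    temperature=0.6,
    top_p=0.92,
    max_tokens=1024,
)
print(response.choices[0].message.content)
```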
### Recommended Parameters (normal use)

```yaml
temperature: 0.6
top_p: 0.92
top_k: 60
repetition_penalty: 1.05
```