Initialize project; model provided by the ModelHub XC community

Model: ilkayO/Karga-2B-Thinking
Source: Original Platform
ModelHub XC
2026-05-13 17:58:21 +08:00
commit 3fb80c95e0
9 changed files with 253593 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

98
README.md Normal file

@@ -0,0 +1,98 @@
---
language:
- tr
pipeline_tag: text-generation
tags:
- thinking
- chain-of-thought
- slm
- turkish
- edge-ai
license: apache-2.0
widget:
- text: "Bir tarlada 5 koyun ve 3 tavuk vardır. Toplam kaç ayak vardır? Adım adım hesapla."
  example_title: Matematik (Math)
- text: "İki sayıyı toplayan 'sum_two' adında bir Python fonksiyonu yazın."
  example_title: Kodlama (Python)
- text: "Eğer bugün salı ise, 15 gün sonra hangi gün olur? Lütfen mantığını açıkla."
  example_title: Mantık (Logic)
base_model:
- vngrs-ai/Kumru-2B
---
<div align="center">
<img src="https://huggingface.co/ilkayO/Karga-2B-Thinking/resolve/main/karga.png" width="180"/>
<h1>Karga-2B-Thinking 🐦‍⬛</h1>
<p><b>Turkish SLM with Chain-of-Thought reasoning · 2 billion parameters · Edge-friendly</b></p>
<p><i>A fine-tune of <a href="https://huggingface.co/vngrs-ai/Kumru-2B">Kumru-2B</a> that thinks out loud before answering.</i></p>
[![Hugging Face Spaces](https://img.shields.io/badge/🤗%20Hugging%20Face-Live%20Demo-blue)](https://huggingface.co/spaces/ilkayO/Karga-Thinking-Demo)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ilkay-onay/karga-reasoning/blob/main/notebooks/Karga_Quickstart.ipynb)
[![GitHub](https://img.shields.io/badge/GitHub-Source%20Code-black?logo=github)](https://github.com/ilkay-onay/karga-reasoning)
[![Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-green)](https://opensource.org/licenses/Apache-2.0)
</div>
**Karga-2B-Thinking** is a fine-tune of the `vngrs-ai/Kumru-2B` base model. Just as crows (*karga* in Turkish) are known for their problem-solving skills and tool use, this model is designed to bring **Chain-of-Thought (CoT)** reasoning to a 2-billion-parameter Small Language Model (SLM) for the Turkish language.
By generating a `<think> ... </think>` block before answering, the model plans its output step by step and reduces hallucinations, making it well suited to mathematics, logic puzzles, and code generation on edge devices.
> ⚠️ **Academic Pre-Publication Notice**
> *This model serves as the official checkpoint for an ongoing academic research project. While the model weights are fully open-source (Apache 2.0), the proprietary synthetic dataset and the novel **"Deterministic Tensor Injection Agent"** training/inference architecture are temporarily withheld pending double-blind peer review. Full resources will be released upon publication.*
## 🚀 Model Details
- **Architecture:** MistralForCausalLM (Kumru-2B)
- **Task:** Causal Language Modeling with CoT
- **Parameters:** 2 Billion (Optimized for Edge AI)
- **Language:** Turkish
- **License:** Apache 2.0 (Commercially friendly)
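As a rough cross-check of the parameter figure, the count can be estimated from the architecture hyperparameters. The sketch below assumes the standard Mistral layout (bias-free attention and MLP projections, two RMSNorm weight vectors per layer, one final norm) and uses values copied from this repository's `config.json`. The function name `count_mistral_params` is illustrative only and not part of any library.

```python
def count_mistral_params(
    vocab_size: int,
    hidden_size: int,
    intermediate_size: int,
    num_layers: int,
    num_heads: int,
    num_kv_heads: int,
    head_dim: int,
    tie_word_embeddings: bool,
) -> int:
    """Estimate the parameter count of a Mistral-style decoder (assumes no biases)."""
    q_out = num_heads * head_dim      # width of the query projection
    kv_out = num_kv_heads * head_dim  # width of each key/value projection (GQA)
    attention = (
        hidden_size * q_out           # q_proj
        + 2 * hidden_size * kv_out    # k_proj and v_proj
        + q_out * hidden_size         # o_proj
    )
    mlp = 3 * hidden_size * intermediate_size  # gate_proj, up_proj, down_proj
    norms = 2 * hidden_size                    # two RMSNorm weights per layer
    per_layer = attention + mlp + norms

    embeddings = vocab_size * hidden_size
    # The output head has its own matrix when embeddings are not tied.
    lm_head = 0 if tie_word_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return num_layers * per_layer + embeddings + lm_head + final_norm


# Values copied from config.json in this repository.
total = count_mistral_params(
    vocab_size=50176,
    hidden_size=3072,
    intermediate_size=10752,
    num_layers=18,
    num_heads=16,
    num_kv_heads=4,
    head_dim=128,
    tie_word_embeddings=False,
)
print(f"{total:,} parameters")
print(f"{total * 2:,} bytes at 2 bytes per weight (bfloat16)")
```

Under these assumptions the estimate is about 2.38 billion parameters, or roughly 4.75 GB in bfloat16, which is close to the 4,750,295,696-byte size recorded for `model.safetensors`.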
## 📊 Performance & Training
Large Language Models often struggle with complex logic in low-resource languages. To address this, the model was trained with **QLoRA** via **Unsloth** on a custom-translated synthetic dataset generated with vLLM pipelines.
On a held-out benchmark of 654 complex questions, fine-tuning changed the scores as follows:
- **Mathematics:** roughly 10x increase (0.49% ➔ 4.93%)
- **Python Coding:** +11.47 percentage points (86.89% ➔ 98.36%)
- **Overall Average:** +2.60 percentage points (21.71% ➔ 24.31%)
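The gains above can be read either as absolute differences (percentage points) or as relative ratios, and the two readings differ a lot when the starting score is small. A minimal sketch that recomputes both from the reported before/after scores. The helper `describe_change` is a hypothetical name used only for illustration.

```python
def describe_change(before: float, after: float) -> tuple[float, float]:
    """Return (absolute change in percentage points, ratio after / before)."""
    return after - before, after / before


# Before/after scores as reported in the list above.
reported = {
    "Mathematics": (0.49, 4.93),
    "Python Coding": (86.89, 98.36),
    "Overall Average": (21.71, 24.31),
}

for name, (before, after) in reported.items():
    points, ratio = describe_change(before, after)
    print(f"{name}: {points:+.2f} percentage points, {ratio:.2f}x")
```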
## 💻 Quick Usage
You can easily integrate this model into your Python projects. The model uses a specific chat template and outputs its reasoning inside `<think>` tags.
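```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ilkayO/Karga-2B-Thinking"

# Load the tokenizer and bfloat16 weights; device_map="auto" requires `accelerate`.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# "Aylin's age is twice Burak's age. If Burak is 12, what is the sum of their ages?"
prompt = "Aylin'in yaşı, Burak'ın yaşının iki katıdır. Burak 12 yaşında ise, ikisinin yaşları toplamı kaçtır?"
messages = [
    # "Your name is Karga. Answer questions logically, thinking step by step."
    {"role": "system", "content": "Adın Karga. Soruları mantıklı ve adım adım düşünerek yanıtla."},
    {"role": "user", "content": prompt},
]

# Render the chat template; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,  # without sampling, temperature and top_p are ignored
        temperature=0.6,
        top_p=0.9,
        repetition_penalty=1.1,
    )

# Decode only the newly generated tokens, not the prompt.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```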
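Because the chat template shipped with the model ends the generation prompt with an opening `<think>` tag, the decoded response is expected to begin with the reasoning and to contain a closing `</think>` before the final answer. A minimal sketch for separating the two, assuming the closing tag survives decoding as literal text. `split_reasoning` is a hypothetical helper, not part of the model or of `transformers`.

```python
def split_reasoning(response: str) -> tuple[str, str]:
    """Split a decoded response into (reasoning, answer).

    Assumes the reasoning ends at the first literal "</think>" tag. If the tag
    is missing (for example, generation stopped at max_new_tokens while the
    model was still reasoning), all text is treated as reasoning and the
    answer is empty.
    """
    text = response.strip()
    # The prompt already supplies the opening tag; drop it if it is echoed.
    if text.startswith("<think>"):
        text = text[len("<think>"):]
    reasoning, tag, answer = text.partition("</think>")
    if not tag:
        return text.strip(), ""
    return reasoning.strip(), answer.strip()


# Illustrative input only, not real model output.
reasoning, answer = split_reasoning("Burak 12, Aylin 24.\n</think>\n36")
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Returning an empty answer when the tag is missing makes truncated generations easy to detect, so the caller can retry with a larger `max_new_tokens`.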
## 🤝 Commercial Integration & Consulting
This model is open-sourced under the **Apache 2.0** license, meaning you are free to use, modify, and integrate it into your commercial products.
If your company is looking to integrate advanced NLP systems, build Agentic AI workflows, deploy Edge AI models, or if you are interested in having me join your AI team, feel free to reach out!
📧 **Contact:** ilkayonay2001@gmail.com | [LinkedIn](https://linkedin.com/in/ilkay-onay-391905254)

11
chat_template.jinja Normal file

@@ -0,0 +1,11 @@
{{ bos_token }}
{% set default_system = "Adın Karga. Soruları mantıklı ve adım adım düşünerek yanıtla." %}
{% if messages[0]['role'] != 'system' %}
{{ '<|start_header_id|>system<|end_header_id|>\n\n' + default_system + '<|eot_id|>' }}
{% endif %}
{% for message in messages %}
{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' }}
{% endfor %}
{% if add_generation_prompt %}
{{ '<|start_header_id|>assistant<|end_header_id|>\n\n<think>\n' }}
{% endif %}

28
config.json Normal file

@@ -0,0 +1,28 @@
{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 2,
  "torch_dtype": "bfloat16",
  "eos_token_id": 3,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 3072,
  "initializer_range": 0.02,
  "intermediate_size": 10752,
  "max_position_embeddings": 8192,
  "model_type": "mistral",
  "num_attention_heads": 16,
  "num_hidden_layers": 18,
  "num_key_value_heads": 4,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 500000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "unsloth_version": "2026.4.4",
  "use_cache": true,
  "vocab_size": 50176
}

BIN
karga.png Normal file

Binary file not shown.

Size: 29 KiB

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5083fd82219ed3ee89dd70a2d5215bb3aed74c50e911c11996710bbd37d7bb98
size 4750295696

30
special_tokens_map.json Normal file

@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<BOS>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<EOS>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<PAD>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<UNK>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}

251278
tokenizer.json Normal file

File diff suppressed because it is too large

2110
tokenizer_config.json Normal file

File diff suppressed because it is too large