---
license: apache-2.0
language:
- ka
- en
- multilingual
library_name: transformers
pipeline_tag: text-generation
tags:
- llm
- georgian
- instruct
- chat
- function-calling
- conversational
datasets:
- tbilisi-ai-lab/kona-sft-mix-2.6M
- tbilisi-ai-lab/kona-sft-function-calling-115k
- tbilisi-ai-lab/kona-sft-function-calling-ka-93k
base_model:
- tbilisi-ai-lab/kona2-12B-Base
---

# Kona2-12B-Instruct

**Kona2-12B-Instruct** is a 12-billion parameter instruction-tuned language model for Georgian and English. Built on [Kona2-12B-Base](https://huggingface.co/tbilisi-ai-lab/kona2-12B-Base) through supervised fine-tuning (SFT), it excels at chat, question answering, and function calling.

## Model Summary

| Property | Value |
|----------|-------|
| Parameters | 12B |
| Architecture | Mistral (Transformer) |
| Context Length | 32K tokens |
| Languages | Georgian (ka), English (en), other (limited) |
| Training | Supervised Fine-Tuning (SFT) |
| Training Examples | ~2.8M instructions |
| Function Calling | Yes (Hermes format) |
| Base Model | [kona2-12B-Base](https://huggingface.co/tbilisi-ai-lab/kona2-12B-Base) |

## Intended Uses

### Primary Use Cases

- Conversational AI assistants (Georgian/English)
- Question answering and information retrieval
- Function/tool calling applications
- **Translation between Georgian and English** (especially strong)
- Code generation and explanation
- Educational and tutoring applications

## Training

### Training Data

| Dataset | Examples | Description |
|---------|----------|-------------|
| [kona-sft-mix-2.6M](https://huggingface.co/datasets/tbilisi-ai-lab/kona-sft-mix-2.6M) | 2,606,173 | Mixed instruction dataset (KA/EN) |
| [kona-sft-function-calling-115k](https://huggingface.co/datasets/tbilisi-ai-lab/kona-sft-function-calling-115k) | ~115K | Function calling (English) |
| [kona-sft-function-calling-ka-93k](https://huggingface.co/datasets/tbilisi-ai-lab/kona-sft-function-calling-ka-93k) | ~93K | Function calling (Georgian) |

**Data
Sources Include:**
- Wikipedia Q&A (RAFT-generated)
- Orca-style reasoning
- Self-instruct (Alpaca-style)
- Translation pairs (EN-KA)
- Code instructions
- Math instructions
- PersonaHub reasoning
- Glaive & Hermes function calling

### Training Procedure

- **Method:** Supervised Fine-Tuning (SFT)
- **LoRA Config:** r=256, alpha=512
- **Learning Rate:** 3e-5
- **Epochs:** 2
- **Training Context:** 32K tokens
- **Packing:** Enabled
- **Precision:** BF16
- **Infrastructure:** DeepSpeed ZeRO-2

## Usage

### Installation

```bash
pip install transformers torch accelerate
```

### Chat Completion

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; device_map="auto" needs the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    "tbilisi-ai-lab/kona2-12B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("tbilisi-ai-lab/kona2-12B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "რა არის საქართველოს დედაქალაქი?"}
]

# return_dict=True yields input_ids together with the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True
).to(model.device)

# temperature only takes effect when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

### Function Calling (Hermes Format)

The model's chat template formats tool definitions and tool calls. See the Jinja template in `tokenizer_config.json` for the exact layout.
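
On the application side, the tool call usually has to be pulled out of the generated text before it can be executed. The following is a minimal sketch, not part of the model's tooling: it assumes the decoded output contains a JSON object with `name` and `arguments` keys, as in the example in this section, possibly surrounded by other text or wrapper tags. `extract_tool_calls`, `TOOL_REGISTRY`, and the stub `get_weather` are hypothetical names introduced only for illustration.

```python
import json


def extract_tool_calls(text: str) -> list[dict]:
    """Return every JSON object in `text` that has "name" and "arguments" keys."""
    decoder = json.JSONDecoder()
    calls = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it with the index just past its end.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here, try the next brace
            continue
        if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
            calls.append(obj)
        pos = end
    return calls


def get_weather(location: str) -> str:
    # Hypothetical stand-in for a real weather lookup.
    return f"(stub) weather requested for {location}"


# Hypothetical mapping from tool names to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}

# Hand-written sample text, not actual model output.
sample = 'Calling a tool: {"name": "get_weather", "arguments": {"location": "Tbilisi"}}'
for call in extract_tool_calls(sample):
    print(TOOL_REGISTRY[call["name"]](**call["arguments"]))
```

Scanning for JSON objects rather than matching a fixed wrapper keeps the sketch independent of the exact tags the template emits; a production parser would more likely follow the template's format strictly and validate arguments against the tool schema.
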
```python
# Assumes `model` and `tokenizer` are loaded as in the Chat Completion example.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"]
            }
        }
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant with access to tools."},
    {"role": "user", "content": "What's the weather in Tbilisi?"}
]

# Passing tools= lets the chat template render the tool definitions into the prompt.
inputs = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
# Output will include {"name": "get_weather", "arguments": {"location": "Tbilisi"}}
```

## Related Models

| Model | Description |
|-------|-------------|
| [kona2-12B-Base](https://huggingface.co/tbilisi-ai-lab/kona2-12B-Base) | Pre-trained base model |
| [kona2-12B](https://huggingface.co/tbilisi-ai-lab/kona2-12B) | DPO-aligned version (recommended) |
| [kona2-small-3.8B](https://huggingface.co/tbilisi-ai-lab/kona2-small-3.8B) | Smaller 3.8B model |

## Limitations

- Training data cutoff: 2024

## Technical Specifications

- **Precision:** BF16/FP16 supported
- **Minimum VRAM:** 24GB (with 4-bit quantization)
- **Recommended:** 48GB+ for full precision

## Citation

```bibtex
@misc{tbilisi2025kona2instruct,
  title = {Kona2-12B-Instruct: A Georgian Instruction-Tuned Language Model},
  author = {Tbilisi AI Lab Team},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/tbilisi-ai-lab/kona2-12B-Instruct}}
}
```

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Contact

- **Organization:** [Tbilisi AI Lab](https://huggingface.co/tbilisi-ai-lab)
- **Website:** [ailab.ge](https://ailab.ge)
- **Chat:** [chat.ailab.ge](https://chat.ailab.ge)
- **API:** [api.ailab.ge](https://api.ailab.ge)