---
license: apache-2.0
base_model: LocoreMind/LocoTrainer-4B
library_name: llama.cpp
tags:
- llama-cpp
- gguf-quantized
- text-generation
- reasoning
---

# LocoTrainer-4B GGUF

This repository contains GGUF-format quantizations of **LocoTrainer-4B**, a distilled reasoning model based on the Qwen architecture.

## Model Description

- **Base Model:** [LocoreMind/LocoTrainer-4B](https://huggingface.co/LocoreMind/LocoTrainer-4B)
- **Format:** GGUF (compatible with `llama.cpp` and its ecosystem)

## Available Quants

| File | Size | Description |
| :--- | :--- | :--- |
| **LocoTrainer-4B-Q4_K_M.gguf** | ~2.5 GB | 4-bit medium. Best balance of speed and quality for general use. |
| **LocoTrainer-4B-Q5_K_M.gguf** | ~2.8 GB | 5-bit medium. Higher quality with a slightly larger footprint. |
| **LocoTrainer-4B-Q6_K.gguf** | ~3.2 GB | 6-bit. Near-lossless quantization. |
| **LocoTrainer-4B-Q8_0.gguf** | ~4.2 GB | 8-bit. Maximum fidelity; quality is virtually indistinguishable from the original weights. |

## Usage

### llama.cpp

You can run these models using `llama-cli`:

```bash
./llama-cli -m LocoTrainer-4B-Q4_K_M.gguf -p "What is the capital of France?" -n 128
```
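
As a sketch, the same quants can also be served over `llama.cpp`'s OpenAI-compatible HTTP API with `llama-server`. The context size and port below are illustrative values, not requirements of this model:

```bash
# Start an OpenAI-compatible server (context size and port are illustrative)
./llama-server -m LocoTrainer-4B-Q4_K_M.gguf -c 4096 --port 8080

# From another terminal, query the chat completions endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is the capital of France?"}], "max_tokens": 128}'
```

Serving over HTTP lets multiple clients share one loaded model instead of reloading the weights per invocation.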