
---
license: apache-2.0
base_model: LocoreMind/LocoTrainer-4B
library_name: llama.cpp
tags:
- llama-cpp
- gguf-quantized
- text-generation
- reasoning
---
# LocoTrainer-4B GGUF
This repository contains GGUF format quantizations of **LocoTrainer-4B**, a distilled reasoning model based on the Qwen architecture.
## Model Description
- **Base Model:** [LocoreMind/LocoTrainer-4B](https://huggingface.co/LocoreMind/LocoTrainer-4B)
- **Format:** GGUF (compatible with `llama.cpp` and ecosystem)
## Available Quants
| File | Size | Description |
| :--- | :--- | :--- |
| **LocoTrainer-4B-Q4_K_M.gguf** | ~2.5 GB | 4-bit Medium. Best balance of speed and quality for general use. |
| **LocoTrainer-4B-Q5_K_M.gguf** | ~2.8 GB | 5-bit Medium. High quality with a slightly larger footprint. |
| **LocoTrainer-4B-Q6_K.gguf** | ~3.2 GB | 6-bit. Near-lossless quantization. |
| **LocoTrainer-4B-Q8_0.gguf** | ~4.2 GB | 8-bit. Maximum fidelity; output quality is virtually indistinguishable from the original FP16 weights. |
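To fetch a single quant without cloning the whole repository, you can use the `huggingface-cli` tool from `huggingface_hub` (the repository ID below is assumed from this repo's name; adjust if it differs):

```bash
# Download only the Q4_K_M quant into the current directory
# (repo ID assumed to be LocoreMind/LocoTrainer-4B-GGUF)
huggingface-cli download LocoreMind/LocoTrainer-4B-GGUF \
  LocoTrainer-4B-Q4_K_M.gguf --local-dir .
```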
## Usage
### llama.cpp
You can run these models directly with the `llama-cli` binary:
```bash
./llama-cli -m LocoTrainer-4B-Q4_K_M.gguf -p "What is the capital of France?" -n 128
```
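For serving, `llama.cpp` also ships `llama-server`, which exposes an OpenAI-compatible HTTP API. A minimal sketch (the port and context size here are illustrative choices, not requirements):

```bash
# Start a local server on port 8080 with a 4096-token context window
./llama-server -m LocoTrainer-4B-Q4_K_M.gguf -c 4096 --port 8080

# In another terminal, query the OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 128
  }'
```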