---
license: apache-2.0
base_model: LocoreMind/LocoTrainer-4B
library_name: llama.cpp
tags:
- llama-cpp
- gguf-quantized
- text-generation
- reasoning
---

# LocoTrainer-4B GGUF

## Model Description

This repository contains GGUF-format quantizations of LocoTrainer-4B, a distilled reasoning model based on the Qwen architecture.

## Available Quants

| File | Size | Description |
|------|------|-------------|
| LocoTrainer-4B-Q4_K_M.gguf | ~2.5 GB | 4-bit medium. Best balance of speed and quality for general use. |
| LocoTrainer-4B-Q5_K_M.gguf | ~2.8 GB | 5-bit medium. Higher quality with a slightly larger footprint. |
| LocoTrainer-4B-Q6_K.gguf | ~3.2 GB | 6-bit. Near-lossless quantization. |
| LocoTrainer-4B-Q8_0.gguf | ~4.2 GB | 8-bit. Maximum fidelity; output is virtually indistinguishable from the original full-precision weights. |
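The listed file sizes follow roughly from the bits-per-weight of each quantization scheme. A back-of-the-envelope sketch, assuming ~4 billion parameters and approximate average bits-per-weight figures for each scheme (illustrative values, not exact spec numbers):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate quantized file size in decimal GB (ignores metadata overhead)."""
    return n_params * bits_per_weight / 8 / 1e9

# ~4B parameters; bits-per-weight values are rough community averages, not exact.
for name, bpw in [("Q4_K_M", 4.85), ("Q5_K_M", 5.7), ("Q6_K", 6.6), ("Q8_0", 8.5)]:
    print(f"{name}: ~{gguf_size_gb(4e9, bpw):.1f} GB")
```

The estimates land close to the table above; the small gaps come from per-block scale factors and file metadata, which the simple formula ignores.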

## Usage

### llama.cpp

You can run these models with the `llama-cli` binary from llama.cpp:

```bash
./llama-cli -m LocoTrainer-4B-Q4_K_M.gguf -p "What is the capital of France?" -n 128
```
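For interactive chat rather than a one-shot completion, llama.cpp's conversation mode can be used. A sketch, assuming a recent llama.cpp build where the `-cnv`, `-c`, `-ngl`, and `--temp` flags are available:

```shell
# Interactive chat with the Q4_K_M quant.
# -cnv   : conversation mode (applies the model's chat template)
# -c     : context window size in tokens
# -ngl   : number of layers to offload to the GPU (0 = CPU only)
# --temp : sampling temperature
./llama-cli -m LocoTrainer-4B-Q4_K_M.gguf -cnv -c 4096 -ngl 0 --temp 0.7
```

Raise `-ngl` if you have a supported GPU; offloading more layers typically speeds up generation at the cost of VRAM.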
Model synced from source: Abhiray/LocoTrainer-4B-GGUF