Update README.md

This commit is contained in:
Rasmus Rasmussen
2026-01-14 19:24:38 +00:00
committed by system
parent 8df2439c20
commit c8c6482291


@@ -8,4 +8,16 @@ pipeline_tag: text-generation
tags:
- moe
- llama
---
# theprint-MoE-8x3-0126-GGUF
GGUF builds of an 18B-parameter Mixture-of-Experts (MoE) model that combines 8 specialized 3B experts, with 2 experts activated per token by default (configurable up to 4 at inference).
## Architecture
- Base model: theprint/GeneralChat-Llama3.2-3B (provides shared attention layers)
- Total parameters: ~18B
- Active parameters: ~5B (2 experts) or ~9B (4 experts; see the loading sketch below)
- Gate mode: Hidden (prompt-based router initialization)
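A minimal loading sketch using `llama-cpp-python`. The quantized filename, context size, and the `llama.expert_used_count` override are assumptions for illustration, not confirmed repo contents; use whichever GGUF file you actually download.

```python
from llama_cpp import Llama

llm = Llama(
    # Hypothetical quant filename; substitute the GGUF file you downloaded from this repo.
    model_path="theprint-MoE-8x3-0126.Q4_K_M.gguf",
    n_ctx=4096,        # context window; adjust to your memory budget
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
    # Assumption: on builds that support kv_overrides, this raises the number of
    # active experts per token from the default 2 up to 4.
    kv_overrides={"llama.expert_used_count": 4},
)

out = llm("Explain mixture-of-experts routing in two sentences.", max_tokens=128)
print(out["choices"][0]["text"])
```

Raising the expert count trades throughput for quality: each token routes through more FFN experts, so active parameters grow from roughly 5B to 9B as noted above.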
## Full Model
For more information about this model, including access to the safetensor files, please see [theprint/theprint-moe-8x3-0126](https://huggingface.co/theprint/theprint-moe-8x3-0126).