From c8c648229173770bd92e88f7db7d293f4a2192db Mon Sep 17 00:00:00 2001
From: Rasmus Rasmussen
Date: Wed, 14 Jan 2026 19:24:38 +0000
Subject: [PATCH] Update README.md

---
 README.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 740174c..2e7d41b 100644
--- a/README.md
+++ b/README.md
@@ -8,4 +8,16 @@ pipeline_tag: text-generation
 tags:
 - moe
 - llama
----
\ No newline at end of file
+---
+
+# theprint-MoE-8x3-0126-GGUF
+An 18B-parameter Mixture of Experts (MoE) model combining 8 specialized 3B experts, with 2 experts activated per token by default (configurable up to 4 at inference).
+
+## Architecture
+- Base model: theprint/GeneralChat-Llama3.2-3B (provides the shared attention layers)
+- Total parameters: ~18B
+- Active parameters: ~5B (2 experts) or ~9B (4 experts)
+- Gate mode: Hidden (prompt-based router initialization)
+
+## Full Model
+For more information about this model, including access to the safetensors files, please see [theprint/theprint-moe-8x3-0126](https://huggingface.co/theprint/theprint-moe-8x3-0126).
\ No newline at end of file
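
Since the card notes that the number of active experts is configurable at inference, here is a minimal sketch of overriding it with llama.cpp's metadata-override flag. It assumes the GGUF exposes the standard `llama.expert_used_count` key used by llama-architecture MoE models, and the quant filename is illustrative; neither is confirmed by the patch above.

```sh
# Confirm the expert-routing metadata keys before overriding anything
# (key names are an assumption; gguf-dump ships with the `gguf` Python package).
gguf-dump theprint-moe-8x3-0126.Q4_K_M.gguf | grep expert

# Run with 4 of the 8 experts active per token instead of the default 2.
# --override-kv rewrites the metadata value at load time; the file is untouched.
llama-cli -m theprint-moe-8x3-0126.Q4_K_M.gguf \
  --override-kv llama.expert_used_count=int:4 \
  -p "Briefly explain mixture-of-experts routing."
```

Raising the count from 2 to 4 trades speed for capacity: per the Architecture list above, active parameters grow from ~5B to ~9B per token.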