diff --git a/README.md b/README.md
index 740174c..2e7d41b 100644
--- a/README.md
+++ b/README.md
@@ -8,4 +8,16 @@ pipeline_tag: text-generation
 tags:
 - moe
 - llama
----
\ No newline at end of file
+---
+
+# theprint-MoE-8x3-0126-GGUF
+An 18B-parameter Mixture-of-Experts model combining 8 specialized 3B experts, with 2 experts activated per token by default (configurable up to 4 at inference time).
+
+## Architecture
+- Base model: theprint/GeneralChat-Llama3.2-3B (provides the shared attention layers)
+- Total parameters: ~18B
+- Active parameters: ~5B (2 experts) or ~9B (4 experts)
+- Gate mode: Hidden (prompt-based router initialization)
+
+## Full Model
+For more information about this model, including access to the safetensors files, see [theprint/theprint-moe-8x3-0126](https://huggingface.co/theprint/theprint-moe-8x3-0126).
\ No newline at end of file