Initialize project; model provided by the ModelHub XC community
Model: theprint/theprint-moe-8x3-0126-GGUF Source: Original Platform
48
.gitattributes
vendored
Normal file
@@ -0,0 +1,48 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-f16.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q4_0.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q4_1.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-iq4_xs.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q3_k_l.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q3_k_s.gguf filter=lfs diff=lfs merge=lfs -text
theprint-moe-8x3-0126-q2_k.gguf filter=lfs diff=lfs merge=lfs -text
theprint_18b_moe.png filter=lfs diff=lfs merge=lfs -text
25
README.md
Normal file
@@ -0,0 +1,25 @@
---
license: apache-2.0
language:
- en
base_model:
- theprint/theprint-moe-8x3-0126
pipeline_tag: text-generation
tags:
- moe
- llama
---
<img src="theprint_18b_moe.png" width="420" />

# theprint-MoE-8x3-0126-GGUF

An 18B-parameter Mixture of Experts model combining 8 specialized 3B experts, with 2 experts activated per token by default (configurable up to 4 at inference).

## Architecture
- Base model: theprint/GeneralChat-Llama3.2-3B (provides shared attention layers)
- Total parameters: ~18B
- Active parameters: ~5B (2 experts) or ~9B (4 experts)
- Gate mode: Hidden (prompt-based router initialization)
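
The parameter figures above can be roughly reproduced with some back-of-the-envelope arithmetic. The sketch below is an estimate, not the author's calculation: it assumes the layer dimensions of the published Llama 3.2 3B configuration (hidden size 3072, feed-forward size 8192, 28 layers, 24 attention heads, 8 key/value heads, vocabulary 128256, tied embeddings), none of which are stated in this README. It also assumes each expert contributes only its feed-forward weights while attention, embeddings, and a small linear router are shared, and it ignores normalization weights. The names `DenseConfig`, `param_counts`, `total_params`, and `active_params` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenseConfig:
    """Assumed Llama 3.2 3B dimensions (not stated in this README)."""
    hidden: int = 3072
    intermediate: int = 8192
    layers: int = 28
    heads: int = 24
    kv_heads: int = 8
    vocab: int = 128256


def param_counts(cfg: DenseConfig, num_experts: int = 8) -> dict:
    """Split parameters into a shared part and a per-expert part."""
    head_dim = cfg.hidden // cfg.heads
    kv_dim = head_dim * cfg.kv_heads
    # Attention: Q and O are hidden x hidden, K and V are hidden x kv_dim.
    attention = cfg.layers * (2 * cfg.hidden * cfg.hidden + 2 * cfg.hidden * kv_dim)
    # Tied input/output embeddings are counted once (assumption).
    embeddings = cfg.vocab * cfg.hidden
    # One linear router per layer, mapping hidden states to expert scores.
    router = cfg.layers * cfg.hidden * num_experts
    # Each expert is a gated feed-forward block: gate, up, and down projections.
    expert = cfg.layers * 3 * cfg.hidden * cfg.intermediate
    return {"shared": attention + embeddings + router, "expert": expert}


def total_params(counts: dict, num_experts: int = 8) -> int:
    """All weights stored in the model file."""
    return counts["shared"] + num_experts * counts["expert"]


def active_params(counts: dict, experts_per_token: int) -> int:
    """Weights actually used for one token."""
    return counts["shared"] + experts_per_token * counts["expert"]


if __name__ == "__main__":
    counts = param_counts(DenseConfig())
    print(f"total:             {total_params(counts) / 1e9:.2f}B")
    print(f"active, 2 experts: {active_params(counts, 2) / 1e9:.2f}B")
    print(f"active, 4 experts: {active_params(counts, 4) / 1e9:.2f}B")
```

Under these assumptions the estimate lands in the same ballpark as the rounded figures in the list above. It shows why the active count grows by roughly 2B per additional expert, while the shared part is paid only once.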

## Full Model
For more information about this model, including access to the safetensors files, please see [theprint/theprint-moe-8x3-0126](https://huggingface.co/theprint/theprint-moe-8x3-0126).
3
theprint-moe-8x3-0126-f16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8c40c59eca47c7b1269a549f35a2c17fe0a983977df22ef90dc6492533bd3ab9
size 36031453024
3
theprint-moe-8x3-0126-iq4_xs.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6c21c1b7f1f38b4b956d0036f493acf4eba2d2eb5ebef2149a74690bffe845a5
size 9922285408
3
theprint-moe-8x3-0126-q2_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7a409fa818e52738719c4bb3a3cbdd0b75fdad86e98aac1136c941ceee640688
size 6911627104
3
theprint-moe-8x3-0126-q3_k_l.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:731f9edb9c8de668d76022a317f6732ba264b34f8615643c7500546e0d458dce
size 9468710752
3
theprint-moe-8x3-0126-q3_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e9ac9cb5a99b127052e640091fcaff6d72ef46d26b549875abc99b4d5729a35d
size 8857358176
3
theprint-moe-8x3-0126-q3_k_s.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d5f6e31e2d77829a5cdcdb95acfe19558f5bc11181715baa6476ea4711b990ae
size 8083509088
3
theprint-moe-8x3-0126-q4_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7bb9044f0fef43765eb3b29eaa6d7ff7de7557ae817366488bdc0b6178ba1aae
size 10331623264
3
theprint-moe-8x3-0126-q4_1.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:18877865d1fa626ba218f8bfc953957e9cf97b31050229753fd602c4768ceaf6
size 11421618016
3
theprint-moe-8x3-0126-q4_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5d946caab7ada6589c6c88b060ff3dc8510a8e40418022f6ca790a2978b28bac
size 11091316576
3
theprint-moe-8x3-0126-q5_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:190e757202d043007908c598ae37409809f8bb0d5e9239f23b4674234e2c6cf4
size 12885954400
3
theprint-moe-8x3-0126-q6_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2c288b3e2a15927646c0b0bac36adadb55d4b4bc3ce87296380fec7f8b1ce07e
size 14827851616
3
theprint-moe-8x3-0126-q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4254e74cf139b6c2c5f3b06c2f425c2ba428888db868505a5119c97c3e3c5667
size 19147003744
3
theprint_18b_moe.png
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c03098ecf11ac7d9156a0e27f6dfcea6f230e42a38f6071c43040d7d368e382
size 1849516
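
Each GGUF and image file in this commit is stored as a Git LFS pointer with three fields: `version`, `oid`, and `size`. As a minimal sketch, assuming the real file has already been downloaded locally, a pointer can be parsed and used to check the download with the Python standard library. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(
        line.split(" ", 1) for line in text.strip().splitlines() if line.strip()
    )
    # The oid has the form '<algorithm>:<hex digest>', e.g. 'sha256:9c03...'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch means hashing is unnecessary.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte GGUF files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:9c03098ecf11ac7d9156a0e27f6dfcea6f230e42a38f6071c43040d7d368e382\n"
        "size 1849516\n"
    )
    print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("theprint_18b_moe.png", pointer)` on a local copy would then confirm that the download is complete and unmodified. The local path here is an assumption.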