Initialize project; model provided by the ModelHub XC community
Model: noctrex/LFM2.5-8B-A1B-MXFP4_MOE-GGUF
Source: Original Platform
39
.gitattributes
vendored
Normal file
@@ -0,0 +1,39 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-MXFP4_MOE.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-Q8_XL_MOE.gguf filter=lfs diff=lfs merge=lfs -text
3
LFM2.5-8B-A1B-MXFP4_MOE.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a492e8b1d8c201b90de421f0ea1b6ed478aa7f4948604ea6659cd9afca715f3a
size 5138197824
3
LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fead589a51b29ea70859116853848ea196386cec0eaecb9f9763be9c377c377a
size 5562871104
3
LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:29fdedb35cbb0c8e64132f3497ccc86ad760aea69243c4e920729b5dee160d3c
size 5562871104
3
LFM2.5-8B-A1B-Q8_XL_MOE.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:165ee800c4488d7499ad4d3425214c9ba19f2572ce7990fae356b10808a45dfa
size 9680629056
33
README.md
Normal file
@@ -0,0 +1,33 @@
---
pipeline_tag: text-generation
base_model:
- LiquidAI/LFM2.5-8B-A1B
---
These are **MXFP4** quantizations of the model [LiquidAI / LFM2.5-8B-A1B](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B)

## Quick Start
1. Download the latest release of [**llama.cpp**](https://github.com/ggml-org/llama.cpp/releases).
2. Download your preferred model variant from below.
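
After downloading, it can be worth checking the file against the Git LFS pointer recorded for it in this commit. The following is a minimal sketch, not part of the original instructions: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path is assumed to be the file name in the current directory.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and digest recorded in the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Read in chunks so multi-gigabyte files are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents recorded in this commit for LFM2.5-8B-A1B-MXFP4_MOE.gguf.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a492e8b1d8c201b90de421f0ea1b6ed478aa7f4948604ea6659cd9afca715f3a
size 5138197824
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    model = Path("LFM2.5-8B-A1B-MXFP4_MOE.gguf")  # assumed local download path
    if model.exists():
        print("OK" if matches_pointer(model, pointer) else "MISMATCH")
    else:
        print(f"{model} not found")
```

The recorded size of 5138197824 bytes is about 4.79 GiB (bytes divided by 2^30), which agrees with the size listed for the MXFP4_MOE variant.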

## Which version should I choose?
All FP4 variants use **MXFP4** for the MoE (Mixture of Experts) weights to keep the model efficient.
I've also included a new type, Q8_XL_MOE, which uses Q8_0 for the MoE tensors and BF16 for everything else.
The MXFP4 variants differ in how the remaining tensors are handled:

| Variant            | Quality       | Performance | MoE Tensors | Other Tensors |    Size | Recommendation                                                   |
| :----------------- | :------------ | :---------- | :---------: | :-----------: | ------: | :--------------------------------------------------------------- |
| **Q8_XL_MOE**      | ⭐⭐⭐⭐⭐ | Variable\*  |    Q8_0     |     BF16      | 9.02GiB | Maximum quality; uses Q8_0 instead of MXFP4 for the MoE weights. |
| **MXFP4_MOE_BF16** | ⭐⭐⭐       | Variable\*  |    MXFP4    |     BF16      | 5.18GiB | Best accuracy among the MXFP4 variants; non-MoE tensors keep their original unquantized weights. |
| **MXFP4_MOE_F16**  | ⭐⭐         | Fast        |    MXFP4    |      F16      | 5.18GiB | Great alternative if BF16 is slow on your hardware.              |
| **MXFP4_MOE**      | ⭐           | Fastest     |    MXFP4    |     Q8_0      | 4.79GiB | Balanced performance and memory usage.                           |

**Note:** On some older architectures, BF16 may be slower than F16.
Check that your GPU supports native BF16 acceleration; if it does not, the F16 version is the better choice.
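
The BF16/F16 choice above can be sketched as a small helper. This is an illustration only: the function name `pick_mxfp4_variant` is hypothetical, and it assumes an NVIDIA GPU, where native BF16 support is taken to start at compute capability 8.0 (Ampere). Other vendors need their own check.

```python
def pick_mxfp4_variant(compute_capability: tuple[int, int]) -> str:
    """Choose the BF16 or F16 file from an NVIDIA (major, minor) compute capability."""
    # Assumption: native BF16 arithmetic is available from compute capability 8.0 onward.
    has_native_bf16 = compute_capability >= (8, 0)
    if has_native_bf16:
        return "LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf"
    return "LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf"


print(pick_mxfp4_variant((7, 5)))  # a Turing-class GPU falls back to F16
print(pick_mxfp4_variant((8, 6)))  # an Ampere-class GPU can use BF16
```

If PyTorch happens to be installed, `torch.cuda.get_device_capability()` is one way to obtain the `(major, minor)` tuple.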

Recommended parameters from LiquidAI:
- temperature 0.2
- top_p 80
- repetition_penalty 1.05
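
As a rough sketch of how these settings might be passed to llama.cpp, the snippet below builds a `llama-cli` command line. The helper name `llama_cli_command` is hypothetical, and the model path is assumed to be the file name in the current directory. Only the temperature and repetition penalty are mapped (to `--temp` and `--repeat-penalty`); the `top_p` entry is left out because `--top-p` takes a probability between 0 and 1 and the intended value is not clear from the list.

```python
import shlex


def llama_cli_command(model_path: str,
                      temperature: float = 0.2,
                      repeat_penalty: float = 1.05) -> list[str]:
    """Build an argument list for llama.cpp's llama-cli with the listed sampling settings."""
    return [
        "llama-cli",
        "-m", model_path,                        # path to the downloaded GGUF file
        "--temp", str(temperature),              # temperature 0.2
        "--repeat-penalty", str(repeat_penalty), # repetition_penalty 1.05
    ]


cmd = llama_cli_command("LFM2.5-8B-A1B-MXFP4_MOE.gguf")  # assumed local path
print(shlex.join(cmd))
```

Keeping the command as a list means it can be handed directly to `subprocess.run(cmd)` without going through a shell.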

The chat template has been updated to fix the tool-calling issues.
If you don't want to download the model again, you can use the template from the parent model.