Initialize project; model provided by the ModelHub XC community

Model: noctrex/LFM2.5-8B-A1B-MXFP4_MOE-GGUF
Source: Original Platform
ModelHub XC
2026-06-06 08:54:15 +08:00
commit 9cd025f83b
6 changed files with 84 additions and 0 deletions

39
.gitattributes vendored Normal file

@@ -0,0 +1,39 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-MXFP4_MOE.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-Q8_XL_MOE.gguf filter=lfs diff=lfs merge=lfs -text


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a492e8b1d8c201b90de421f0ea1b6ed478aa7f4948604ea6659cd9afca715f3a
size 5138197824


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fead589a51b29ea70859116853848ea196386cec0eaecb9f9763be9c377c377a
size 5562871104


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:29fdedb35cbb0c8e64132f3497ccc86ad760aea69243c4e920729b5dee160d3c
size 5562871104


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:165ee800c4488d7499ad4d3425214c9ba19f2572ce7990fae356b10808a45dfa
size 9680629056
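
Each pointer above records a SHA-256 digest and a byte size, so a downloaded file can be checked against it. The following is a minimal sketch, not part of the commit: the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and it assumes the three-line pointer layout shown in these hunks.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and digest both match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, before hashing gigabytes
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Stream the file in chunks so large GGUF files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# First pointer from this commit, copied verbatim.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a492e8b1d8c201b90de421f0ea1b6ed478aa7f4948604ea6659cd9afca715f3a
size 5138197824
"""

pointer = parse_lfs_pointer(POINTER)
print(f"{pointer['size'] / 2**30:.2f} GiB")
```

Calling `matches_pointer(Path("model.gguf"), pointer)` on a local download (the path is a placeholder) would then report whether the file is intact. Dividing the byte count by 2**30 gives the size in GiB.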

33
README.md Normal file

@@ -0,0 +1,33 @@
---
pipeline_tag: text-generation
base_model:
- LiquidAI/LFM2.5-8B-A1B
---
These are **MXFP4** quantizations of the model [LiquidAI / LFM2.5-8B-A1B](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B).
## Quick Start
1. Download the latest release of [**llama.cpp**](https://github.com/ggml-org/llama.cpp/releases).
2. Download your preferred model variant from below.
## Which version should I choose?
All FP4 variants use **MXFP4** for the MoE (Mixture of Experts) weights to keep the model efficient.
They differ in how the remaining tensors are handled.
I've also included a new type, Q8_XL_MOE, which uses Q8_0 for the MoE tensors and BF16 for everything else.
| Variant            | Quality    | Performance | MoE Tensors | Other Tensors |    Size | Recommendation                                                                               |
| :----------------- | :--------- | :---------- | :---------: | :-----------: | ------: | :------------------------------------------------------------------------------------------- |
| **Q8_XL_MOE**      | ⭐⭐⭐⭐⭐ | Variable\*  |    Q8_0     |     BF16      | 9.02GiB | Maximum quality; uses Q8_0 instead of MXFP4 for the MoE weights.                              |
| **MXFP4_MOE_BF16** | ⭐⭐⭐     | Variable\*  |    MXFP4    |     BF16      | 5.18GiB | Best accuracy among the MXFP4 variants; non-MoE tensors keep their original unquantized weights. |
| **MXFP4_MOE_F16**  | ⭐⭐       | Fast        |    MXFP4    |      F16      | 5.18GiB | Great alternative if BF16 is slow on your hardware.                                          |
| **MXFP4_MOE**      | ⭐         | Fastest     |    MXFP4    |     Q8_0      | 4.79GiB | Balanced performance and memory usage.                                                       |
**Note:** \* On some older architectures, BF16 may be slower than F16.
Check whether your GPU supports native BF16 acceleration; if it does not, use the F16 version instead.
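
The guidance above can be turned into a small selection helper. This is only a sketch under stated assumptions: `choose_variant` is a hypothetical function, not part of llama.cpp or this repository. It treats both BF16-based variants as unsuitable when the GPU lacks native BF16, and it compares the budget against file size only, ignoring context and KV-cache memory.

```python
# (variant, file name, size in GiB, relies on BF16), ordered from highest to lowest quality.
# File names and sizes are the ones published in this repository.
VARIANTS = [
    ("Q8_XL_MOE", "LFM2.5-8B-A1B-Q8_XL_MOE.gguf", 9.02, True),
    ("MXFP4_MOE_BF16", "LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf", 5.18, True),
    ("MXFP4_MOE_F16", "LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf", 5.18, False),
    ("MXFP4_MOE", "LFM2.5-8B-A1B-MXFP4_MOE.gguf", 4.79, False),
]


def choose_variant(budget_gib: float, native_bf16: bool) -> str:
    """Return the file name of the highest-quality variant that fits the budget.

    Assumption: variants marked as relying on BF16 are skipped when the
    hardware has no native BF16 support, since they may run slowly there.
    """
    for _name, filename, size_gib, needs_bf16 in VARIANTS:
        if needs_bf16 and not native_bf16:
            continue
        if size_gib <= budget_gib:
            return filename
    raise ValueError(f"no variant fits within {budget_gib} GiB")


print(choose_variant(budget_gib=6.0, native_bf16=False))
```

Because the list is ordered by quality, the first match is the best fit. Reordering it would let the same loop favor speed or size instead.
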
Recommended parameters from LiquidAI:
- temperature 0.2
- top_p 80
- repetition_penalty 1.05
The chat template has been updated to fix the tool-calling issues.
If you don't want to download the model again, you can use the template from the parent model.