commit 9cd025f83bf30043470b12b14be99866d7178770
Author: ModelHub XC
Date:   Sat Jun 6 08:54:15 2026 +0800

    Initialize project; model provided by the ModelHub XC community

    Model: noctrex/LFM2.5-8B-A1B-MXFP4_MOE-GGUF
    Source: Original Platform

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..d679941
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,39 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-MXFP4_MOE.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-Q8_XL_MOE.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/LFM2.5-8B-A1B-MXFP4_MOE.gguf b/LFM2.5-8B-A1B-MXFP4_MOE.gguf
new file mode 100644
index 0000000..3997a2a
--- /dev/null
+++ b/LFM2.5-8B-A1B-MXFP4_MOE.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a492e8b1d8c201b90de421f0ea1b6ed478aa7f4948604ea6659cd9afca715f3a
+size 5138197824
diff --git a/LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf b/LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf
new file mode 100644
index 0000000..dd2afaf
--- /dev/null
+++ b/LFM2.5-8B-A1B-MXFP4_MOE_BF16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fead589a51b29ea70859116853848ea196386cec0eaecb9f9763be9c377c377a
+size 5562871104
diff --git a/LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf b/LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf
new file mode 100644
index 0000000..7ea48e3
--- /dev/null
+++ b/LFM2.5-8B-A1B-MXFP4_MOE_F16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29fdedb35cbb0c8e64132f3497ccc86ad760aea69243c4e920729b5dee160d3c
+size 5562871104
diff --git a/LFM2.5-8B-A1B-Q8_XL_MOE.gguf b/LFM2.5-8B-A1B-Q8_XL_MOE.gguf
new file mode 100644
index 0000000..9d5df2d
--- /dev/null
+++ b/LFM2.5-8B-A1B-Q8_XL_MOE.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:165ee800c4488d7499ad4d3425214c9ba19f2572ce7990fae356b10808a45dfa
+size 9680629056
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..6700800
--- /dev/null
+++ b/README.md
@@ -0,0 +1,33 @@
+---
+pipeline_tag: text-generation
+base_model:
+- LiquidAI/LFM2.5-8B-A1B
+---
+These are **MXFP4** quantizations of the model [LiquidAI / LFM2.5-8B-A1B](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B).
+
+## Quick Start
+1. Download the latest release of [**llama.cpp**](https://github.com/ggml-org/llama.cpp/releases).
+2. Download your preferred model variant from the table below.
+
+## Which version should I choose?
+All FP4 variants use **MXFP4** for the MoE (Mixture of Experts) weights to keep the model efficient.
+I've also included a new type, Q8_XL_MOE, which uses Q8_0 for the MoE tensors and BF16 for everything else.
+The variants differ in how the remaining tensors are handled:
+
+| Variant            | Quality    | Performance | MoE Tensors | Other Tensors |     Size | Recommendation                                                                   |
+| :----------------- | :--------- | :---------- | :---------: | :-----------: | -------: | :------------------------------------------------------------------------------- |
+| **Q8_XL_MOE**      | ⭐⭐⭐⭐⭐ | Variable\*  |    Q8_0     |     BF16      | 9.02 GiB | Maximum quality; uses Q8_0 instead of MXFP4 for the MoE weights.                 |
+| **MXFP4_MOE_BF16** | ⭐⭐⭐     | Variable\*  |    MXFP4    |     BF16      | 5.18 GiB | Best accuracy among the MXFP4 variants; non-MoE tensors stay in unquantized BF16. |
+| **MXFP4_MOE_F16**  | ⭐⭐       | Fast        |    MXFP4    |      F16      | 5.18 GiB | Great alternative if BF16 is slow on your hardware.                              |
+| **MXFP4_MOE**      | ⭐         | Fastest     |    MXFP4    |     Q8_0      | 4.79 GiB | Balanced performance and memory usage.                                           |
+
+**\*Note:** On some older architectures, BF16 may be slower than F16.
+Check that your GPU supports native BF16 acceleration; if it does not, use the F16 version instead.
+
+Recommended parameters from LiquidAI:
+- temperature 0.2
+- top_p 80
+- repetition_penalty 1.05
+
+The chat template has been updated to fix the tool-calling issues.
+If you don't want to download the model again, you can use the template from the parent model.
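
The Git LFS pointer entries in this commit record each GGUF file's SHA-256 digest and byte size, so a downloaded variant can be checked against them before it is loaded. The following is a minimal sketch of such a check, not part of the repository: the helper names `parse_lfs_pointer` and `verify_download` are hypothetical, and it assumes the downloaded file sits at a local path you supply.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def verify_download(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's byte size and hash both match the pointer."""
    # Cheap check first: a size mismatch means the download is incomplete.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the LFM2.5-8B-A1B-MXFP4_MOE.gguf entry in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a492e8b1d8c201b90de421f0ea1b6ed478aa7f4948604ea6659cd9afca715f3a
size 5138197824
"""

pointer = parse_lfs_pointer(POINTER)
# The README sizes are in GiB (2**30 bytes); this prints 4.79 GiB.
print(f"{pointer['size'] / 2**30:.2f} GiB")
```

With the file downloaded, `verify_download(Path("LFM2.5-8B-A1B-MXFP4_MOE.gguf"), pointer)` should return `True`; a `False` result would suggest a truncated or corrupted download.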