Initialize project; model provided by the ModelHub XC community

Model: Mungert/LFM2.5-8B-A1B-GGUF Source: Original Platform
2026-06-17 15:32:17 +08:00
commit 7509cbbc1b
29 changed files with 488 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,62 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-f16.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-f16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-bf16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q3_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q4_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q5_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q6_k_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q6_k_m.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q4_1.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q4_0_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q4_1_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q5_1.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q5_0_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-q5_1_l.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-iq3_xs.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-iq3_xxs.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-iq3_s.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-iq3_m.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-iq4_xs.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-mxfp4_moe.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-imatrix.gguf filter=lfs diff=lfs merge=lfs -text
+LFM2.5-8B-A1B-bf16.gguf filter=lfs diff=lfs merge=lfs -text
--- a/LFM2.5-8B-A1B-bf16.gguf
+++ b/LFM2.5-8B-A1B-bf16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b471037695ef5908cb88065c7ebb23fe285a07950d7433fabf0c865273728ab
+size 16947260640
--- a/LFM2.5-8B-A1B-bf16_q8_0.gguf
+++ b/LFM2.5-8B-A1B-bf16_q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:674967ea5846e244aaf9c8d720834b9b53b6c126d01d701297c8a9cead386c40
+size 15278058720
--- a/LFM2.5-8B-A1B-f16.gguf
+++ b/LFM2.5-8B-A1B-f16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:054a4233007c76e31ee75a99bdb9a7748d29099c98c5b43c17964b62e940fc1f
+size 16947260640
--- a/LFM2.5-8B-A1B-f16_q8_0.gguf
+++ b/LFM2.5-8B-A1B-f16_q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34f25fc3cfb92d31d371826772b229c861f6986c770401e192b24e12180b9ee4
+size 15278058720
--- a/LFM2.5-8B-A1B-imatrix.gguf
+++ b/LFM2.5-8B-A1B-imatrix.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ae78c675c5a05745b141c4115a7d30c928a03b79fd11c32df3ca1096a5bc49c
+size 17375264
--- a/LFM2.5-8B-A1B-iq3_m.gguf
+++ b/LFM2.5-8B-A1B-iq3_m.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a5ae4abaaf366468ccd26e7b3d781052fc423938a5a3d26f31880caf9ecce9e8
+size 4291800608
--- a/LFM2.5-8B-A1B-iq3_s.gguf
+++ b/LFM2.5-8B-A1B-iq3_s.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1db5b78344de6706118b84280ef3993c15dbd9410e6d75f9ce60bee69a8b530f
+size 4291800608
--- a/LFM2.5-8B-A1B-iq3_xs.gguf
+++ b/LFM2.5-8B-A1B-iq3_xs.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:27a82461476865f19db1c17527f1d7d07a722ade6c76e0142434ed05bb4e6dd7
+size 3987533344
--- a/LFM2.5-8B-A1B-iq3_xxs.gguf
+++ b/LFM2.5-8B-A1B-iq3_xxs.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5d3672095e1c66ac355a308920c07223dc30025c525f60b6b49062c57866396
+size 3970625056
--- a/LFM2.5-8B-A1B-iq4_xs.gguf
+++ b/LFM2.5-8B-A1B-iq4_xs.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f0138fafdad549b69ea6c064a7e96304c4419776f12e013a8be07bd94bcca602
+size 4588301856
--- a/LFM2.5-8B-A1B-mxfp4_moe.gguf
+++ b/LFM2.5-8B-A1B-mxfp4_moe.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5374ff486246d4d52c86e3e94957d19d72cc04015cfe8fcae1026adfbfeb2c78
+size 4892437728
--- a/LFM2.5-8B-A1B-q3_k_l.gguf
+++ b/LFM2.5-8B-A1B-q3_k_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:94b5104aace0b8cc11215ecc129dfa0d2bd118403de2fb89c763bae910758754
+size 4306300448
--- a/LFM2.5-8B-A1B-q3_k_m.gguf
+++ b/LFM2.5-8B-A1B-q3_k_m.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e79a2b3cd1c75f1d0a152870189cb563ff6c5c326d42678b934eddc7c9057f65
+size 4242812448
--- a/LFM2.5-8B-A1B-q4_0.gguf
+++ b/LFM2.5-8B-A1B-q4_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c61d5e4e0ae39e26a942adc1170f3ed779b8899bf2a9fe88b68f7775626c2664
+size 4777094688
--- a/LFM2.5-8B-A1B-q4_0_l.gguf
+++ b/LFM2.5-8B-A1B-q4_0_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d46cd56641f02fcb6ba9fa5ed72035195412734d1c5be732a4466bfa3b737105
+size 4908166688
--- a/LFM2.5-8B-A1B-q4_1.gguf
+++ b/LFM2.5-8B-A1B-q4_1.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4cd27656180bf37138ce4d52f10c5a8a17893d6fd6923ff6e975be244b23edf4
+size 5306232352
--- a/LFM2.5-8B-A1B-q4_1_l.gguf
+++ b/LFM2.5-8B-A1B-q4_1_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:db52de524a2ef4434b54f20b73be60650419e4e7e1b8c8ad64e604dc19cfa8f5
+size 5420920352
--- a/LFM2.5-8B-A1B-q4_k_l.gguf
+++ b/LFM2.5-8B-A1B-q4_k_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad568ac02b76c6e7a718a3bddc08e48d7d1e86262b1014756f7474916566750a
+size 5278584352
--- a/LFM2.5-8B-A1B-q5_0.gguf
+++ b/LFM2.5-8B-A1B-q5_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4243df7ff259505b0ea2d05a939370cad606ed3bc6af6937ed2ec36153ae6d8e
+size 5835370016
--- a/LFM2.5-8B-A1B-q5_0_l.gguf
+++ b/LFM2.5-8B-A1B-q5_0_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:325fc9c5f294d416b4d90376c021b18271cdc7b09d56d479298b2a5c64a8b568
+size 5933674016
--- a/LFM2.5-8B-A1B-q5_1.gguf
+++ b/LFM2.5-8B-A1B-q5_1.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1b4b68d51b1427f36431770f7ce1663a3403f537a25b8d52ff164fe82a2f2b9d
+size 6364507680
--- a/LFM2.5-8B-A1B-q5_1_l.gguf
+++ b/LFM2.5-8B-A1B-q5_1_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c67c57be9648b5cf79f2c78851fa3887c616d5a70142e389378e38d3c4d18fb6
+size 6446427680
--- a/LFM2.5-8B-A1B-q5_k_l.gguf
+++ b/LFM2.5-8B-A1B-q5_k_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c1789a2757c9d0b60ca8729b603503d74d52dd45a81959ae90f7cf0961ccba45
+size 6227660320
--- a/LFM2.5-8B-A1B-q5_k_m.gguf
+++ b/LFM2.5-8B-A1B-q5_k_m.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e626cdae134c0f41319d107e5a02274e19fafea75508359456dd202d7bb5dac
+size 6164172320
--- a/LFM2.5-8B-A1B-q6_k_l.gguf
+++ b/LFM2.5-8B-A1B-q6_k_l.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:355c94bf256d89c001dadffdc459eaf3b65fd53803f82b1c950ec7b458159d84
+size 7023275552
--- a/LFM2.5-8B-A1B-q6_k_m.gguf
+++ b/LFM2.5-8B-A1B-q6_k_m.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:09e8acad5e7c8e4fa40d38b32fd35387e15707f8e12287244a4153557aa3a367
+size 6959787552
--- a/LFM2.5-8B-A1B-q8_0.gguf
+++ b/LFM2.5-8B-A1B-q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:33ab3b8ce6a964fb8ebac89360c9b3cf72c4fa418d5e4c0a94d46883124d5c02
+size 9010195680
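
The three-line files above are Git LFS pointers rather than the weights themselves: each records the spec version, a SHA-256 object ID, and the size in bytes. A minimal Python sketch for checking a downloaded file against its pointer could look like the following. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative assumptions, not part of Git LFS or any library.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for raw_line in text.splitlines():
        # Tolerate lines pasted straight from a diff (leading '+').
        line = raw_line.strip().lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and SHA-256 recorded in the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing multi-GB files needlessly
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for LFM2.5-8B-A1B-imatrix.gguf, copied from the diff above.
IMATRIX_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8ae78c675c5a05745b141c4115a7d30c928a03b79fd11c32df3ca1096a5bc49c
size 17375264
"""
```

Comparing the size before hashing keeps the check cheap when a download was truncated, which matters for the multi-gigabyte quantizations listed here.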
--- a/README.md
+++ b/README.md
@@ -0,0 +1,345 @@
+---
+library_name: transformers
+license: other
+license_name: lfm1.0
+license_link: LICENSE
+language:
+- en
+- ar
+- zh
+- fr
+- de
+- ja
+- ko
+- es
+- pt
+pipeline_tag: text-generation
+tags:
+- liquid
+- lfm2.5
+- edge
+base_model: LiquidAI/LFM2.5-8B-A1B-Base
+---
+
+# <span style="color: #7FFF7F;">LFM2.5-8B-A1B GGUF Models</span>
+
+
+## <span style="color: #7F7FFF;">Model Generation Details</span>
+
+This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`94a220cd6`](https://github.com/ggerganov/llama.cpp/commit/94a220cd6745e6e3f8de62870b66fd5b9bc92700).
+
+
+
+
+
+
+---
+
+<a href="https://readyforquantum.com/huggingface_gguf_selection_guide.html" style="color: #7FFF7F;">
+  Click here to get info on choosing the right GGUF model format
+</a>
+
+---
+
+
+
+<!--Begin Original Model Card-->
+
+
+<div align="center">
+  <img 
+    src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png" 
+    alt="Liquid AI" 
+    style="width: 100%; max-width: 100%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
+  />
+  <div style="display: flex; justify-content: center; gap: 0.5em; margin-bottom: 1em;">
+    <a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a> • 
+    <a href="https://docs.liquid.ai/lfm/getting-started/welcome"><strong>Docs</strong></a> • 
+    <a href="https://leap.liquid.ai/"><strong>LEAP</strong></a> • 
+    <a href="https://discord.com/invite/liquid-ai"><strong>Discord</strong></a>
+  </div>
+</div>
+
+
+# LFM2.5-8B-A1B
+
+LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.
+
+- **On-device personal assistant**: Designed to power real-life applications, chaining tool calls, and following complex instructions on all devices.
+- **Compressed performance**: Competitive with much larger dense and MoE models on instruction following and agentic tasks.
+- **Unmatched throughput**: Fastest in its size class on both CPU and GPU inference, with day-one support for llama.cpp, MLX, vLLM, and SGLang.
+
+Find more information about LFM2.5-8B-A1B in our [blog post](https://www.liquid.ai/blog/lfm2-5-8b-a1b).
+
+![image](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/qUZVGkns1bg3sZUShBbhv.png)
+
+*AA-Omniscience Index (higher is better) rewards correct answers and penalizes hallucinations. Scores range from -100 to 100. See more results on [Artificial Analysis](https://artificialanalysis.ai/evaluations/omniscience).*
+
+## 🗒️ Model Details
+
+| Model | Parameters | Description |
+| --- | --- | --- |
+| [LFM2.5-8B-A1B-Base](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-Base) | 8.3B total / 1.5B active | Pre-trained base model for fine-tuning |
+| [**LFM2.5-8B-A1B**](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B) | 8.3B total / 1.5B active | Reasoning-tuned general-purpose model |
+
+LFM2.5-8B-A1B is a general-purpose text-only model with the following features:
+
+- **Total parameters**: 8.3B
+- **Active parameters**: 1.5B
+- **Number of layers**: 24 (18 double-gated LIV conv + 6 GQA)
+- **Training budget**: 38 trillion tokens
+- **Context length**: 128,000
+- **Vocabulary size**: 128,000
+- **Languages**: English, Arabic, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Spanish
+- **Generation parameters**: We recommend the following parameters:
+  - `temperature: 0.2`
+  - `top_k: 80`
+  - `repetition_penalty: 1.05`
+
+| Model | Description |
+| --- | --- |
+| [**LFM2.5-8B-A1B**](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers, vLLM, and SGLang. |
+| [LFM2.5-8B-A1B-GGUF](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for edge inference and local deployment. |
+| [LFM2.5-8B-A1B-ONNX](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-ONNX) | ONNX Runtime format for cross-platform deployment. |
+| [LFM2.5-8B-A1B-MLX](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices. |
+
+We recommend using LFM2.5-8B-A1B for agentic workflows, tool use, structured outputs, multilingual assistants, and on-device personal-assistant applications. It is not the best fit for heavy programming or knowledge-intensive question answering without retrieval.
+
+### Chat Template
+
+LFM2.5 uses a ChatML-like format. See the [Chat Template documentation](https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example:
+
+```
+<|startoftext|><|im_start|>system
+You are a helpful assistant trained by Liquid AI.<|im_end|>
+<|im_start|>user
+What is C. elegans?<|im_end|>
+<|im_start|>assistant
+```
+
+Because LFM2.5-8B-A1B is a reasoning model, assistant turns contain an explicit chain of thought before the final answer. You can use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format your messages automatically.
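
To make the layout in the example above concrete, here is a small Python sketch that renders a message list into the same ChatML-like string. The function `render_chatml` is a hypothetical helper written for illustration; the template bundled with the tokenizer is the authoritative version and may handle details (tools, reasoning content) that this sketch ignores.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts in the ChatML-like layout shown above.

    Illustrative only: prefer tokenizer.apply_chat_template() in real code.
    """
    parts = ["<|startoftext|>"]
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "What is C. elegans?"},
]
prompt = render_chatml(messages)
```

Printing `prompt` reproduces the example block above, ending with the open assistant turn.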
+
+### Tool Use
+
+LFM2.5 supports function calling in four steps:
+
+1. **Function definition**: Provide the list of tools as a JSON object in the system prompt, or use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) with `tools=...`.
+2. **Function call**: By default, LFM2.5 writes Pythonic function calls (a Python list between `<|tool_call_start|>` and `<|tool_call_end|>` special tokens), as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
+3. **Function execution**: Execute the call and return the result with the `tool` role.
+4. **Final answer**: LFM2.5 interprets the tool output and returns a plain-text answer addressing the original prompt.
+
+See the [Tool Use documentation](https://docs.liquid.ai/lfm/key-concepts/tool-use) for the full guide. Example:
+
+```
+<|startoftext|><|im_start|>system
+List of tools: [{"name": "get_candidate_status", "description": "Retrieves the current status of a candidate in the recruitment process", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "Unique identifier for the candidate"}}, "required": ["candidate_id"]}}]<|im_end|>
+<|im_start|>user
+What is the current status of candidate ID 12345?<|im_end|>
+<|im_start|>assistant
+<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>
+<|im_start|>tool
+[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
+<|im_start|>assistant
+The candidate with ID 12345 is currently in the "Interview Scheduled" stage for the position of Clinical Research Associate, with an interview date set for 2023-11-20.<|im_end|>
+```
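
The function-call step in the transcript above can be handled on the client side roughly as follows. This is a sketch under stated assumptions: `parse_tool_calls` is a hypothetical helper, it only supports keyword arguments with literal values (as in the example), and it parses the call with Python's `ast` module instead of evaluating model output.

```python
import ast
import re

TOOL_CALL_RE = re.compile(
    r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL
)


def parse_tool_calls(assistant_text):
    """Extract (function_name, kwargs) pairs from Pythonic tool calls."""
    calls = []
    for block in TOOL_CALL_RE.findall(assistant_text):
        tree = ast.parse(block.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of calls")
        for node in tree.body.elts:
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                raise ValueError("expected simple function calls")
            if node.args or any(kw.arg is None for kw in node.keywords):
                raise ValueError("only keyword arguments are supported in this sketch")
            # literal_eval accepts AST nodes and rejects anything but literals.
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls


turn = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]'
    "<|tool_call_end|>Checking the current status of candidate ID 12345."
)
calls = parse_tool_calls(turn)
```

Each parsed pair can then be looked up in an application-defined registry of callables, and the result sent back as a message with the `tool` role.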
+
+## 🏃 Inference
+
+LFM2.5-8B-A1B is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.
+
+| Name | Description | Docs | Notebook |
+|------|-------------|------|:--------:|
+| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | <a href="https://docs.liquid.ai/lfm/inference/transformers">Link</a> | <a href="https://colab.research.google.com/drive/1_q3jQ6LtyiuPzFZv7Vw8xSfPU5FwkKZY?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments with GPU. | <a href="https://docs.liquid.ai/lfm/inference/vllm">Link</a> | <a href="https://colab.research.google.com/drive/1VfyscuHP8A3we_YpnzuabYJzr5ju0Mit?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | <a href="https://docs.liquid.ai/lfm/inference/llama-cpp">Link</a> | <a href="https://colab.research.google.com/drive/1ohLl3w47OQZA4ELo46i5E4Z6oGWBAyo8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| [MLX](https://github.com/ml-explore/mlx) | Apple's machine learning framework optimized for Apple Silicon. | <a href="https://docs.liquid.ai/lfm/inference/mlx">Link</a> | — |
+| [LM Studio](https://lmstudio.ai/) | Desktop application for running LLMs locally. | <a href="https://docs.liquid.ai/lfm/inference/lm-studio">Link</a> | — |
+
+Quick start with Transformers (compatible with `transformers>=5.0.0`):
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
+
+model_id = "LiquidAI/LFM2.5-8B-A1B"
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    device_map="auto",
+    dtype="bfloat16",
+#   attn_implementation="flash_attention_2" <- uncomment on compatible GPU
+)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+
+prompt = "What is C. elegans?"
+
+input_ids = tokenizer.apply_chat_template(
+    [{"role": "user", "content": prompt}],
+    add_generation_prompt=True,
+    return_tensors="pt",
+    tokenize=True,
+)["input_ids"].to(model.device)
+
+output = model.generate(
+    input_ids,
+    do_sample=True,
+    temperature=0.2,
+    top_k=80,
+    repetition_penalty=1.05,
+    max_new_tokens=8192,
+    streamer=streamer,
+)
+```
+
+## 🔧 Fine-Tuning
+
+We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.
+
+| Name | Description | Docs | Notebook |
+|------|-------------|------|----------|
+| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for text completion. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/10fm7eNMezs-DSn36mF7vAsNYlOsx9YZO?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for translation. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1gaP8yTle2_v35Um8Gpu9239fqbU7UgY8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1vGRg4ksRj__6OLvXkHhvji_Pamv801Ss?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1j5Hk_SyBb2soUsuhU0eIEA9GwLNRnElF?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1MQdsPxFHeZweGsNx4RH7Ia8lG8PiGE1t?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| GRPO ([Unsloth](https://github.com/unslothai/unsloth)) | GRPO with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1mIikXFaGvcW4vXOZXLbVTxfBRw_XsXa5?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+| GRPO ([TRL](https://github.com/huggingface/trl)) | GRPO with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/github/Liquid4All/cookbook/blob/main/finetuning/notebooks/grpo_for_verifiable_tasks.ipynb"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
+
+## 📊 Performance
+
+### Improvements over LFM2-8B-A1B
+
+Thanks to reasoning, scaled-up pre-training, and large-scale RL, LFM2.5-8B-A1B improves over its predecessor across the board:
+
+| Benchmark | LFM2-8B-A1B | LFM2.5-8B-A1B | Δ |
+| :--- | ---: | ---: | ---: |
+| AA-Omniscience Index | -78.42 | -24.70 | +53.72 |
+| AA-Omniscience Accuracy | 7.33 | 8.67 | +1.34 |
+| AA-Omniscience Non-Hallucination Rate | 7.46 | 63.47 | +56.01 |
+| IFEval | 79.44 | 91.84 | +12.40 |
+| IFBench | 26.00 | 56.47 | +30.47 |
+| Multi-IF | 58.54 | 79.93 | +21.39 |
+| MATH500 | 74.80 | 88.76 | +13.96 |
+| AIME25 | 20.00 | 42.53 | +22.53 |
+| BFCLv3 | 45.07 | 64.36 | +19.29 |
+| BFCLv4 | 25.52 | 48.50 | +22.98 |
+| Tau² Telecom | 13.60 | 88.07 | +74.47 |
+| Tau² Retail | 7.02 | 39.82 | +32.80 |
+
+### Knowledge and instruction following
+
+| Model | Parameters | AA-Omni. Index | AA-Omni. Accuracy | AA-Omni. Non-Halluc. | IFEval | IFBench | Multi-IF |
+| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
+| LFM2.5-8B-A1B | 8B/A1B | -24.70 | 8.67 | 63.47 | 91.84 | 56.47 | 79.93 |
+| Granite-4.0-H-Tiny | 7B/A1B | -75.50 | 9.37 | 6.38 | 82.23 | 21.28 | 59.00 |
+| Qwen3.5-4B | 4B | -51.53 | 17.20 | 16.99 | 87.80 | 50.38 | 67.43 |
+| Qwen3-30B-A3B-Thinking-2507 | 30.5B/3.3B | -51.31 | 18.80 | 13.87 | 90.82 | 51.11 | 79.04 |
+| Gemma-4-E2B-IT | 5.1B | -72 | 7.00 | 15.05 | 82.93 | 33.53 | 69.70 |
+| Gemma-4-E4B-IT | 8B | -50.67 | 8.10 | 36.06 | 87.74 | 39.48 | 77.58 |
+| Gemma-4-26B-A4B-IT | 26B/4B | -62.07 | 14.37 | 10.75 | 91.40 | 47.25 | 82.06 |
+| gpt-oss-20b | 21B/3.6B | -49.17 | 14.57 | 24.50 | 86.73 | 58.65 | 76.64 |
+
+### Math and agentic workflows
+
+| Model | Parameters | MATH500 | AIME25 | AIME26 | BFCLv3 | BFCLv4 | Tau² Telecom | Tau² Retail |
+|---|---|---|---|---|---|---|---|---|
+| LFM2.5-8B-A1B | 8B/A1B | 88.76 | 42.53 | 50.00 | 64.79 | 49.73 | 88.07 | 39.82 |
+| Granite-4.0-H-Tiny | 7B/A1B | 59.20 | 4.93 | 3.33 | 56.89 | 28.52 | 16.67 | 18.42 |
+| Qwen3.5-4B | 4B | 80.76 | 54.28 | 58.33 | 71.06 | 54.01 | 87.72 | 71.93 |
+| Qwen3-30B-A3B-Thinking-2507 | 30.5B/3.3B | 86.48 | 71.67 | 66.67 | 73.39 | 50.53 | 21.93 | 56.14 |
+| Gemma-4-E2B-IT | 5.1B | 64.00 | 26 | 30 | 56.44 | 31.91 | 22.37 | 18.95 |
+| Gemma-4-E4B-IT | 8B | 65.00 | 34.33 | 40.67 | 57.31 | 33.92 | 26.75 | 42.11 |
+
+### CPU Inference
+
+![image](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/yWAChLNCguGTl9lXBL47p.png)
+
+### GPU Inference
+
+LFM2.5-8B-A1B is the fastest model in its size class, reaching **18.5K output tokens per second at high concurrency**, over 1.6B tokens per day on a single H100.
+
+![image](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/LX3oIXQeDm51eaLQs64an.png)
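
As a rough sanity check on the daily figure quoted for GPU inference, the arithmetic can be written out in a few lines of Python. This assumes the rounded rate of 18.5K output tokens per second is sustained for a full 24 hours, which is an idealization.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def tokens_per_day(tokens_per_second):
    """Daily output tokens if the given rate is sustained for 24 hours."""
    return tokens_per_second * SECONDS_PER_DAY


daily = tokens_per_day(18_500)
print(f"{daily:,} tokens/day (~{daily / 1e9:.2f}B)")
# prints "1,598,400,000 tokens/day (~1.60B)"
```

With the rounded rate the result is about 1.6B tokens per day, in line with the quoted figure.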
+
+## 📬 Contact
+
+- Got questions or want to connect? [Join our Discord community](https://discord.com/invite/liquid-ai).
+- If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).
+
+## Citation
+
+```bibtex
+@article{liquidAI20268BA1B,
+  author  = {Liquid AI},
+  title   = {LFM2.5-8B-A1B: Personal Assistant On Your Laptop},
+  journal = {Liquid AI Blog},
+  year    = {2026},
+  note    = {www.liquid.ai/blog/lfm2-5-8b-a1b},
+}
+```
+
+```bibtex
+@article{liquidai2025lfm2,
+  title   = {LFM2 Technical Report},
+  author  = {Liquid AI},
+  journal = {arXiv preprint arXiv:2511.23404},
+  year    = {2025}
+}
+```
+
+<!--End Original Model Card-->
+
+---
+
+# <span id="testllm" style="color: #7F7FFF;">🚀 If you find these models useful</span>
+
+Help me test my **AI-Powered Quantum Network Monitor Assistant** with **quantum-ready security checks**:  
+
+👉 [Quantum Network Monitor](https://readyforquantum.com/?assistant=open&utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme)  
+
+
+The full open-source code for the Quantum Network Monitor service is available in my GitHub repos (those with NetworkMonitor in the name): [Source Code Quantum Network Monitor](https://github.com/Mungert69). You will also find the code I use to quantize the models, if you want to do it yourself, in [GGUFModelBuilder](https://github.com/Mungert69/GGUFModelBuilder).
+
+💬 **How to test**:  
+ Choose an **AI assistant type**:  
+   - `TurboLLM` (GPT-4.1-mini)  
+   - `HugLLM` (Hugging Face open-source models)  
+   - `TestLLM` (Experimental CPU-only)  
+
+### **What I’m Testing**  
+I’m pushing the limits of **small open-source models for AI network monitoring**, specifically:  
+- **Function calling** against live network services  
+- **How small can a model go** while still handling:  
+  - Automated **Nmap security scans**  
+  - **Quantum-readiness checks**  
+  - **Network Monitoring tasks**  
+
+🟡 **TestLLM** – Current experimental model (llama.cpp on 2 CPU threads in a Hugging Face Docker Space):  
+- ✅ **Zero-configuration setup**  
+- ⏳ 30s load time (slow inference but **no API costs**). No token limit, as the cost is low.
+- 🔧 **Help wanted!** If you’re into **edge-device AI**, let’s collaborate!  
+
+### **Other Assistants**  
+🟢 **TurboLLM** – Uses **gpt-4.1-mini**:
+- It performs very well, but unfortunately OpenAI charges per token, so token usage is limited.
+- **Create custom cmd processors to run .NET code on Quantum Network Monitor Agents**
+- **Real-time network diagnostics and monitoring**
+- **Security Audits**
+- **Penetration testing** (Nmap/Metasploit)  
+
+🔵 **HugLLM** – Latest Open-source models:  
+- 🌐 Runs on Hugging Face Inference API. Performs pretty well using the latest models hosted on Novita.
+
+### 💡 **Example commands you could test**:  
+1. `"Give me info on my website's SSL certificate"`  
+2. `"Check if my server is using quantum-safe encryption for communication"`  
+3. `"Run a comprehensive security audit on my server"`
+4. `"Create a cmd processor to .. (whatever you want)"` Note that you need to install a [Quantum Network Monitor Agent](https://readyforquantum.com/Download/?utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme) to run the .NET code on. This is a very flexible and powerful feature. Use with caution!
+
+### Final Word
+
+I fund the servers used to create these model files, run the Quantum Network Monitor service, and pay for inference from Novita and OpenAI—all out of my own pocket. All the code behind the model creation and the Quantum Network Monitor project is [open source](https://github.com/Mungert69). Feel free to use whatever you find helpful.
+
+If you appreciate the work, please consider [buying me a coffee](https://www.buymeacoffee.com/mahadeva) ☕. Your support helps cover service costs and allows me to raise token limits for everyone.
+
+I'm also open to job opportunities or sponsorship.
+
+Thank you! 😊