Initialize project; model provided by the ModelHub XC community

Model: Mungert/LFM2.5-8B-A1B-GGUF
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-17 15:32:17 +08:00
commit 7509cbbc1b
29 changed files with 488 additions and 0 deletions

62
.gitattributes vendored Normal file
View File

@@ -0,0 +1,62 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-f16.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-f16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-bf16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q3_k_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q4_k_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q5_k_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q6_k_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q6_k_m.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q4_0.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q4_1.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q4_0_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q4_1_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q5_0.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q5_1.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q5_0_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-q5_1_l.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-iq3_xs.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-iq3_xxs.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-iq3_s.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-iq3_m.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-iq4_xs.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-mxfp4_moe.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-imatrix.gguf filter=lfs diff=lfs merge=lfs -text
LFM2.5-8B-A1B-bf16.gguf filter=lfs diff=lfs merge=lfs -text

3
LFM2.5-8B-A1B-bf16.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b471037695ef5908cb88065c7ebb23fe285a07950d7433fabf0c865273728ab
size 16947260640

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:674967ea5846e244aaf9c8d720834b9b53b6c126d01d701297c8a9cead386c40
size 15278058720

3
LFM2.5-8B-A1B-f16.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:054a4233007c76e31ee75a99bdb9a7748d29099c98c5b43c17964b62e940fc1f
size 16947260640

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:34f25fc3cfb92d31d371826772b229c861f6986c770401e192b24e12180b9ee4
size 15278058720

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8ae78c675c5a05745b141c4115a7d30c928a03b79fd11c32df3ca1096a5bc49c
size 17375264

3
LFM2.5-8B-A1B-iq3_m.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a5ae4abaaf366468ccd26e7b3d781052fc423938a5a3d26f31880caf9ecce9e8
size 4291800608

3
LFM2.5-8B-A1B-iq3_s.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1db5b78344de6706118b84280ef3993c15dbd9410e6d75f9ce60bee69a8b530f
size 4291800608

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:27a82461476865f19db1c17527f1d7d07a722ade6c76e0142434ed05bb4e6dd7
size 3987533344

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c5d3672095e1c66ac355a308920c07223dc30025c525f60b6b49062c57866396
size 3970625056

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f0138fafdad549b69ea6c064a7e96304c4419776f12e013a8be07bd94bcca602
size 4588301856

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5374ff486246d4d52c86e3e94957d19d72cc04015cfe8fcae1026adfbfeb2c78
size 4892437728

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:94b5104aace0b8cc11215ecc129dfa0d2bd118403de2fb89c763bae910758754
size 4306300448

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e79a2b3cd1c75f1d0a152870189cb563ff6c5c326d42678b934eddc7c9057f65
size 4242812448

3
LFM2.5-8B-A1B-q4_0.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c61d5e4e0ae39e26a942adc1170f3ed779b8899bf2a9fe88b68f7775626c2664
size 4777094688

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d46cd56641f02fcb6ba9fa5ed72035195412734d1c5be732a4466bfa3b737105
size 4908166688

3
LFM2.5-8B-A1B-q4_1.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4cd27656180bf37138ce4d52f10c5a8a17893d6fd6923ff6e975be244b23edf4
size 5306232352

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:db52de524a2ef4434b54f20b73be60650419e4e7e1b8c8ad64e604dc19cfa8f5
size 5420920352

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ad568ac02b76c6e7a718a3bddc08e48d7d1e86262b1014756f7474916566750a
size 5278584352

3
LFM2.5-8B-A1B-q5_0.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4243df7ff259505b0ea2d05a939370cad606ed3bc6af6937ed2ec36153ae6d8e
size 5835370016

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:325fc9c5f294d416b4d90376c021b18271cdc7b09d56d479298b2a5c64a8b568
size 5933674016

3
LFM2.5-8B-A1B-q5_1.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1b4b68d51b1427f36431770f7ce1663a3403f537a25b8d52ff164fe82a2f2b9d
size 6364507680

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c67c57be9648b5cf79f2c78851fa3887c616d5a70142e389378e38d3c4d18fb6
size 6446427680

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1789a2757c9d0b60ca8729b603503d74d52dd45a81959ae90f7cf0961ccba45
size 6227660320

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5e626cdae134c0f41319d107e5a02274e19fafea75508359456dd202d7bb5dac
size 6164172320

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:355c94bf256d89c001dadffdc459eaf3b65fd53803f82b1c950ec7b458159d84
size 7023275552

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:09e8acad5e7c8e4fa40d38b32fd35387e15707f8e12287244a4153557aa3a367
size 6959787552

3
LFM2.5-8B-A1B-q8_0.gguf Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:33ab3b8ce6a964fb8ebac89360c9b3cf72c4fa418d5e4c0a94d46883124d5c02
size 9010195680

345
README.md Normal file
View File

@@ -0,0 +1,345 @@
---
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
- ar
- zh
- fr
- de
- ja
- ko
- es
- pt
pipeline_tag: text-generation
tags:
- liquid
- lfm2.5
- edge
base_model: LiquidAI/LFM2.5-8B-A1B-Base
---
# <span style="color: #7FFF7F;">LFM2.5-8B-A1B GGUF Models</span>
## <span style="color: #7F7FFF;">Model Generation Details</span>
This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`94a220cd6`](https://github.com/ggerganov/llama.cpp/commit/94a220cd6745e6e3f8de62870b66fd5b9bc92700).
---
<a href="https://readyforquantum.com/huggingface_gguf_selection_guide.html" style="color: #7FFF7F;">
Click here to get info on choosing the right GGUF model format
</a>
---
<!--Begin Original Model Card-->
<div align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png"
alt="Liquid AI"
style="width: 100%; max-width: 100%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
/>
<div style="display: flex; justify-content: center; gap: 0.5em; margin-bottom: 1em;">
<a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a>
<a href="https://docs.liquid.ai/lfm/getting-started/welcome"><strong>Docs</strong></a>
<a href="https://leap.liquid.ai/"><strong>LEAP</strong></a>
<a href="https://discord.com/invite/liquid-ai"><strong>Discord</strong></a>
</div>
</div>
# LFM2.5-8B-A1B
LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.
- **On-device personal assistant**: Designed to power real-life applications, chain tool calls, and follow complex instructions on all devices.
- **Compressed performance**: Competitive with much larger dense and MoE models on instruction following and agentic tasks.
- **Unmatched throughput**: Fastest in its size class on both CPU and GPU inference, with day-one support for llama.cpp, MLX, vLLM, and SGLang.
Find more information about LFM2.5-8B-A1B in our [blog post](https://www.liquid.ai/blog/lfm2-5-8b-a1b).
![image](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/qUZVGkns1bg3sZUShBbhv.png)
*AA-Omniscience Index (higher is better) rewards correct answers and penalizes hallucinations. Scores range from -100 to 100. See more results on [Artificial Analysis](https://artificialanalysis.ai/evaluations/omniscience).*
## 🗒️ Model Details
| Model | Parameters | Description |
| --- | --- | --- |
| [LFM2.5-8B-A1B-Base](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-Base) | 8.3B total / 1.5B active | Pre-trained base model for fine-tuning |
| [**LFM2.5-8B-A1B**](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B) | 8.3B total / 1.5B active | Reasoning-tuned general-purpose model |
LFM2.5-8B-A1B is a general-purpose text-only model with the following features:
- **Total parameters**: 8.3B
- **Active parameters**: 1.5B
- **Number of layers**: 24 (18 double-gated LIV conv + 6 GQA)
- **Training budget**: 38 trillion tokens
- **Context length**: 128,000
- **Vocabulary size**: 128,000
- **Languages**: English, Arabic, Chinese, French, German, Italian, Japanese, Korean, Portuguese, Spanish
- **Generation parameters**: We recommend the following parameters:
  - `temperature: 0.2`
  - `top_k: 80`
  - `repetition_penalty: 1.05`
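Since this repository ships GGUF files, the recommended values above can also be passed to llama.cpp. The following is a minimal sketch, assuming the `llama-cli` binary is on your `PATH` and that the `--temp`, `--top-k`, and `--repeat-penalty` flags are the right counterparts of the three parameters; `build_llama_cli_args` is a hypothetical helper, and the file name is one of the quantized files listed in this repository.

```python
import shlex

# Recommended sampling values, copied from the model card.
RECOMMENDED = {"temperature": 0.2, "top_k": 80, "repetition_penalty": 1.05}

# Assumed mapping from the model card's parameter names to llama.cpp CLI flags.
LLAMA_CPP_FLAGS = {
    "temperature": "--temp",
    "top_k": "--top-k",
    "repetition_penalty": "--repeat-penalty",
}


def build_llama_cli_args(model_path, prompt, max_tokens=512, binary="llama-cli"):
    """Build an argv list for llama-cli (hypothetical helper, not part of llama.cpp)."""
    args = [binary, "-m", model_path, "-p", prompt, "-n", str(max_tokens)]
    for name, value in RECOMMENDED.items():
        args += [LLAMA_CPP_FLAGS[name], str(value)]
    return args


args = build_llama_cli_args("LFM2.5-8B-A1B-q4_0.gguf", "What is C. elegans?")
# Print a shell-quoted command line that can be copied into a terminal.
print(shlex.join(args))
```

The list form can be handed to `subprocess.run(args)` directly, which avoids shell-quoting problems with prompts that contain spaces or quotes.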
| Model | Description |
| --- | --- |
| [**LFM2.5-8B-A1B**](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers, vLLM, and SGLang. |
| [LFM2.5-8B-A1B-GGUF](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for edge inference and local deployment. |
| [LFM2.5-8B-A1B-ONNX](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-ONNX) | ONNX Runtime format for cross-platform deployment. |
| [LFM2.5-8B-A1B-MLX](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices. |
We recommend using LFM2.5-8B-A1B for agentic workflows, tool use, structured outputs, multilingual assistants, and on-device personal-assistant applications. It is not the best fit for heavy programming or knowledge-intensive question answering without retrieval.
### Chat Template
LFM2.5 uses a ChatML-like format. See the [Chat Template documentation](https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example:
```
<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
What is C. elegans?<|im_end|>
<|im_start|>assistant
```
Because LFM2.5-8B-A1B is a reasoning model, assistant turns contain an explicit chain of thought before the final answer. You can use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format your messages automatically.
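To make the layout of the example above concrete, here is a minimal sketch that reproduces it by hand. It is an illustration only: `format_chat` is a hypothetical helper, it assumes each turn is wrapped exactly as shown in the example, and it ignores tools and reasoning traces. The template bundled with the tokenizer remains the reference, so prefer `tokenizer.apply_chat_template()` in real code.

```python
def format_chat(messages, add_generation_prompt=True):
    """Render messages in the ChatML-like layout shown in the example (sketch only)."""
    parts = ["<|startoftext|>"]
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "What is C. elegans?"},
]
prompt_text = format_chat(messages)
print(prompt_text)
```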
### Tool Use
LFM2.5 supports function calling in four steps:
1. **Function definition**: Provide the list of tools as a JSON object in the system prompt, or use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) with `tools=...`.
2. **Function call**: By default, LFM2.5 writes Pythonic function calls (a Python list between the `<|tool_call_start|>` and `<|tool_call_end|>` special tokens) as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
3. **Function execution**: Execute the call and return the result with the `tool` role.
4. **Final answer**: LFM2.5 interprets the tool output and returns a plain-text answer addressing the original prompt.
See the [Tool Use documentation](https://docs.liquid.ai/lfm/key-concepts/tool-use) for the full guide. Example:
```
<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "Retrieves the current status of a candidate in the recruitment process", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "Unique identifier for the candidate"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
What is the current status of candidate ID 12345?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
The candidate with ID 12345 is currently in the "Interview Scheduled" stage for the position of Clinical Research Associate, with an interview date set for 2023-11-20.<|im_end|>
```
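Steps 2 and 3 can be sketched as follows: extract the Pythonic call from the assistant text, run the matching function, and wrap the result in a `tool` message. This is a minimal sketch under stated assumptions, not Liquid AI's implementation: `parse_tool_calls`, `TOOLS`, and the stubbed `get_candidate_status` are hypothetical, only keyword arguments with literal values are handled, and the call is parsed with Python's `ast` module rather than evaluated.

```python
import ast
import json

TOOL_START, TOOL_END = "<|tool_call_start|>", "<|tool_call_end|>"


def parse_tool_calls(assistant_text):
    """Return [(function_name, kwargs), ...] from the first tool-call span."""
    start = assistant_text.find(TOOL_START)
    end = assistant_text.find(TOOL_END)
    if start == -1 or end == -1 or end < start:
        return []
    payload = assistant_text[start + len(TOOL_START):end].strip()
    tree = ast.parse(payload, mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a Python list of calls")
    calls = []
    for node in tree.body.elts:
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            raise ValueError("expected plain function calls")
        if node.args or any(kw.arg is None for kw in node.keywords):
            raise ValueError("only keyword arguments are handled in this sketch")
        # literal_eval accepts literals only, so no model-written code is executed.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls


def get_candidate_status(candidate_id):
    # Hypothetical stub returning the record from the example above.
    return {
        "candidate_id": candidate_id,
        "status": "Interview Scheduled",
        "position": "Clinical Research Associate",
        "date": "2023-11-20",
    }


# Registry of callable tools; unknown names raise KeyError instead of running.
TOOLS = {"get_candidate_status": get_candidate_status}

reply = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]'
    "<|tool_call_end|>Checking the current status of candidate ID 12345."
)
calls = parse_tool_calls(reply)
results = [TOOLS[name](**kwargs) for name, kwargs in calls]
tool_message = {"role": "tool", "content": json.dumps(results)}
print(tool_message)
```

Looking functions up in an explicit registry, instead of calling `eval` on the model output, keeps the set of executable operations under your control.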
## 🏃 Inference
LFM2.5-8B-A1B is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.
| Name | Description | Docs | Notebook |
|------|-------------|------|:--------:|
| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | <a href="https://docs.liquid.ai/lfm/inference/transformers">Link</a> | <a href="https://colab.research.google.com/drive/1_q3jQ6LtyiuPzFZv7Vw8xSfPU5FwkKZY?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments with GPU. | <a href="https://docs.liquid.ai/lfm/inference/vllm">Link</a> | <a href="https://colab.research.google.com/drive/1VfyscuHP8A3we_YpnzuabYJzr5ju0Mit?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | <a href="https://docs.liquid.ai/lfm/inference/llama-cpp">Link</a> | <a href="https://colab.research.google.com/drive/1ohLl3w47OQZA4ELo46i5E4Z6oGWBAyo8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [MLX](https://github.com/ml-explore/mlx) | Apple's machine learning framework optimized for Apple Silicon. | <a href="https://docs.liquid.ai/lfm/inference/mlx">Link</a> | — |
| [LM Studio](https://lmstudio.ai/) | Desktop application for running LLMs locally. | <a href="https://docs.liquid.ai/lfm/inference/lm-studio">Link</a> | — |
Quick start with Transformers (compatible with `transformers>=5.0.0`):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-8B-A1B"

# Load the weights in bfloat16 and place them automatically on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2"  <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Stream decoded text to stdout as tokens are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"

# Apply the chat template and move the token IDs to the model's device.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
)["input_ids"].to(model.device)

# Generate with the recommended sampling parameters.
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.2,
    top_k=80,
    repetition_penalty=1.05,
    max_new_tokens=8192,
    streamer=streamer,
)
```
## 🔧 Fine-Tuning
We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.
| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for text completion. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/10fm7eNMezs-DSn36mF7vAsNYlOsx9YZO?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for translation. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1gaP8yTle2_v35Um8Gpu9239fqbU7UgY8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1vGRg4ksRj__6OLvXkHhvji_Pamv801Ss?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1j5Hk_SyBb2soUsuhU0eIEA9GwLNRnElF?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1MQdsPxFHeZweGsNx4RH7Ia8lG8PiGE1t?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| GRPO ([Unsloth](https://github.com/unslothai/unsloth)) | GRPO with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1mIikXFaGvcW4vXOZXLbVTxfBRw_XsXa5?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| GRPO ([TRL](https://github.com/huggingface/trl)) | GRPO with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/github/Liquid4All/cookbook/blob/main/finetuning/notebooks/grpo_for_verifiable_tasks.ipynb"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
## 📊 Performance
### Improvements over LFM2-8B-A1B
Thanks to reasoning, scaled-up pre-training, and large-scale RL, LFM2.5-8B-A1B improves over its predecessor across the board:
| Benchmark | LFM2-8B-A1B | LFM2.5-8B-A1B | Δ |
| :--- | ---: | ---: | ---: |
| AA-Omniscience Index | -78.42 | -24.70 | +53.72 |
| AA-Omniscience Accuracy | 7.33 | 8.67 | +1.34 |
| AA-Omniscience Non-Hallucination Rate | 7.46 | 63.47 | +56.01 |
| IFEval | 79.44 | 91.84 | +12.40 |
| IFBench | 26.00 | 56.47 | +30.47 |
| Multi-IF | 58.54 | 79.93 | +21.39 |
| MATH500 | 74.80 | 88.76 | +13.96 |
| AIME25 | 20.00 | 42.53 | +22.53 |
| BFCLv3 | 45.07 | 64.36 | +19.29 |
| BFCLv4 | 25.52 | 48.50 | +22.98 |
| Tau² Telecom | 13.60 | 88.07 | +74.47 |
| Tau² Retail | 7.02 | 39.82 | +32.80 |
### Knowledge and instruction following
| Model | Parameters | AA-Omni. Index | AA-Omni. Accuracy | AA-Omni. Non-Halluc. | IFEval | IFBench | Multi-IF |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| LFM2.5-8B-A1B | 8B/A1B | -24.70 | 8.67 | 63.47 | 91.84 | 56.47 | 79.93 |
| Granite-4.0-H-Tiny | 7B/A1B | -75.50 | 9.37 | 6.38 | 82.23 | 21.28 | 59.00 |
| Qwen3.5-4B | 4B | -51.53 | 17.20 | 16.99 | 87.80 | 50.38 | 67.43 |
| Qwen3-30B-A3B-Thinking-2507 | 30.5B/3.3B | -51.31 | 18.80 | 13.87 | 90.82 | 51.11 | 79.04 |
| Gemma-4-E2B-IT | 5.1B | -72 | 7.00 | 15.05 | 82.93 | 33.53 | 69.70 |
| Gemma-4-E4B-IT | 8B | -50.67 | 8.10 | 36.06 | 87.74 | 39.48 | 77.58 |
| Gemma-4-26B-A4B-IT | 26B/4B | -62.07 | 14.37 | 10.75 | 91.40 | 47.25 | 82.06 |
| gpt-oss-20b | 21B/3.6B | -49.17 | 14.57 | 24.50 | 86.73 | 58.65 | 76.64 |
### Math and agentic workflows
| Model | Parameters | MATH500 | AIME25 | AIME26 | BFCLv3 | BFCLv4 | Tau² Telecom | Tau² Retail |
|---|---|---|---|---|---|---|---|---|
| LFM2.5-8B-A1B | 8B/A1B | 88.76 | 42.53 | 50.00 | 64.79 | 49.73 | 88.07 | 39.82 |
| Granite-4.0-H-Tiny | 7B/A1B | 59.20 | 4.93 | 3.33 | 56.89 | 28.52 | 16.67 | 18.42 |
| Qwen3.5-4B | 4B | 80.76 | 54.28 | 58.33 | 71.06 | 54.01 | 87.72 | 71.93 |
| Qwen3-30B-A3B-Thinking-2507 | 30.5B/3.3B | 86.48 | 71.67 | 66.67 | 73.39 | 50.53 | 21.93 | 56.14 |
| Gemma-4-E2B-IT | 5.1B | 64.00 | 26 | 30 | 56.44 | 31.91 | 22.37 | 18.95 |
| Gemma-4-E4B-IT | 8B | 65.00 | 34.33 | 40.67 | 57.31 | 33.92 | 26.75 | 42.11 |
### CPU Inference
![image](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/yWAChLNCguGTl9lXBL47p.png)
### GPU Inference
LFM2.5-8B-A1B is the fastest model in its size class, reaching **18.5K output tokens per second at high concurrency**, over 1.6B tokens per day on a single H100.
![image](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/LX3oIXQeDm51eaLQs64an.png)
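As a quick sanity check on the daily figure, the per-second rate can be scaled to a full day. This is only a back-of-the-envelope sketch: it uses the rounded 18.5K value quoted above and assumes the rate is sustained for 24 hours.

```python
# Rounded throughput quoted in the text (output tokens per second on one H100).
tokens_per_second = 18_500
seconds_per_day = 24 * 60 * 60

# Assumes the rate is sustained for a full day.
tokens_per_day = tokens_per_second * seconds_per_day
print(f"{tokens_per_day / 1e9:.2f}B tokens per day")
```

With the rounded input this comes to roughly 1.6B tokens per day, in line with the number stated above.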
## 📬 Contact
- Got questions or want to connect? [Join our Discord community](https://discord.com/invite/liquid-ai).
- If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).
## Citation
```bibtex
@article{liquidAI20268BA1B,
author = {Liquid AI},
title = {LFM2.5-8B-A1B: Personal Assistant On Your Laptop},
journal = {Liquid AI Blog},
year = {2026},
note = {www.liquid.ai/blog/lfm2-5-8b-a1b},
}
```
```bibtex
@article{liquidai2025lfm2,
title = {LFM2 Technical Report},
author = {Liquid AI},
journal = {arXiv preprint arXiv:2511.23404},
year = {2025}
}
```
<!--End Original Model Card-->
---
# <span id="testllm" style="color: #7F7FFF;">🚀 If you find these models useful</span>
Help me test my **AI-Powered Quantum Network Monitor Assistant** with **quantum-ready security checks**:
👉 [Quantum Network Monitor](https://readyforquantum.com/?assistant=open&utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme)
The full open-source code for the Quantum Network Monitor service is available in my GitHub repos (those with NetworkMonitor in the name): [Source Code Quantum Network Monitor](https://github.com/Mungert69). You will also find the code I use to quantize the models, if you want to do it yourself: [GGUFModelBuilder](https://github.com/Mungert69/GGUFModelBuilder).
💬 **How to test**:
Choose an **AI assistant type**:
- `TurboLLM` (GPT-4.1-mini)
- `HugLLM` (Hugging Face open-source models)
- `TestLLM` (Experimental CPU-only)
### **What I'm Testing**
I'm pushing the limits of **small open-source models for AI network monitoring**, specifically:
- **Function calling** against live network services
- **How small can a model go** while still handling:
  - Automated **Nmap security scans**
  - **Quantum-readiness checks**
  - **Network Monitoring tasks**
🟡 **TestLLM**: Current experimental model (llama.cpp on 2 CPU threads in a Hugging Face Docker Space):
- **Zero-configuration setup**
- ⏳ 30s load time (slow inference but **no API costs**). No token limit, as the cost is low.
- 🔧 **Help wanted!** If you're into **edge-device AI**, let's collaborate!
### **Other Assistants**
🟢 **TurboLLM**: Uses **gpt-4.1-mini**:
- It performs very well, but unfortunately OpenAI charges per token, so token usage is limited.
- **Create custom cmd processors to run .NET code on Quantum Network Monitor Agents**
- **Real-time network diagnostics and monitoring**
- **Security Audits**
- **Penetration testing** (Nmap/Metasploit)
🔵 **HugLLM**: Latest open-source models:
- 🌐 Runs on the Hugging Face Inference API. Performs pretty well using the latest models hosted on Novita.
### 💡 **Example commands you could test**:
1. `"Give me info on my website's SSL certificate"`
2. `"Check if my server is using quantum-safe encryption for communication"`
3. `"Run a comprehensive security audit on my server"`
4. `"Create a cmd processor to .. (whatever you want)"` Note that you need to install a [Quantum Network Monitor Agent](https://readyforquantum.com/Download/?utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme) to run the .NET code on. This is a very flexible and powerful feature. Use with caution!
### Final Word
I fund the servers used to create these model files, run the Quantum Network Monitor service, and pay for inference from Novita and OpenAI—all out of my own pocket. All the code behind the model creation and the Quantum Network Monitor project is [open source](https://github.com/Mungert69). Feel free to use whatever you find helpful.
If you appreciate the work, please consider [buying me a coffee](https://www.buymeacoffee.com/mahadeva) ☕. Your support helps cover service costs and allows me to raise token limits for everyone.
I'm also open to job opportunities or sponsorship.
Thank you! 😊