From 567b2f226d0f78e8b3201185787f104d0d82218e Mon Sep 17 00:00:00 2001 From: Mungert Date: Wed, 24 Sep 2025 15:45:03 +0000 Subject: [PATCH] Super-squash history to reclaim storage --- .gitattributes | 52 ++++++ LFM2-2.6B-bf16.gguf | 3 + LFM2-2.6B-bf16_q8_0.gguf | 3 + LFM2-2.6B-f16_q8_0.gguf | 3 + LFM2-2.6B-imatrix.gguf | 3 + LFM2-2.6B-iq3_m.gguf | 3 + LFM2-2.6B-iq3_xxs.gguf | 3 + LFM2-2.6B-iq4_nl.gguf | 3 + LFM2-2.6B-iq4_xs.gguf | 3 + LFM2-2.6B-q3_k_m.gguf | 3 + LFM2-2.6B-q3_k_s.gguf | 3 + LFM2-2.6B-q4_0.gguf | 3 + LFM2-2.6B-q4_1.gguf | 3 + LFM2-2.6B-q4_k_m.gguf | 3 + LFM2-2.6B-q4_k_s.gguf | 3 + LFM2-2.6B-q5_0.gguf | 3 + LFM2-2.6B-q5_1.gguf | 3 + LFM2-2.6B-q8_0.gguf | 3 + README.md | 351 +++++++++++++++++++++++++++++++++++++++ 19 files changed, 454 insertions(+) create mode 100644 .gitattributes create mode 100644 LFM2-2.6B-bf16.gguf create mode 100644 LFM2-2.6B-bf16_q8_0.gguf create mode 100644 LFM2-2.6B-f16_q8_0.gguf create mode 100644 LFM2-2.6B-imatrix.gguf create mode 100644 LFM2-2.6B-iq3_m.gguf create mode 100644 LFM2-2.6B-iq3_xxs.gguf create mode 100644 LFM2-2.6B-iq4_nl.gguf create mode 100644 LFM2-2.6B-iq4_xs.gguf create mode 100644 LFM2-2.6B-q3_k_m.gguf create mode 100644 LFM2-2.6B-q3_k_s.gguf create mode 100644 LFM2-2.6B-q4_0.gguf create mode 100644 LFM2-2.6B-q4_1.gguf create mode 100644 LFM2-2.6B-q4_k_m.gguf create mode 100644 LFM2-2.6B-q4_k_s.gguf create mode 100644 LFM2-2.6B-q5_0.gguf create mode 100644 LFM2-2.6B-q5_1.gguf create mode 100644 LFM2-2.6B-q8_0.gguf create mode 100644 README.md diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..311eff7 --- /dev/null +++ b/.gitattributes @@ -0,0 +1,52 @@ +*.7z filter=lfs diff=lfs merge=lfs -text +*.arrow filter=lfs diff=lfs merge=lfs -text +*.bin filter=lfs diff=lfs merge=lfs -text +*.bz2 filter=lfs diff=lfs merge=lfs -text +*.ckpt filter=lfs diff=lfs merge=lfs -text +*.ftz filter=lfs diff=lfs merge=lfs -text +*.gz filter=lfs diff=lfs merge=lfs -text +*.h5 filter=lfs diff=lfs merge=lfs -text +*.joblib filter=lfs diff=lfs merge=lfs -text +*.lfs.* filter=lfs diff=lfs merge=lfs -text +*.mlmodel filter=lfs diff=lfs merge=lfs -text +*.model filter=lfs diff=lfs merge=lfs -text +*.msgpack filter=lfs diff=lfs merge=lfs -text +*.npy filter=lfs diff=lfs merge=lfs -text +*.npz filter=lfs diff=lfs merge=lfs -text +*.onnx filter=lfs diff=lfs merge=lfs -text +*.ot filter=lfs diff=lfs merge=lfs -text +*.parquet filter=lfs diff=lfs merge=lfs -text +*.pb filter=lfs diff=lfs merge=lfs -text +*.pickle filter=lfs diff=lfs merge=lfs -text +*.pkl filter=lfs diff=lfs merge=lfs -text +*.pt filter=lfs diff=lfs merge=lfs -text +*.pth filter=lfs diff=lfs merge=lfs -text +*.rar filter=lfs diff=lfs merge=lfs -text +*.safetensors filter=lfs diff=lfs merge=lfs -text +saved_model/**/* filter=lfs diff=lfs merge=lfs -text +*.tar.* filter=lfs diff=lfs merge=lfs -text +*.tar filter=lfs diff=lfs merge=lfs -text +*.tflite filter=lfs diff=lfs merge=lfs -text +*.tgz filter=lfs diff=lfs merge=lfs -text +*.wasm filter=lfs diff=lfs merge=lfs -text +*.xz filter=lfs diff=lfs merge=lfs -text +*.zip filter=lfs diff=lfs merge=lfs -text +*.zst filter=lfs diff=lfs merge=lfs -text +*tfevents* filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-f16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-bf16_q8_0.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q3_k_s.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q4_k_s.gguf 
filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q8_0.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q4_0.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q4_1.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q5_0.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-q5_1.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-iq3_xxs.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-iq3_m.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-iq4_xs.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-iq4_nl.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-imatrix.gguf filter=lfs diff=lfs merge=lfs -text +LFM2-2.6B-bf16.gguf filter=lfs diff=lfs merge=lfs -text diff --git a/LFM2-2.6B-bf16.gguf b/LFM2-2.6B-bf16.gguf new file mode 100644 index 0000000..651680d --- /dev/null +++ b/LFM2-2.6B-bf16.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28e226de0bd9a872fec8ae0a9335c0cf946d56e2afc7b05572f58acf0ea7bc4c +size 5141459296 diff --git a/LFM2-2.6B-bf16_q8_0.gguf b/LFM2-2.6B-bf16_q8_0.gguf new file mode 100644 index 0000000..ef870c9 --- /dev/null +++ b/LFM2-2.6B-bf16_q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:074fe6ffdd2c63a01d975545ba9830da616a67d070daeefcff26272135e0254f +size 3517477216 diff --git a/LFM2-2.6B-f16_q8_0.gguf b/LFM2-2.6B-f16_q8_0.gguf new file mode 100644 index 0000000..134aa5c --- /dev/null +++ b/LFM2-2.6B-f16_q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:122d0f36c3cda776857a78c27c7bb36fb3ac02843bc87fcd41d634d50a4ca3ff +size 3517477216 diff --git a/LFM2-2.6B-imatrix.gguf b/LFM2-2.6B-imatrix.gguf new file mode 100644 index 0000000..416f1d6 --- /dev/null +++ b/LFM2-2.6B-imatrix.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:734313544e7b6fff938e15cb231ec0d5644737a8a00f53eb26f70a1d98033d73 +size 2430912 diff --git a/LFM2-2.6B-iq3_m.gguf b/LFM2-2.6B-iq3_m.gguf new file mode 100644 index 0000000..4627144 --- /dev/null +++ b/LFM2-2.6B-iq3_m.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8dc3fa7dadb9f039dc13a97b0407d0510af6d4d6558a173a2dc051a392753aa4 +size 1214902944 diff --git a/LFM2-2.6B-iq3_xxs.gguf b/LFM2-2.6B-iq3_xxs.gguf new file mode 100644 index 0000000..4baaeb2 --- /dev/null +++ b/LFM2-2.6B-iq3_xxs.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:123fe0ab58a778ca7f39311c137bb4288136ea46b299eab56741a7720bc5f1f6 +size 1071723168 diff --git a/LFM2-2.6B-iq4_nl.gguf b/LFM2-2.6B-iq4_nl.gguf new file mode 100644 index 0000000..1df9d99 --- /dev/null +++ b/LFM2-2.6B-iq4_nl.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8b36199f467c886f9784bad7d9a45146dce0c8c803a2bd12e9f4f7c8548e1d5b +size 1448506016 diff --git a/LFM2-2.6B-iq4_xs.gguf b/LFM2-2.6B-iq4_xs.gguf new file mode 100644 index 0000000..f7d26cc --- /dev/null +++ b/LFM2-2.6B-iq4_xs.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db7ad5a67b2cea2370a51ba83e1855fa7d9329951f7a1dd61ea09867e5463651 +size 1407021728 diff --git a/LFM2-2.6B-q3_k_m.gguf b/LFM2-2.6B-q3_k_m.gguf new file mode 100644 index 0000000..66e5f7c --- /dev/null +++ b/LFM2-2.6B-q3_k_m.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5fcbc0b70156e231e33ca9eb3f4a333b602733ebe11de85f7033ae99738bc650 +size 1234432672 diff --git a/LFM2-2.6B-q3_k_s.gguf b/LFM2-2.6B-q3_k_s.gguf new file mode 100644 index 0000000..bf5e13c --- /dev/null +++ b/LFM2-2.6B-q3_k_s.gguf @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:df5880540b8571828dfb652d7b81059289a8f431ddc2ff781a38a118959ac145 +size 1217655456 diff --git a/LFM2-2.6B-q4_0.gguf b/LFM2-2.6B-q4_0.gguf new file mode 100644 index 0000000..69a3a7e --- /dev/null +++ b/LFM2-2.6B-q4_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:763c3ad96e9b64424100604e3569be49d8661d044d2f50f415e168e23cbbb693 +size 1448506016 diff --git a/LFM2-2.6B-q4_1.gguf b/LFM2-2.6B-q4_1.gguf new file mode 100644 index 0000000..26afab9 --- /dev/null +++ b/LFM2-2.6B-q4_1.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a2f613805dedbfcc16918ce664a7beb5ca5741a0074a0cce1659a979268a0e7c +size 1609069216 diff --git a/LFM2-2.6B-q4_k_m.gguf b/LFM2-2.6B-q4_k_m.gguf new file mode 100644 index 0000000..9cc2654 --- /dev/null +++ b/LFM2-2.6B-q4_k_m.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:41db09564ddc32e8b366c2e9147944a9f54fc83b5d8abce830af35fbd456ef89 +size 1546891936 diff --git a/LFM2-2.6B-q4_k_s.gguf b/LFM2-2.6B-q4_k_s.gguf new file mode 100644 index 0000000..3ae0162 --- /dev/null +++ b/LFM2-2.6B-q4_k_s.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f41bf32bf3a11908f900b373c9222ae9a0a19b71841b15bbbdd73037c3931ccd +size 1483502240 diff --git a/LFM2-2.6B-q5_0.gguf b/LFM2-2.6B-q5_0.gguf new file mode 100644 index 0000000..89cea8d --- /dev/null +++ b/LFM2-2.6B-q5_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:874c626f261d5ddf091567141fab27f042d5a60e68544ff0d34f96f3094d392b +size 1769632416 diff --git a/LFM2-2.6B-q5_1.gguf b/LFM2-2.6B-q5_1.gguf new file mode 100644 index 0000000..af7ae83 --- /dev/null +++ b/LFM2-2.6B-q5_1.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a01b664f9f7db9fee159ed63793ab3d6a1f031c6e3dd39b5c621e4348c7f4242 +size 1930195616 diff --git a/LFM2-2.6B-q8_0.gguf b/LFM2-2.6B-q8_0.gguf new file mode 100644 index 0000000..33f9904 --- /dev/null +++ b/LFM2-2.6B-q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c4558fd0adcd86bda2dc414acc45c0348eca29662d33f50cbd475fba7e612567 +size 2733011296 diff --git a/README.md b/README.md new file mode 100644 index 0000000..198f39c --- /dev/null +++ b/README.md @@ -0,0 +1,351 @@

---
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
- ar
- zh
- fr
- de
- ja
- ko
- es
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- edge
---

# LFM2-2.6B GGUF Models

## Model Generation Details

This model was generated using [llama.cpp](https://github.com/ggerganov/llama.cpp) at commit [`f505bd83`](https://github.com/ggerganov/llama.cpp/commit/f505bd83ca7a43c4585ff3d59135e77eae9c793b).

---
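For reference, a typical llama.cpp flow for producing files like the ones in this repo looks roughly as follows. This is a hedged sketch, not the exact pipeline used here (the author's actual tooling is published as GGUFModelBuilder, linked at the end of this card): the local checkpoint path is a placeholder, the quant type is just one of the variants present in this repo, and the repo's `LFM2-2.6B-imatrix.gguf` is assumed to be a llama.cpp importance matrix.

```bash
# Hedged sketch of the usual llama.cpp quantization flow.
# Assumptions: a local copy of the HF checkpoint at ./LFM2-2.6B and a
# llama.cpp build at this commit or newer (with LFM2 support).

# 1) Convert the Hugging Face checkpoint to a bf16 GGUF.
python convert_hf_to_gguf.py ./LFM2-2.6B --outtype bf16 --outfile LFM2-2.6B-bf16.gguf

# 2) Quantize, optionally guided by an importance matrix.
./llama-quantize --imatrix LFM2-2.6B-imatrix.gguf \
  LFM2-2.6B-bf16.gguf LFM2-2.6B-q4_k_m.gguf q4_k_m
```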
# LFM2-2.6B

LFM2 is a new generation of hybrid models developed by [Liquid AI](https://www.liquid.ai/), specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.

We're releasing the weights of four post-trained checkpoints with 350M, 700M, 1.2B, and 2.6B parameters. They provide the following key features to create AI-powered edge applications:

* **Fast training & inference** – LFM2 achieves 3x faster training compared to its previous generation. It also benefits from 2x faster decode and prefill speed on CPU compared to Qwen3.
* **Best performance** – LFM2 outperforms similarly-sized models across multiple benchmark categories, including knowledge, mathematics, instruction following, and multilingual capabilities.
* **New architecture** – LFM2 is a new hybrid Liquid model with multiplicative gates and short convolutions.
* **Flexible deployment** – LFM2 runs efficiently on CPU, GPU, and NPU hardware for flexible deployment on smartphones, laptops, or vehicles.

Find more information about LFM2 in our [blog post](https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models).

## 📄 Model details

Due to their small size, **we recommend fine-tuning LFM2 models on narrow use cases** to maximize performance.
They are particularly suited for agentic tasks, data extraction, RAG, creative writing, and multi-turn conversations.
However, we do not recommend using them for tasks that are knowledge-intensive or require programming skills.

| Property | [**LFM2-350M**](https://huggingface.co/LiquidAI/LFM2-350M) | [**LFM2-700M**](https://huggingface.co/LiquidAI/LFM2-700M) | [**LFM2-1.2B**](https://huggingface.co/LiquidAI/LFM2-1.2B) | [**LFM2-2.6B**](https://huggingface.co/LiquidAI/LFM2-2.6B) |
| ------------------- | ----------------------------- | ----------------------------- | ----------------------------- | ----------------------------- |
| **Parameters** | 354,483,968 | 742,489,344 | 1,170,340,608 | 2,569,272,320 |
| **Layers** | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 16 (10 conv + 6 attn) | 30 (22 conv + 8 attn) |
| **Context length** | 32,768 tokens | 32,768 tokens | 32,768 tokens | 32,768 tokens |
| **Vocabulary size** | 65,536 | 65,536 | 65,536 | 65,536 |
| **Precision** | bfloat16 | bfloat16 | bfloat16 | bfloat16 |
| **Training budget** | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens | 10 trillion tokens |
| **License** | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 | LFM Open License v1.0 |

**Supported languages**: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

**Generation parameters**: We recommend the following parameters:
* `temperature=0.3`
* `min_p=0.15`
* `repetition_penalty=1.05`

**Chat template**: LFM2 uses a ChatML-like chat template:

```
<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
What is C. elegans?<|im_end|>
<|im_start|>assistant
It's a tiny nematode that lives in temperate soil environments.<|im_end|>
```

You can apply it automatically using the dedicated [`.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#applychattemplate) function from Hugging Face transformers, as in the sketch below.
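For example, here is a minimal sketch of rendering that template with transformers. The messages are placeholders, and `tokenize=False` is used purely to inspect the resulting prompt string:

```python
# Minimal sketch: render the LFM2 chat template to a string for inspection.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2-2.6B")

messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "What is C. elegans?"},
]

# tokenize=False returns the raw prompt string instead of token ids, so you
# can verify it matches the ChatML-like format shown above.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```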
**Tool use**: It consists of four main steps:
1. **Function definition**: LFM2 takes JSON function definitions as input (JSON objects between `<|tool_list_start|>` and `<|tool_list_end|>` special tokens), usually in the system prompt.
2. **Function call**: LFM2 writes Pythonic function calls (a Python list between `<|tool_call_start|>` and `<|tool_call_end|>` special tokens) as the assistant answer.
3. **Function execution**: The function call is executed and the result is returned (a string between `<|tool_response_start|>` and `<|tool_response_end|>` special tokens) as a "tool" role.
4. **Final answer**: LFM2 interprets the outcome of the function call to address the original user prompt in plain text.

Here is a simple example of a conversation using tool use:

```
<|startoftext|><|im_start|>system
List of tools: <|tool_list_start|>[{"name": "get_candidate_status", "description": "Retrieves the current status of a candidate in the recruitment process", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "Unique identifier for the candidate"}}, "required": ["candidate_id"]}}]<|tool_list_end|><|im_end|>
<|im_start|>user
What is the current status of candidate ID 12345?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>
<|im_start|>tool
<|tool_response_start|>{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}<|tool_response_end|><|im_end|>
<|im_start|>assistant
The candidate with ID 12345 is currently in the "Interview Scheduled" stage for the position of Clinical Research Associate, with an interview date set for 2023-11-20.<|im_end|>
```
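If you build these prompts with transformers rather than by hand, `apply_chat_template` also accepts a `tools` argument. A hedged sketch follows; it assumes the tokenizer's bundled chat template renders the tool list between the `<|tool_list_start|>` and `<|tool_list_end|>` tokens shown above:

```python
# Hedged sketch: pass a JSON tool definition through the chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2-2.6B")

tools = [{
    "name": "get_candidate_status",
    "description": "Retrieves the current status of a candidate in the recruitment process",
    "parameters": {
        "type": "object",
        "properties": {
            "candidate_id": {
                "type": "string",
                "description": "Unique identifier for the candidate",
            }
        },
        "required": ["candidate_id"],
    },
}]

messages = [{"role": "user", "content": "What is the current status of candidate ID 12345?"}]

# Inspect the rendered prompt; it should contain the tool list between the
# <|tool_list_start|> and <|tool_list_end|> special tokens.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
)
print(prompt)
```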
**Architecture**: Hybrid model with multiplicative gates and short convolutions. The 2.6B checkpoint stacks 22 double-gated short-range LIV convolution blocks and 8 grouped query attention (GQA) blocks, matching the layer counts in the table above.

**Pre-training mixture**: Approximately 75% English, 20% multilingual, and 5% code data sourced from the web and licensed materials.

**Training approach**:
* Very large-scale SFT on a mixture of 50% downstream tasks and 50% general domains
* Custom DPO with length normalization and semi-online datasets
* Iterative model merging

## 🏃 How to run LFM2

### 1. Transformers

To run LFM2, you need to install Hugging Face [`transformers`](https://github.com/huggingface/transformers) v4.55 or a more recent version:

```bash
pip install -U transformers
```

Here is an example of how to generate an answer with transformers in Python:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_id = "LiquidAI/LFM2-2.6B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # <- uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Generate answer
prompt = "What is C. elegans?"
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
)

print(tokenizer.decode(output[0], skip_special_tokens=False))

# <|startoftext|><|im_start|>user
# What is C. elegans?<|im_end|>
# <|im_start|>assistant
# C. elegans, also known as Caenorhabditis elegans, is a small, free-living
# nematode worm (roundworm) that belongs to the phylum Nematoda.
```

You can directly run and test the model with this [Colab notebook](https://colab.research.google.com/drive/1_q3jQ6LtyiuPzFZv7Vw8xSfPU5FwkKZY?usp=sharing).

### 2. vLLM

You need to install [`vLLM`](https://github.com/vllm-project/vllm) v0.10.2 or a more recent version:

```bash
uv pip install vllm==0.10.2 --extra-index-url https://wheels.vllm.ai/0.10.2/ --torch-backend=auto
```

Here is an example of how to use it for inference:

```python
from vllm import LLM, SamplingParams

prompts = [
    "What is C. elegans?",
    "Say hi in JSON format",
    "Define AI in Spanish",
]
sampling_params = SamplingParams(temperature=0.3, min_p=0.15, repetition_penalty=1.05)

llm = LLM(model="LiquidAI/LFM2-2.6B")

outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```

### 3. llama.cpp

You can run LFM2 with llama.cpp using its [GGUF checkpoint](https://huggingface.co/LiquidAI/LFM2-2.6B-GGUF), such as the quantized files in this repository. Find more information in the model card.
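As a minimal sketch, assuming a recent llama.cpp build with LFM2 support, using one of the quantized files from this repository and the recommended sampling settings:

```bash
# Hedged sketch: chat with a quantized LFM2 GGUF via llama.cpp's CLI.
llama-cli -m LFM2-2.6B-q4_k_m.gguf \
  --temp 0.3 --min-p 0.15 --repeat-penalty 1.05 \
  -p "What is C. elegans?"
```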
## 🔧 How to fine-tune LFM2

We recommend fine-tuning LFM2 models on your use cases to maximize performance.

| Notebook | Description | Link |
|-------|------|------|
| SFT (Unsloth) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using Unsloth. | Colab link |
| SFT (Axolotl) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using Axolotl. | Colab link |
| SFT (TRL) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using TRL. | Colab link |
| DPO (TRL) | Preference alignment with Direct Preference Optimization (DPO) using TRL. | Colab link |

## 📈 Performance

LFM2 outperforms similarly-sized models across different evaluation categories. For consistency, we only report scores from instruct variants in non-thinking mode.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/xQNi_QoqAWBB1vg2XR3Rh.png)

| Model | MMLU | GPQA | IFEval | IFBench | GSM8K | MGSM | MMMLU |
| ---------------------- | ----- | ----- | ------ | ------- | ----- | ----- | ----- |
| LFM2-2.6B | 64.42 | 26.57 | 79.56 | 22.19 | 82.41 | 74.32 | 55.39 |
| Llama-3.2-3B-Instruct | 60.35 | 30.6 | 71.43 | 20.78 | 75.21 | 61.68 | 47.92 |
| SmolLM3-3B | 59.84 | 26.31 | 72.44 | 17.93 | 81.12 | 68.72 | 50.02 |
| gemma-3-4b-it | 58.35 | 29.51 | 76.85 | 23.53 | 89.92 | 87.28 | 50.14 |
| Qwen3-4B-Instruct-2507 | 72.25 | 34.85 | 85.62 | 30.28 | 68.46 | 81.76 | 60.67 |

## 📬 Contact

If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).

---

# 🚀 If you find these models useful

Help me test my **AI-Powered Quantum Network Monitor Assistant** with **quantum-ready security checks**:

👉 [Quantum Network Monitor](https://readyforquantum.com/?assistant=open&utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme)

The full open-source code for the Quantum Network Monitor service is available in my GitHub repos (repos with "NetworkMonitor" in the name): [Source Code Quantum Network Monitor](https://github.com/Mungert69). You will also find the code I use to quantize the models if you want to do it yourself: [GGUFModelBuilder](https://github.com/Mungert69/GGUFModelBuilder)

💬 **How to test**:
Choose an **AI assistant type**:
- `TurboLLM` (GPT-4.1-mini)
- `HugLLM` (Hugging Face open-source models)
- `TestLLM` (Experimental CPU-only)

### **What I'm Testing**
I'm pushing the limits of **small open-source models for AI network monitoring**, specifically:
- **Function calling** against live network services
- **How small can a model go** while still handling:
  - Automated **Nmap security scans**
  - **Quantum-readiness checks**
  - **Network monitoring tasks**

🟡 **TestLLM** – Current experimental model (llama.cpp on 2 CPU threads in a Hugging Face Docker space):
- ✅ **Zero-configuration setup**
- ⏳ 30s load time (slow inference but **no API costs**). No token limit, since the cost is low.
- 🔧 **Help wanted!** If you're into **edge-device AI**, let's collaborate!

### **Other Assistants**
🟢 **TurboLLM** – Uses **gpt-4.1-mini**:
- It performs very well, but unfortunately OpenAI charges per token, so token usage is limited.
- **Create custom cmd processors to run .NET code on Quantum Network Monitor Agents**
- **Real-time network diagnostics and monitoring**
- **Security audits**
- **Penetration testing** (Nmap/Metasploit)

🔵 **HugLLM** – Latest open-source models:
- 🌐 Runs on the Hugging Face Inference API. Performs pretty well using the latest models hosted on Novita.

### 💡 **Example commands you could test**:
1. `"Give me info on my website's SSL certificate"`
2. `"Check if my server is using quantum-safe encryption for communication"`
3. `"Run a comprehensive security audit on my server"`
4. `"Create a cmd processor to .. (whatever you want)"` – Note that you need to install a [Quantum Network Monitor Agent](https://readyforquantum.com/Download/?utm_source=huggingface&utm_medium=referral&utm_campaign=huggingface_repo_readme) to run the .NET code on. This is a very flexible and powerful feature. Use with caution!

### Final Word

I fund the servers used to create these model files, run the Quantum Network Monitor service, and pay for inference from Novita and OpenAI, all out of my own pocket. All the code behind the model creation and the Quantum Network Monitor project is [open source](https://github.com/Mungert69). Feel free to use whatever you find helpful.

If you appreciate the work, please consider [buying me a coffee](https://www.buymeacoffee.com/mahadeva) ☕. Your support helps cover service costs and allows me to raise token limits for everyone.

I'm also open to job opportunities or sponsorship.

Thank you! 😊