commit 1e3bc228ccd504c17e912d6cf1753933fbd7f9f4 Author: ModelHub XC Date: Sun Apr 19 12:04:17 2026 +0800 Initialize project; model provided by the ModelHub XC community Model: Kelexine/LFM2.5-1.2B-Thinking-GGUF Source: Original Platform diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..55d563b --- /dev/null +++ b/.gitattributes @@ -0,0 +1,36 @@ +*.7z filter=lfs diff=lfs merge=lfs -text +*.arrow filter=lfs diff=lfs merge=lfs -text +*.bin filter=lfs diff=lfs merge=lfs -text +*.bz2 filter=lfs diff=lfs merge=lfs -text +*.ckpt filter=lfs diff=lfs merge=lfs -text +*.ftz filter=lfs diff=lfs merge=lfs -text +*.gz filter=lfs diff=lfs merge=lfs -text +*.h5 filter=lfs diff=lfs merge=lfs -text +*.joblib filter=lfs diff=lfs merge=lfs -text +*.lfs.* filter=lfs diff=lfs merge=lfs -text +*.mlmodel filter=lfs diff=lfs merge=lfs -text +*.model filter=lfs diff=lfs merge=lfs -text +*.msgpack filter=lfs diff=lfs merge=lfs -text +*.npy filter=lfs diff=lfs merge=lfs -text +*.npz filter=lfs diff=lfs merge=lfs -text +*.onnx filter=lfs diff=lfs merge=lfs -text +*.ot filter=lfs diff=lfs merge=lfs -text +*.parquet filter=lfs diff=lfs merge=lfs -text +*.pb filter=lfs diff=lfs merge=lfs -text +*.pickle filter=lfs diff=lfs merge=lfs -text +*.pkl filter=lfs diff=lfs merge=lfs -text +*.pt filter=lfs diff=lfs merge=lfs -text +*.pth filter=lfs diff=lfs merge=lfs -text +*.rar filter=lfs diff=lfs merge=lfs -text +*.safetensors filter=lfs diff=lfs merge=lfs -text +saved_model/**/* filter=lfs diff=lfs merge=lfs -text +*.tar.* filter=lfs diff=lfs merge=lfs -text +*.tar filter=lfs diff=lfs merge=lfs -text +*.tflite filter=lfs diff=lfs merge=lfs -text +*.tgz filter=lfs diff=lfs merge=lfs -text +*.wasm filter=lfs diff=lfs merge=lfs -text +*.xz filter=lfs diff=lfs merge=lfs -text +*.zip filter=lfs diff=lfs merge=lfs -text +*.zst filter=lfs diff=lfs merge=lfs -text +*tfevents* filter=lfs diff=lfs merge=lfs -text +LFM2.5-1.2B-Thinking-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text diff --git 
a/LFM2.5-1.2B-Thinking-Q8_0.gguf b/LFM2.5-1.2B-Thinking-Q8_0.gguf new file mode 100644 index 0000000..4ead6b7 --- /dev/null +++ b/LFM2.5-1.2B-Thinking-Q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2948e3cec54adbadfed912c874827424235d89d173e939ba0ee4456f0b9651a5 +size 1246254048 diff --git a/README.md b/README.md new file mode 100644 index 0000000..62be4b9 --- /dev/null +++ b/README.md @@ -0,0 +1,121 @@ +--- +license: other +license_name: lfm-1.0 +license_link: https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking/blob/main/LICENSE +language: +- en +- ar +- zh +- fr +- de +- ja +- ko +- es +pipeline_tag: text-generation +tags: +- gguf +- llama.cpp +- quantized +- q8_0 +- liquid-ai +- lfm +- lfm2 +- conversational +base_model: LiquidAI/LFM2.5-1.2B-Thinking +--- + +# LFM 2.5 1.2B Thinking (GGUF) + +## Description + +This repository contains the **GGUF** quantized version of [LiquidAI/LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking), a 1.2-billion-parameter "thinking" language model by **Liquid AI**. + +The model uses the novel `Lfm2ForCausalLM` architecture, a hybrid design of **10 double-gated LIV convolution blocks + 6 GQA attention blocks** rather than a standard transformer-only stack. The architecture alternates between local convolution-based mixing and grouped-query global attention, enabling efficient sequence processing with strong reasoning capabilities.
+ 

## Model Details

| Property | Value |
|---|---|
| **Architecture** | Lfm2ForCausalLM |
| **Parameter Count** | 1.17B |
| **Layers** | 16 (10 conv blocks + 6 GQA blocks) |
| **Hidden Size** | 2048 |
| **Intermediate (FFN)** | 8192 |
| **Attention Heads** | 32 |
| **KV Heads (GQA)** | 8 (on attention layers) |
| **Context Length** | 32,768 tokens |
| **Vocabulary Size** | 65,536 |
| **Languages** | English, Arabic, Chinese, French, German, Japanese, Korean, Spanish |
| **Quantization** | Q8_0 (8-bit) |
| **File Type** | GGUF |

## Quantization Details

This model was quantized using **llama.cpp** with the `Q8_0` scheme:

- **Source format**: F16 (converted from Hugging Face safetensors)
- **Quantization**: Q8_0 (8-bit quantization with block-wise scaling)
- **Quality**: Near-lossless; ideal for deployment where precision matters
- **Size reduction**: ~50% smaller than F16 while retaining virtually all model quality

## Usage with llama.cpp
```bash
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build && cmake --build build --config Release -j$(nproc)

./build/bin/llama-cli \
    -hf Kelexine/LFM2.5-1.2B-Thinking-GGUF \
    --temp 0.05 --top-k 50 --repeat-penalty 1.05 -n 4096 -cnv
```

Or with a local file:
```bash
./build/bin/llama-cli \
    -m LFM2.5-1.2B-Thinking-Q8_0.gguf \
    -p "<|im_start|>user\nYour prompt here<|im_end|>\n<|im_start|>assistant\n" \
    --temp 0.05 --top-k 50 --repeat-penalty 1.05 -n 4096
```

## Usage with Python (llama-cpp-python)
```python
from llama_cpp import Llama

# Note: sampling parameters (temperature, top_k, repeat_penalty) are
# per-call options in llama-cpp-python, not constructor arguments.
llm = Llama(
    model_path="LFM2.5-1.2B-Thinking-Q8_0.gguf",
    n_ctx=4096,
)

response = llm(
    "<|im_start|>user\nWhat is machine learning?<|im_end|>\n<|im_start|>assistant\n",
    max_tokens=4096,
    temperature=0.05,
    top_k=50,
    repeat_penalty=1.05,
    stop=["<|im_end|>"],
)
print(response["choices"][0]["text"])
```

## Provided Files

| File | Description |
|---|---|
| `LFM2.5-1.2B-Thinking-Q8_0.gguf` | 8-bit quantized GGUF (recommended) |

## Limitations

- This is a 1.17B-parameter model, best suited for lightweight tasks, quick prototyping, and edge deployment.
- The "Thinking" variant is designed for chain-of-thought reasoning and may emit verbose `<think>…</think>` blocks before its final answer; strip these in downstream integrations.
- Requires a recent llama.cpp build with support for the `Lfm2ForCausalLM` architecture.
- Per Liquid AI's own guidance, the model is not recommended for knowledge-intensive tasks or programming.

## License

This repository inherits the [LFM 1.0 License](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking/blob/main/LICENSE) from the base model [LiquidAI/LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking).

## Credits

- **Base model**: [Liquid AI](https://www.liquid.ai/)
- **Quantization**: kelexine
- **Framework**: [llama.cpp](https://github.com/ggml-org/llama.cpp) by ggml-org \ No newline at end of file
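The Limitations section notes that the model's reasoning output should be stripped in downstream integrations. A minimal sketch in Python, assuming the model wraps its chain-of-thought in `<think>…</think>` tags (the common convention for thinking models; confirm against actual model output):

```python
import re

# Assumption: the reasoning block is delimited by <think>...</think> tags.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_thinking(text: str) -> str:
    """Remove reasoning blocks so only the final answer remains."""
    return THINK_RE.sub("", text).strip()

raw = "<think>\nThe user asks 2+2. That is 4.\n</think>\n4"
print(strip_thinking(raw))  # -> 4
```

The non-greedy `.*?` keeps multiple reasoning blocks in one response from being merged into a single match.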
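The Git LFS pointer in this commit records the weight file's expected SHA-256 (`oid`) and byte size. A small verification sketch in Python, streaming the ~1.2 GB file rather than loading it at once (the constants below are copied from the pointer; `verify` is an illustrative helper, not part of any tooling here):

```python
import hashlib
import os

# Values copied from the Git LFS pointer in this commit
EXPECTED_SHA256 = "2948e3cec54adbadfed912c874827424235d89d173e939ba0ee4456f0b9651a5"
EXPECTED_SIZE = 1246254048  # bytes

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str) -> bool:
    """Cheap size check first, then the full hash."""
    return (os.path.getsize(path) == EXPECTED_SIZE
            and sha256_of(path) == EXPECTED_SHA256)
```

`verify("LFM2.5-1.2B-Thinking-Q8_0.gguf")` should return `True` for an intact download.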