---
language:
- ja
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- safetensors
- lfm2
- liquid
- lfm2.5
- edge
- conversational
license: other
license_name: lfm1.0
license_link: LICENSE
arxiv:
- 2511.23404
base_model:
- LiquidAI/LFM2.5-1.2B-Base
---
<br>
<div align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png"
alt="Liquid AI"
style="width: 100%; max-width: 100%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
/>
<div style="display: flex; justify-content: center; gap: 0.5em;">
<a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a><a href="https://docs.liquid.ai/lfm/getting-started/welcome"><strong>Docs</strong></a><a href="https://leap.liquid.ai/"><strong>LEAP</strong></a><a href="https://discord.com/invite/liquid-ai"><strong>Discord</strong></a>
</div>
</div>
<br>
# 🇯🇵 LFM2.5-1.2B-JP-202606
<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/64618cf05dba83471db2be9b/nhW5KrNVrPIe-3zy-RLLt.png" alt="Liquid AI" width="70%" />
</div>
**LFM2.5-1.2B-JP-202606** is our latest general-purpose Japanese chat model. It delivers significant improvements in knowledge, instruction following, math, code, and tool use over both comparably sized models and [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP). Across the benchmark suite in the Performance section, it achieves the highest domain average among the sub-2B models we compared.
It is ideal for developers building Japanese-language applications where cultural and linguistic nuance matters.
**LFM2.5-1.2B-JP-202606** は、当社の最新の汎用日本語チャットモデルです。知識、指示追従、数学、コード、ツール使用の各領域において、同規模の他モデルおよび [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP) の双方を大幅に上回る改善を実現しています。日本語全般における最高水準のベンチマーク性能を発揮します。
文化的・言語的なニュアンスが重要となる日本語アプリケーションを構築する開発者に最適です。
Find more information about LFM2.5 in our [blog post](https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai).
## 📊 Performance
<div align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/64618cf05dba83471db2be9b/7gqajAlXAh52nAz85JoQt.png"
alt="Liquid AI"
width="90%"
/>
<div style="display: flex; justify-content: center; gap: 0.5em;">
</div>
</div>
We compared LFM2.5-1.2B-JP-202606 with relevant sub-2B models on a diverse suite of benchmarks.
<table>
<thead>
<tr>
<th rowspan="2">Model</th>
<th rowspan="2">Size</th>
<th colspan="5">Knowledge</th>
<th colspan="3">Instruction Following</th>
<th colspan="3">Math</th>
<th>Code</th>
<th>Tool Use</th>
<th rowspan="2">Domain Avg</th>
</tr>
<tr>
<th>JMMLUProX</th>
<th>JMMLU</th>
<th>JCulture</th>
<th>JGPQA</th>
<th>Avg</th>
<th>JMIFEval</th>
<th>JFBench<sup>1</sup></th>
<th>Avg</th>
<th>JGSM8K</th>
<th>JMATH500</th>
<th>Avg</th>
<th>JHumanEval+</th>
<th>JBFCLv3<sup>2</sup></th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>LFM2.5-1.2B-JP-202606</strong></td>
<td>1.2B</td>
<td>36.23</td><td>54.19</td><td>35.77</td><td>28.69</td><td>38.72</td>
<td>79.08</td><td>54.77</td><td>66.93</td>
<td>62.20</td><td>62.80</td><td>62.50</td>
<td>49.39</td>
<td>48.00</td>
<td><strong>53.11</strong></td>
</tr>
<tr>
<td>LFM2.5-1.2B-Instruct</td>
<td>1.2B</td>
<td>31.42</td><td>47.61</td><td>28.42</td><td>31.72</td><td>34.79</td>
<td>40.44</td><td>36.67</td><td>38.56</td>
<td>50.20</td><td>50.00</td><td>50.10</td>
<td>28.66</td>
<td>46.29</td>
<td>39.68</td>
</tr>
<tr>
<td>Qwen3-1.7B (Instruct)</td>
<td>1.7B</td>
<td>30.78</td><td>47.67</td><td>33.33</td><td>26.26</td><td>34.51</td>
<td>40.29</td><td>36.61</td><td>38.45</td>
<td>46.00</td><td>56.40</td><td>51.20</td>
<td>47.56</td>
<td>52.45</td>
<td>44.83</td>
</tr>
<tr>
<td>Granite-4.0-1B</td>
<td>1.5B</td>
<td>15.32</td><td>33.93</td><td>34.38</td><td>24.44</td><td>27.02</td>
<td>27.56</td><td>31.26</td><td>29.41</td>
<td>42.80</td><td>25.40</td><td>34.10</td>
<td>51.22</td>
<td>50.57</td>
<td>38.46</td>
</tr>
<tr>
<td>Llama-3.2-1B-Instruct</td>
<td>1.2B</td>
<td>15.91</td><td>33.97</td><td>22.52</td><td>32.32</td><td>26.18</td>
<td>24.10</td><td>21.78</td><td>22.94</td>
<td>25.20</td><td>11.40</td><td>18.30</td>
<td>17.68</td>
<td>21.06</td>
<td>21.23</td>
</tr>
<tr>
<td>Gemma-3-1B-it</td>
<td>1.0B</td>
<td>14.12</td><td>34.45</td><td>23.42</td><td>24.24</td><td>24.06</td>
<td>26.31</td><td>31.15</td><td>28.73</td>
<td>33.60</td><td>15.60</td><td>24.60</td>
<td>25.00</td>
<td>17.26</td>
<td>23.93</td>
</tr>
<tr>
<td>sarashina2.2-1b-instruct-v0.1</td>
<td>1.4B</td>
<td>18.3</td><td>40.24</td><td>25.53</td><td>26.26</td><td>27.58</td>
<td>21.9</td><td>27.41</td><td>24.66</td>
<td>44.4</td><td>24.8</td><td>34.60</td>
<td>21.95</td>
<td>13.86</td>
<td>24.53</td>
</tr>
<tr>
<td>TinySwallow-1.5B-Instruct</td>
<td>1.5B</td>
<td>21.51</td><td>47.98</td><td>31.17</td><td>29.29</td><td>32.49</td>
<td>36.55</td><td>34.25</td><td>35.40</td>
<td>47.2</td><td>22.4</td><td>34.80</td>
<td>26.83</td>
<td>11.7</td>
<td>28.24</td>
</tr>
<tr>
<td>llm-jp-3.1-1.8b-instruct4</td>
<td>1.9B</td>
<td>17.44</td><td>43.05</td><td>27.42</td><td>17.68</td><td>26.40</td>
<td>33.77</td><td>30.92</td><td>32.35</td>
<td>52.8</td><td>17.0</td><td>34.90</td>
<td>35.37</td>
<td>11.76</td>
<td>28.16</td>
</tr>
<tr>
<td>RakutenAI-2.0-mini-instruct</td>
<td>1.5B</td>
<td>11.46</td><td>31.84</td><td>29.67</td><td>22.22</td><td>23.80</td>
<td>28.06</td><td>24.66</td><td>26.36</td>
<td>24.8</td><td>11.4</td><td>18.10</td>
<td>28.6</td>
<td>11.85</td>
<td>21.74</td>
</tr>
</tbody>
</table>
*<sup>1</sup> JFBench is evaluated using single-instruction prompts.* <br>
*<sup>2</sup> quickTestingOSSHandler is used for models that do not support function calling (sarashina2.2-1b-instruct-v0.1, TinySwallow-1.5B-Instruct, llm-jp-3.1-1.8b-instruct4, and RakutenAI-2.0-mini-instruct).*
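
The Domain Avg column is consistent with an unweighted mean of the five domain-level scores (the Knowledge, Instruction Following, and Math averages, plus the Code and Tool Use scores). This is inferred from the reported numbers rather than stated explicitly, so treat the short Python sketch below as an assumption. `domain_average` is an illustrative helper name, not part of any evaluation toolkit.

```python
from statistics import fmean


def domain_average(domain_scores: dict[str, float]) -> float:
    """Unweighted mean of the per-domain scores, rounded to two decimals."""
    return round(fmean(domain_scores.values()), 2)


# Domain-level scores for LFM2.5-1.2B-JP-202606, copied from the table above
lfm_jp = {
    "knowledge": 38.72,
    "instruction_following": 66.93,
    "math": 62.50,
    "code": 49.39,
    "tool_use": 48.00,
}

# Same columns for the Qwen3-1.7B (Instruct) row
qwen3 = {
    "knowledge": 34.51,
    "instruction_following": 38.45,
    "math": 51.20,
    "code": 47.56,
    "tool_use": 52.45,
}

print(domain_average(lfm_jp))
print(domain_average(qwen3))
```

Under this assumption, the sketch reproduces the reported Domain Avg values for both rows (53.11 and 44.83).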
## 🗒️ Model Details
| Model | Parameters | Description |
|-------|------------|-------------|
| [LFM2.5-1.2B-Base](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | 1.2B | Pre-trained base model for fine-tuning |
| [LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | 1.2B | General-purpose instruction-tuned model |
| [LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking) | 1.2B | General-purpose reasoning model |
| [**LFM2.5-1.2B-JP-202606**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | 1.2B | Japanese-capable chat model |
| [LFM2.5-VL-1.6B](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | 1.6B | Vision-language model with fast inference |
| [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | 1.5B | Audio-language model for speech and text I/O |
| [LFM2.5-Audio-1.5B-JP](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-JP) | 1.5B | Japanese-capable audio model for speech and text I/O |
LFM2.5-1.2B-JP-202606 is a general-purpose text-only model with the following features:
- **Number of parameters**: 1.17B
- **Number of layers**: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
- **Training budget**: 31.5T tokens
- **Context length**: 32,768 tokens
- **Vocabulary size**: 65,536
- **Knowledge cutoff**: Mid-2024
- **Languages**: English, Japanese
- **Generation parameters**:
  - `temperature: 0.1`
  - `top_k: 50`
  - `repetition_penalty: 1.05`
| Model | Description |
|-------|-------------|
| [LFM2.5-1.2B-JP-202606](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| [LFM2.5-1.2B-JP-202606-GGUF](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| [LFM2.5-1.2B-JP-202606-ONNX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-ONNX) | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
| [LFM2.5-1.2B-JP-202606-MLX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices using the MLX framework. |
We recommend using it for agentic workflows, tool use, structured outputs, bilingual English-Japanese assistants, and on-device personal-assistant applications. It is not recommended for knowledge-intensive tasks. It performs best when given clear, explicit instructions that define the task, expected behavior, and output format.
エージェント型ワークフロー、ツール使用、構造化出力、日英バイリンガルアシスタント、オンデバイスのパーソナルアシスタントでの利用を推奨します。一方で、詳細な知識を要するタスクには推奨されません。タスク内容、期待される動作、出力形式を明確かつ具体的に指示することで、最も高い性能を発揮します。
### Chat Template
LFM2.5 uses a ChatML-like format. See the [Chat Template documentation](https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example:
```
<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
日本の首都は?<|im_end|>
<|im_start|>assistant
```
You can use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format your messages automatically.
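
To make the layout concrete, here is a minimal Python sketch that assembles the same string by hand. It is inferred only from the example above. `format_chat` is a hypothetical helper, and the tokenizer's own template remains the authoritative source: it may handle cases this sketch does not, such as tool definitions.

```python
def format_chat(messages, add_generation_prompt=True):
    """Render messages in the ChatML-like layout shown above (illustrative only)."""
    parts = ["<|startoftext|>"]
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "日本の首都は?"},
]

print(format_chat(messages))
```

In practice, prefer `tokenizer.apply_chat_template()`, which stays in sync with the template shipped with the model.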
### Tool Use
LFM2.5 supports function calling as follows:
1. **Function definition**: We recommend providing the list of tools as a JSON object in the system prompt. You can also use the [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) function with tools.
2. **Function call**: By default, LFM2.5 writes Pythonic function calls (a Python list between the `<|tool_call_start|>` and `<|tool_call_end|>` special tokens) as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
3. **Function execution**: The function call is executed, and the result is returned in a message with the `tool` role.
4. **Final answer**: LFM2.5 interprets the outcome of the function call to address the original user prompt in plain text.
See the [Tool Use documentation](https://docs.liquid.ai/lfm/key-concepts/tool-use) for the full guide. Example:
```
<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "採用プロセスにおける候補者の現在のステータスを取得します", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
候補者ID 12345 の現在のステータスは何ですか?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>候補者ID 12345 の現在のステータスを確認しています。<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
ID 12345 の候補者は現在、Clinical Research Associate のポジションで「面接予定」の段階にあり、面接日は 2023年11月20日に設定されています。<|im_end|>
```
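
The application-side half of this loop (steps 1 and 3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Liquid AI's reference implementation: `build_system_prompt`, `parse_tool_calls`, `run_tool_calls`, and `TOOL_REGISTRY` are hypothetical names, the local `get_candidate_status` function is a stand-in for a real backend, and only keyword arguments with literal values are handled.

```python
import ast
import json
import re

TOOL_CALL_PATTERN = re.compile(
    r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL
)


def build_system_prompt(tools):
    """Serialize tool definitions as JSON, keeping Japanese text unescaped."""
    return "List of tools: " + json.dumps(tools, ensure_ascii=False)


def parse_tool_calls(assistant_text):
    """Return (function_name, keyword_arguments) pairs from Pythonic tool calls."""
    calls = []
    for payload in TOOL_CALL_PATTERN.findall(assistant_text):
        tree = ast.parse(payload.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of function calls")
        for node in tree.body.elts:
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                raise ValueError("expected a plain function call")
            if node.args:
                raise ValueError("positional arguments are not handled in this sketch")
            # literal_eval only accepts literals, so no model-written code is executed
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls


# Hypothetical local implementation standing in for a real backend lookup
def get_candidate_status(candidate_id):
    return {"candidate_id": candidate_id, "status": "Interview Scheduled"}


TOOL_REGISTRY = {"get_candidate_status": get_candidate_status}


def run_tool_calls(assistant_text):
    """Execute each parsed call and wrap the results in a tool-role message."""
    results = [
        TOOL_REGISTRY[name](**kwargs)
        for name, kwargs in parse_tool_calls(assistant_text)
    ]
    return {"role": "tool", "content": json.dumps(results, ensure_ascii=False)}


assistant_text = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>'
    "候補者ID 12345 の現在のステータスを確認しています。"
)

print(parse_tool_calls(assistant_text))
print(run_tool_calls(assistant_text))
```

Parsing with `ast` instead of `eval` means the model's output is never executed directly; only functions explicitly listed in the registry can run. The returned tool message can then be appended to the conversation before asking the model for its final answer.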
## 🏃 Inference
LFM2.5 is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.
| Name | Description | Docs | Notebook |
|------|-------------|------|:--------:|
| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | <a href="https://docs.liquid.ai/lfm/inference/transformers">Link</a> | <a href="https://colab.research.google.com/drive/1_q3jQ6LtyiuPzFZv7Vw8xSfPU5FwkKZY?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments with GPU. | <a href="https://docs.liquid.ai/lfm/inference/vllm">Link</a> | <a href="https://colab.research.google.com/drive/1VfyscuHP8A3we_YpnzuabYJzr5ju0Mit?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | <a href="https://docs.liquid.ai/lfm/inference/llama-cpp">Link</a> | <a href="https://colab.research.google.com/drive/1ohLl3w47OQZA4ELo46i5E4Z6oGWBAyo8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [MLX](https://github.com/ml-explore/mlx) | Apple's machine learning framework optimized for Apple Silicon. | <a href="https://docs.liquid.ai/lfm/inference/mlx">Link</a> | — |
| [LM Studio](https://lmstudio.ai/) | Desktop application for running LLMs locally. | <a href="https://docs.liquid.ai/lfm/inference/lm-studio">Link</a> | — |
Here's a quick start example with Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Hugging Face Hub repository ID (organization/model name)
model_id = "LiquidAI/LFM2.5-1.2B-JP-202606"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Stream decoded tokens to stdout as they are generated
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "日本の首都は?"

# Apply the chat template and tokenize in one step.
# return_dict=True yields both input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to(model.device)

# Sample with the recommended generation parameters
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```
## 🔧 Fine-Tuning
We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.
| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for text completion. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/10fm7eNMezs-DSn36mF7vAsNYlOsx9YZO?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for translation. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1gaP8yTle2_v35Um8Gpu9239fqbU7UgY8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1vGRg4ksRj__6OLvXkHhvji_Pamv801Ss?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1j5Hk_SyBb2soUsuhU0eIEA9GwLNRnElF?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1MQdsPxFHeZweGsNx4RH7Ia8lG8PiGE1t?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| GRPO ([Unsloth](https://github.com/unslothai/unsloth)) | GRPO with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1mIikXFaGvcW4vXOZXLbVTxfBRw_XsXa5?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| GRPO ([TRL](https://github.com/huggingface/trl)) | GRPO with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/github/Liquid4All/cookbook/blob/main/finetuning/notebooks/grpo_for_verifiable_tasks.ipynb"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
## 📬 Contact
- Got questions or want to connect? [Join our Discord community](https://discord.com/invite/liquid-ai)
- If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).
## Citation
```bibtex
@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}
```