ModelHub XC c64c2a19c9 Initialize project; model provided by the ModelHub XC community
Model: LiquidAI/LFM2.5-1.2B-JP-202606
Source: Original Platform
2026-06-10 05:55:17 +08:00

---
language:
- ja
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- safetensors
- lfm2
- liquid
- lfm2.5
- edge
- conversational
license: other
license_name: lfm1.0
license_link: LICENSE
arxiv:
- 2511.23404
base_model:
- LiquidAI/LFM2.5-1.2B-Base
---
<br>
<div align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/2b08LKpev0DNEk6DlnWkY.png"
alt="Liquid AI"
style="width: 100%; max-width: 100%; height: auto; display: inline-block; margin-bottom: 0.5em; margin-top: 0.5em;"
/>
<div style="display: flex; justify-content: center; gap: 0.5em;">
<a href="https://playground.liquid.ai/"><strong>Try LFM</strong></a><a href="https://docs.liquid.ai/lfm/getting-started/welcome"><strong>Docs</strong></a><a href="https://leap.liquid.ai/"><strong>LEAP</strong></a><a href="https://discord.com/invite/liquid-ai"><strong>Discord</strong></a>
</div>
</div>
<br>

# 🇯🇵 LFM2.5-1.2B-JP-202606
<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/64618cf05dba83471db2be9b/nhW5KrNVrPIe-3zy-RLLt.png" alt="Liquid AI" width="70%" />
</div>

**LFM2.5-1.2B-JP-202606** is our latest general-purpose Japanese chat model, delivering significant improvements in knowledge, instruction following, math, code, and tool use over both comparably sized models and [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP). It sets a new benchmark for state-of-the-art performance in Japanese language understanding.
Ideal for developers building Japanese-language applications where cultural and linguistic nuance matters.

**LFM2.5-1.2B-JP-202606** は、当社の最新の汎用日本語チャットモデルです。知識、指示追従、数学、コード、ツール使用の各領域において、同規模の他モデルおよび [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP) の双方を大幅に上回る改善を実現しています。日本語全般における最高水準のベンチマーク性能を発揮します。
文化的・言語的なニュアンスが重要となる日本語アプリケーションを構築する開発者に最適です。

Find more information about LFM2.5 in our [blog post](https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai).
## 📊 Performance
<div align="center">
<img
src="https://cdn-uploads.huggingface.co/production/uploads/64618cf05dba83471db2be9b/7gqajAlXAh52nAz85JoQt.png"
alt="Liquid AI"
width="90%"
/>
<div style="display: flex; justify-content: center; gap: 0.5em;">
</div>
</div>

We compared LFM2.5-1.2B-JP-202606 with relevant sub-2B models on a diverse suite of benchmarks.
<table>
<thead>
<tr>
<th rowspan="2">Model</th>
<th rowspan="2">Size</th>
<th colspan="5">Knowledge</th>
<th colspan="3">Instruction Following</th>
<th colspan="3">Math</th>
<th>Code</th>
<th>Tool Use</th>
<th rowspan="2">Domain Avg</th>
</tr>
<tr>
<th>JMMLUProX</th>
<th>JMMLU</th>
<th>JCulture</th>
<th>JGPQA</th>
<th>Avg</th>
<th>JMIFEval</th>
<th>JFBench<sup>1</sup></th>
<th>Avg</th>
<th>JGSM8K</th>
<th>JMATH500</th>
<th>Avg</th>
<th>JHumanEval+</th>
<th>JBFCLv3<sup>2</sup></th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>LFM2.5-1.2B-JP-202606</strong></td>
<td>1.2B</td>
<td>36.23</td><td>54.19</td><td>35.77</td><td>28.69</td><td>38.72</td>
<td>79.08</td><td>54.77</td><td>66.93</td>
<td>62.20</td><td>62.80</td><td>62.50</td>
<td>49.39</td>
<td>48.00</td>
<td><strong>53.11</strong></td>
</tr>
<tr>
<td>LFM2.5-1.2B-Instruct</td>
<td>1.2B</td>
<td>31.42</td><td>47.61</td><td>28.42</td><td>31.72</td><td>34.79</td>
<td>40.44</td><td>36.67</td><td>38.56</td>
<td>50.20</td><td>50.00</td><td>50.10</td>
<td>28.66</td>
<td>46.29</td>
<td>39.68</td>
</tr>
<tr>
<td>Qwen3-1.7B (Instruct)</td>
<td>1.7B</td>
<td>30.78</td><td>47.67</td><td>33.33</td><td>26.26</td><td>34.51</td>
<td>40.29</td><td>36.61</td><td>38.45</td>
<td>46.00</td><td>56.40</td><td>51.20</td>
<td>47.56</td>
<td>52.45</td>
<td>44.83</td>
</tr>
<tr>
<td>Granite-4.0-1B</td>
<td>1.5B</td>
<td>15.32</td><td>33.93</td><td>34.38</td><td>24.44</td><td>27.02</td>
<td>27.56</td><td>31.26</td><td>29.41</td>
<td>42.80</td><td>25.40</td><td>34.10</td>
<td>51.22</td>
<td>50.57</td>
<td>38.46</td>
</tr>
<tr>
<td>Llama-3.2-1B-Instruct</td>
<td>1.2B</td>
<td>15.91</td><td>33.97</td><td>22.52</td><td>32.32</td><td>26.18</td>
<td>24.10</td><td>21.78</td><td>22.94</td>
<td>25.20</td><td>11.40</td><td>18.30</td>
<td>17.68</td>
<td>21.06</td>
<td>21.23</td>
</tr>
<tr>
<td>Gemma-3-1B-it</td>
<td>1.0B</td>
<td>14.12</td><td>34.45</td><td>23.42</td><td>24.24</td><td>24.06</td>
<td>26.31</td><td>31.15</td><td>28.73</td>
<td>33.60</td><td>15.60</td><td>24.60</td>
<td>25.00</td>
<td>17.26</td>
<td>23.93</td>
</tr>
<tr>
<td>sarashina2.2-1b-instruct-v0.1</td>
<td>1.4B</td>
<td>18.3</td><td>40.24</td><td>25.53</td><td>26.26</td><td>27.58</td>
<td>21.9</td><td>27.41</td><td>24.66</td>
<td>44.4</td><td>24.8</td><td>34.60</td>
<td>21.95</td>
<td>13.86</td>
<td>24.53</td>
</tr>
<tr>
<td>TinySwallow-1.5B-Instruct</td>
<td>1.5B</td>
<td>21.51</td><td>47.98</td><td>31.17</td><td>29.29</td><td>32.49</td>
<td>36.55</td><td>34.25</td><td>35.40</td>
<td>47.2</td><td>22.4</td><td>34.80</td>
<td>26.83</td>
<td>11.7</td>
<td>28.24</td>
</tr>
<tr>
<td>llm-jp-3.1-1.8b-instruct4</td>
<td>1.9B</td>
<td>17.44</td><td>43.05</td><td>27.42</td><td>17.68</td><td>26.40</td>
<td>33.77</td><td>30.92</td><td>32.35</td>
<td>52.8</td><td>17.0</td><td>34.90</td>
<td>35.37</td>
<td>11.76</td>
<td>28.16</td>
</tr>
<tr>
<td>RakutenAI-2.0-mini-instruct</td>
<td>1.5B</td>
<td>11.46</td><td>31.84</td><td>29.67</td><td>22.22</td><td>23.80</td>
<td>28.06</td><td>24.66</td><td>26.36</td>
<td>24.8</td><td>11.4</td><td>18.10</td>
<td>28.6</td>
<td>11.85</td>
<td>21.74</td>
</tr>
</tbody>
</table>

*<sup>1</sup> JFBench is evaluated using single-instruction prompts.* <br>
*<sup>2</sup> quickTestingOSSHandler is used for models that do not support function calling (sarashina2.2-1b-instruct-v0.1, TinySwallow-1.5B-Instruct, llm-jp-3.1-1.8b-instruct4, and RakutenAI-2.0-mini-instruct).*
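
As a reading aid for the table, the sketch below reproduces the Domain Avg value for the LFM2.5-1.2B-JP-202606 row. It assumes (the table does not state this explicitly) that Domain Avg is the unweighted mean of the five per-domain scores: the Knowledge, Instruction Following, and Math averages plus the single Code and Tool Use scores. The numbers are copied from the table.

```python
# Per-domain scores for LFM2.5-1.2B-JP-202606, copied from the table above.
domain_scores = {
    "Knowledge (Avg)": 38.72,
    "Instruction Following (Avg)": 66.93,
    "Math (Avg)": 62.50,
    "Code (JHumanEval+)": 49.39,
    "Tool Use (JBFCLv3)": 48.00,
}

# Assumption: Domain Avg is the unweighted mean of the five domain scores.
domain_avg = round(sum(domain_scores.values()) / len(domain_scores), 2)
print(domain_avg)
```

Under this assumption, each domain contributes equally regardless of how many benchmarks it contains, so the single Code and Tool Use benchmarks weigh as much as the four-benchmark Knowledge group.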
## 🗒️ Model Details
| Model | Parameters | Description |
|-------|------------|-------------|
| [LFM2.5-1.2B-Base](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | 1.2B | Pre-trained base model for fine-tuning |
| [LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | 1.2B | General-purpose instruction-tuned model |
| [LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking) | 1.2B | General-purpose reasoning model |
| [**LFM2.5-1.2B-JP-202606**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | 1.2B | Japanese-capable chat model |
| [LFM2.5-VL-1.6B](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | 1.6B | Vision-language model with fast inference |
| [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | 1.5B | Audio-language model for speech and text I/O |
| [LFM2.5-Audio-1.5B-JP](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-JP) | 1.5B | Japanese-capable audio model for speech and text I/O |

LFM2.5-1.2B-JP-202606 is a general-purpose text-only model with the following features:
- **Number of parameters**: 1.17B
- **Number of layers**: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
- **Training budget**: 31.5T tokens
- **Context length**: 32,768 tokens
- **Vocabulary size**: 65,536
- **Knowledge cutoff**: Mid-2024
- **Languages**: English, Japanese
- **Generation parameters**:
  - `temperature: 0.1`
  - `top_k: 50`
  - `repetition_penalty: 1.05`

| Model | Description |
|-------|-------------|
| [LFM2.5-1.2B-JP-202606](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| [LFM2.5-1.2B-JP-202606-GGUF](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| [LFM2.5-1.2B-JP-202606-ONNX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-ONNX) | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
| [LFM2.5-1.2B-JP-202606-MLX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices using the MLX framework. |

We recommend using it for agentic workflows, tool use, structured outputs, bilingual English-Japanese assistants, and on-device personal-assistant applications. It is not recommended for knowledge-intensive tasks. It performs best when given clear, explicit instructions that define the task, expected behavior, and output format.

エージェント型ワークフロー、ツール使用、構造化出力、日英バイリンガルアシスタント、オンデバイスのパーソナルアシスタントでの利用を推奨します。一方で、詳細な知識を要するタスクには推奨されません。タスク内容、期待される動作、出力形式を明確かつ具体的に指示することで、最も高い性能を発揮します。
### Chat Template
LFM2.5 uses a ChatML-like format. See the [Chat Template documentation](https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example:
```
<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
日本の首都は?<|im_end|>
<|im_start|>assistant
```
You can use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format your messages automatically.
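
To make the layout above concrete, here is a minimal sketch that assembles the same prompt string by hand. The helper name `render_chat` is hypothetical and exists only for illustration; the chat template shipped with the tokenizer is authoritative and may differ in details such as whitespace or tool handling, so prefer `tokenizer.apply_chat_template()` in real code.

```python
def render_chat(messages, add_generation_prompt=True):
    """Render messages in the ChatML-like layout shown above (illustrative only)."""
    parts = ["<|startoftext|>"]
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "日本の首都は?"},
]
prompt = render_chat(messages)
print(prompt)
```

Printing `prompt` is a quick way to compare a hand-built string against the tokenizer's own rendering when debugging prompt formatting.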
### Tool Use
LFM2.5 supports function calling as follows:
1. **Function definition**: We recommend providing the list of tools as a JSON object in the system prompt. You can also use the [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) function with tools.
2. **Function call**: By default, LFM2.5 writes Pythonic function calls (a Python list between `<|tool_call_start|>` and `<|tool_call_end|>` special tokens), as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
3. **Function execution**: Your application executes the function call and returns the result to the model in a message with the `tool` role.
4. **Final answer**: LFM2.5 interprets the result of the function call and answers the original user prompt in plain text.
See the [Tool Use documentation](https://docs.liquid.ai/lfm/key-concepts/tool-use) for the full guide. Example:
```
<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "採用プロセスにおける候補者の現在のステータスを取得します", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
候補者ID 12345 の現在のステータスは何ですか?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>候補者ID 12345 の現在のステータスを確認しています。<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
ID 12345 の候補者は現在、Clinical Research Associate のポジションで「面接予定」の段階にあり、面接日は 2023年11月20日に設定されています。<|im_end|>
```
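
The function call and function execution steps can be sketched on the application side as follows. This is an illustrative parser under stated assumptions, not Liquid AI's reference implementation: it assumes the call list is valid Python syntax, handles keyword arguments with literal values only, and uses a hypothetical local `get_candidate_status` function and `TOOLS` registry standing in for a real backend.

```python
import ast
import json
import re

# Matches the text between the tool-call special tokens.
TOOL_CALL_RE = re.compile(r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL)


def parse_tool_calls(assistant_text):
    """Extract (name, kwargs) pairs from a Pythonic tool-call list."""
    calls = []
    for block in TOOL_CALL_RE.findall(assistant_text):
        tree = ast.parse(block.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            continue  # Expected a list such as [func(arg="value")].
        for node in tree.body.elts:
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                # Only keyword arguments with literal values are supported here.
                kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
                calls.append((node.func.id, kwargs))
    return calls


# Hypothetical local implementation of the tool from the example above.
def get_candidate_status(candidate_id):
    return {"candidate_id": candidate_id, "status": "Interview Scheduled"}


TOOLS = {"get_candidate_status": get_candidate_status}

assistant_text = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>'
    "候補者ID 12345 の現在のステータスを確認しています。"
)

calls = parse_tool_calls(assistant_text)
results = [TOOLS[name](**kwargs) for name, kwargs in calls]
# The result goes back to the model as a message with the "tool" role.
tool_message = {"role": "tool", "content": json.dumps(results, ensure_ascii=False)}
print(calls)
print(tool_message)
```

Parsing with `ast` rather than `eval` means the model's output is never executed directly; only functions registered in `TOOLS` can be called.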
## 🏃 Inference
LFM2.5 is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.
| Name | Description | Docs | Notebook |
|------|-------------|------|:--------:|
| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | <a href="https://docs.liquid.ai/lfm/inference/transformers">Link</a> | <a href="https://colab.research.google.com/drive/1_q3jQ6LtyiuPzFZv7Vw8xSfPU5FwkKZY?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments with GPU. | <a href="https://docs.liquid.ai/lfm/inference/vllm">Link</a> | <a href="https://colab.research.google.com/drive/1VfyscuHP8A3we_YpnzuabYJzr5ju0Mit?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | <a href="https://docs.liquid.ai/lfm/inference/llama-cpp">Link</a> | <a href="https://colab.research.google.com/drive/1ohLl3w47OQZA4ELo46i5E4Z6oGWBAyo8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| [MLX](https://github.com/ml-explore/mlx) | Apple's machine learning framework optimized for Apple Silicon. | <a href="https://docs.liquid.ai/lfm/inference/mlx">Link</a> | — |
| [LM Studio](https://lmstudio.ai/) | Desktop application for running LLMs locally. | <a href="https://docs.liquid.ai/lfm/inference/lm-studio">Link</a> | — |

Here's a quick start example with Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Full Hub repository ID, including the organization prefix.
model_id = "LiquidAI/LFM2.5-1.2B-JP-202606"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Stream decoded tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "日本の首都は?"
# Format the conversation with the model's chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

# Sample with the recommended generation parameters.
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```
## 🔧 Fine-Tuning
We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.
| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for text completion. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/10fm7eNMezs-DSn36mF7vAsNYlOsx9YZO?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for translation. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1gaP8yTle2_v35Um8Gpu9239fqbU7UgY8?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1vGRg4ksRj__6OLvXkHhvji_Pamv801Ss?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1j5Hk_SyBb2soUsuhU0eIEA9GwLNRnElF?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/drive/1MQdsPxFHeZweGsNx4RH7Ia8lG8PiGE1t?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| GRPO ([Unsloth](https://github.com/unslothai/unsloth)) | GRPO with LoRA using Unsloth. | <a href="https://docs.liquid.ai/lfm/fine-tuning/unsloth">Link</a> | <a href="https://colab.research.google.com/drive/1mIikXFaGvcW4vXOZXLbVTxfBRw_XsXa5?usp=sharing"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
| GRPO ([TRL](https://github.com/huggingface/trl)) | GRPO with LoRA using TRL. | <a href="https://docs.liquid.ai/lfm/fine-tuning/trl">Link</a> | <a href="https://colab.research.google.com/github/Liquid4All/cookbook/blob/main/finetuning/notebooks/grpo_for_verifiable_tasks.ipynb"><img src="https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/vlOyMEjwHa_b_LXysEu2E.png" width="110" alt="Colab link"></a> |
## 📬 Contact
- Got questions or want to connect? [Join our Discord community](https://discord.com/invite/liquid-ai)
- If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).
## Citation
```bibtex
@article{liquidai2025lfm2,
title={LFM2 Technical Report},
author={Liquid AI},
journal={arXiv preprint arXiv:2511.23404},
year={2025}
}
```