---
language:
- ja
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- safetensors
- lfm2
- liquid
- lfm2.5
- edge
- conversational
license: other
license_name: lfm1.0
license_link: LICENSE
arxiv:
- 2511.23404
base_model:
- LiquidAI/LFM2.5-1.2B-Base
---
# 🇯🇵 LFM2.5-1.2B-JP-202606
**LFM2.5-1.2B-JP-202606** is our latest general-purpose Japanese chat model, delivering significant improvements in knowledge, instruction following, math, code, and tool use over both models of comparable size and [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP). It achieves the highest domain average among the sub-2B models in our Japanese benchmark comparison.
It is ideal for developers building Japanese-language applications where cultural and linguistic nuance matters.
**LFM2.5-1.2B-JP-202606** は、当社の最新の汎用日本語チャットモデルです。知識、指示追従、数学、コード、ツール使用の各領域において、同規模の他モデルおよび [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP) の双方を大幅に上回る改善を実現しています。日本語全般における最高水準のベンチマーク性能を発揮します。
文化的・言語的なニュアンスが重要となる日本語アプリケーションを構築する開発者に最適です。
Find more information about LFM2.5 in our [blog post](https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai).
## 📊 Performance
We compared LFM2.5-1.2B-JP-202606 with relevant sub-2B models on a diverse suite of benchmarks.

| Model | Size | JMMLU‑ProX | JMMLU | JCulture | JGPQA | Knowledge Avg | J‑MIFEval | JFBench<sup>1</sup> | Instruction Following Avg | J‑GSM8K | J‑MATH500 | Math Avg | Code (JHumanEval+) | Tool Use (J‑BFCLv3<sup>2</sup>) | Domain Avg |
|-------|------|-----------|-------|----------|-------|---------------|-----------|---------|---------------------------|---------|-----------|----------|--------------------|----------------------|------------|
| LFM2.5‑1.2B‑JP‑202606 | 1.2B | 36.23 | 54.19 | 35.77 | 28.69 | 38.72 | 79.08 | 54.77 | 66.93 | 62.20 | 62.80 | 62.50 | 49.39 | 48.00 | 53.11 |
| LFM2.5‑1.2B‑Instruct | 1.2B | 31.42 | 47.61 | 28.42 | 31.72 | 34.79 | 40.44 | 36.67 | 38.56 | 50.20 | 50.00 | 50.10 | 28.66 | 46.29 | 39.68 |
| Qwen3‑1.7B (Instruct) | 1.7B | 30.78 | 47.67 | 33.33 | 26.26 | 34.51 | 40.29 | 36.61 | 38.45 | 46.00 | 56.40 | 51.20 | 47.56 | 52.45 | 44.83 |
| Granite‑4.0‑1B | 1.5B | 15.32 | 33.93 | 34.38 | 24.44 | 27.02 | 27.56 | 31.26 | 29.41 | 42.80 | 25.40 | 34.10 | 51.22 | 50.57 | 38.46 |
| Llama‑3.2‑1B‑Instruct | 1.2B | 15.91 | 33.97 | 22.52 | 32.32 | 26.18 | 24.10 | 21.78 | 22.94 | 25.20 | 11.40 | 18.30 | 17.68 | 21.06 | 21.23 |
| Gemma‑3‑1B‑it | 1.0B | 14.12 | 34.45 | 23.42 | 24.24 | 24.06 | 26.31 | 31.15 | 28.73 | 33.60 | 15.60 | 24.60 | 25.00 | 17.26 | 23.93 |
| sarashina2.2‑1b‑instruct‑v0.1 | 1.4B | 18.3 | 40.24 | 25.53 | 26.26 | 27.58 | 21.9 | 27.41 | 24.66 | 44.4 | 24.8 | 34.60 | 21.95 | 13.86 | 24.53 |
| TinySwallow‑1.5B‑Instruct | 1.5B | 21.51 | 47.98 | 31.17 | 29.29 | 32.49 | 36.55 | 34.25 | 35.40 | 47.2 | 22.4 | 34.80 | 26.83 | 11.7 | 28.24 |
| llm‑jp‑3.1‑1.8b‑instruct4 | 1.9B | 17.44 | 43.05 | 27.42 | 17.68 | 26.40 | 33.77 | 30.92 | 32.35 | 52.8 | 17.0 | 34.90 | 35.37 | 11.76 | 28.16 |
| RakutenAI‑2.0‑mini‑instruct | 1.5B | 11.46 | 31.84 | 29.67 | 22.22 | 23.80 | 28.06 | 24.66 | 26.36 | 24.8 | 11.4 | 18.10 | 28.6 | 11.85 | 21.74 |

*<sup>1</sup> JFBench is evaluated using single-instruction prompts.*

*<sup>2</sup> quickTestingOSSHandler is used for models that do not support function calling (sarashina2.2‑1b‑instruct‑v0.1, TinySwallow‑1.5B‑Instruct, llm‑jp‑3.1‑1.8b‑instruct4, and RakutenAI‑2.0‑mini‑instruct).*
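
As a sanity check on how the columns relate: the reported figures are consistent with each domain's Avg being the unweighted mean of its benchmarks, and with Domain Avg being the unweighted mean of the five domain scores (Knowledge, Instruction Following, Math, Code, Tool Use). That reading is an assumption inferred from the numbers, not a stated definition. The sketch below reproduces two values from the LFM2.5‑1.2B‑JP‑202606 row under that assumption; the variable names are illustrative.

```python
from statistics import mean

# Scores copied from the LFM2.5-1.2B-JP-202606 row of the table above.
knowledge_benchmarks = [36.23, 54.19, 35.77, 28.69]  # JMMLU-ProX, JMMLU, JCulture, JGPQA
knowledge_avg = mean(knowledge_benchmarks)

# Reported per-domain scores: Knowledge, Instruction Following, Math, Code, Tool Use.
domain_scores = [38.72, 66.93, 62.50, 49.39, 48.00]
domain_avg = mean(domain_scores)

print(round(knowledge_avg, 2))  # 38.72
print(round(domain_avg, 2))     # 53.11
```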
## 🗒️ Model Details
| Model | Parameters | Description |
|-------|------------|-------------|
| [LFM2.5-1.2B-Base](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | 1.2B | Pre-trained base model for fine-tuning |
| [LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | 1.2B | General-purpose instruction-tuned model |
| [LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking) | 1.2B | General-purpose reasoning model |
| [**LFM2.5-1.2B-JP-202606**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | 1.2B | Japanese-capable chat model |
| [LFM2.5-VL-1.6B](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | 1.6B | Vision-language model with fast inference |
| [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | 1.5B | Audio-language model for speech and text I/O |
| [LFM2.5-Audio-1.5B-JP](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-JP) | 1.5B | Japanese-capable audio model for speech and text I/O |
LFM2.5-1.2B-JP-202606 is a general-purpose text-only model with the following features:
- **Number of parameters**: 1.17B
- **Number of layers**: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
- **Training budget**: 31.5T tokens
- **Context length**: 32,768 tokens
- **Vocabulary size**: 65,536
- **Knowledge cutoff**: Mid-2024
- **Languages**: English, Japanese
- **Generation parameters**:
  - `temperature: 0.1`
  - `top_k: 50`
  - `repetition_penalty: 1.05`

| Model | Description |
|-------|-------------|
| [LFM2.5-1.2B-JP-202606](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| [LFM2.5-1.2B-JP-202606-GGUF](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| [LFM2.5-1.2B-JP-202606-ONNX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-ONNX) | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
| [LFM2.5-1.2B-JP-202606-MLX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices using the MLX framework. |

We recommend LFM2.5-1.2B-JP-202606 for agentic workflows, tool use, structured outputs, bilingual English–Japanese assistants, and on-device personal-assistant applications. It is not recommended for knowledge-intensive tasks. The model performs best when given clear, explicit instructions that define the task, expected behavior, and output format.

エージェント型ワークフロー、ツール使用、構造化出力、日英バイリンガルアシスタント、オンデバイスのパーソナルアシスタントでの利用を推奨します。一方で、詳細な知識を要するタスクには推奨されません。タスク内容、期待される動作、出力形式を明確かつ具体的に指示することで、最も高い性能を発揮します。
### Chat Template
LFM2.5 uses a ChatML-like format. See the [Chat Template documentation](https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example:
```
<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
日本の首都は?<|im_end|>
<|im_start|>assistant
```
You can use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format your messages automatically.
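
To make the layout above concrete, here is a minimal sketch that assembles the same string by hand. The helper `render_chat_prompt` is hypothetical and only mirrors the example shown; the template shipped with the tokenizer may handle more cases (such as tools), so prefer `tokenizer.apply_chat_template()` in real code.

```python
def render_chat_prompt(messages, add_generation_prompt=True):
    """Render messages in the ChatML-like layout shown above (illustrative only)."""
    parts = ["<|startoftext|>"]
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant trained by Liquid AI."},
    {"role": "user", "content": "日本の首都は?"},
]
prompt = render_chat_prompt(messages)
print(prompt)
```

Writing the string out this way is mainly useful for debugging, for example to compare against what the tokenizer's template actually produces.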
### Tool Use
LFM2.5 supports function calling as follows:
1. **Function definition**: We recommend providing the list of tools as a JSON object in the system prompt. You can also use the [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) function with tools.
2. **Function call**: By default, LFM2.5 writes Pythonic function calls (a Python list between the `<|tool_call_start|>` and `<|tool_call_end|>` special tokens) as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
3. **Function execution**: The function call is executed, and the result is returned in a message with the "tool" role.
4. **Final answer**: LFM2.5 interprets the outcome of the function call to address the original user prompt in plain text.
See the [Tool Use documentation](https://docs.liquid.ai/lfm/key-concepts/tool-use) for the full guide. Example:
```
<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "採用プロセスにおける候補者の現在のステータスを取得します", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
候補者ID 12345 の現在のステータスは何ですか?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>候補者ID 12345 の現在のステータスを確認しています。<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
ID 12345 の候補者は現在、Clinical Research Associate のポジションで「面接予定」の段階にあり、面接日は 2023年11月20日に設定されています。<|im_end|>
```
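
The application-side steps of this exchange (defining tools in the system prompt and extracting the Pythonic call from the assistant turn) can be sketched with the standard library alone. The helpers `build_tools_system_prompt` and `parse_tool_calls` are hypothetical names introduced for illustration, not part of any Liquid AI or Transformers API. The parser assumes keyword-only arguments with literal values, as in the example above, and assumes the special tokens are still present in the decoded text.

```python
import ast
import json
import re

# Matches the payload between the tool-call special tokens.
TOOL_CALL_RE = re.compile(r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL)


def build_tools_system_prompt(tools):
    """Serialize tool definitions into a 'List of tools: [...]' system prompt."""
    return "List of tools: " + json.dumps(tools, ensure_ascii=False)


def parse_tool_calls(assistant_text):
    """Return (function_name, kwargs) pairs for each Pythonic call in the text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(assistant_text):
        tree = ast.parse(payload.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of calls")
        for node in tree.body.elts:
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                raise ValueError("expected simple calls like name(arg=value)")
            # literal_eval only accepts literals, so no model-written code is executed
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls


tools = [{
    "name": "get_candidate_status",
    "description": "採用プロセスにおける候補者の現在のステータスを取得します",
    "parameters": {
        "type": "object",
        "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}},
        "required": ["candidate_id"],
    },
}]
system_prompt = build_tools_system_prompt(tools)

reply = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>'
    "候補者ID 12345 の現在のステータスを確認しています。"
)
calls = parse_tool_calls(reply)
print(calls)
```

Parsing with `ast` rather than `eval` keeps the model's output as data: the application decides which function to run and with which arguments, then sends the result back in a "tool" message.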
## 🏃 Inference
LFM2.5 is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.
| Name | Description | Docs | Notebook |
|------|-------------|------|:--------:|
| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | Link | |
| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments with GPU. | Link | |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | Link | |
| [MLX](https://github.com/ml-explore/mlx) | Apple's machine learning framework optimized for Apple Silicon. | Link | — |
| [LM Studio](https://lmstudio.ai/) | Desktop application for running LLMs locally. | Link | — |
Here's a quick start example with Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Hugging Face Hub repository ID for this model
model_id = "LiquidAI/LFM2.5-1.2B-JP-202606"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2"  <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Stream generated tokens to stdout as they are produced
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "日本の首都は?"
# return_dict=True returns input_ids together with the attention mask
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```
## 🔧 Fine-Tuning
We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.
| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for text completion. | Link | |
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for translation. | Link | |
| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | Link | |
| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | Link | |
| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | Link | |
| GRPO ([Unsloth](https://github.com/unslothai/unsloth)) | GRPO with LoRA using Unsloth. | Link | |
| GRPO ([TRL](https://github.com/huggingface/trl)) | GRPO with LoRA using TRL. | Link | |
## 📬 Contact
- Got questions or want to connect? [Join our Discord community](https://discord.com/invite/liquid-ai)
- If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).
## Citation
```bibtex
@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}
```