---
language:
- ja
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- safetensors
- lfm2
- liquid
- lfm2.5
- edge
- conversational
license: other
license_name: lfm1.0
license_link: LICENSE
arxiv:
- 2511.23404
base_model:
- LiquidAI/LFM2.5-1.2B-Base
---
Liquid AI

# 🇯🇵 LFM2.5-1.2B-JP-202606
**LFM2.5-1.2B-JP-202606** is our latest general-purpose Japanese chat model. It delivers significant improvements in knowledge, instruction following, math, code, and tool use over both comparably sized models and [LFM2.5-1.2B-JP](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP). Among the sub-2B models we compared, it achieves the highest average score across our Japanese benchmark suite. It is ideal for developers building Japanese-language applications where cultural and linguistic nuance matters.

Find more information about LFM2.5 in our [blog post](https://www.liquid.ai/blog/introducing-lfm2-5-the-next-generation-of-on-device-ai).

## 📊 Performance
We compared LFM2.5-1.2B-JP-202606 with relevant sub-2B models on a diverse suite of benchmarks.

| Model | Size | JMMLU‑ProX | JMMLU | JCulture | JGPQA | Knowledge Avg | J‑MIFEval | JFBench¹ | Instruction Following Avg | J‑GSM8K | J‑MATH500 | Math Avg | JHumanEval+ (Code) | J‑BFCLv3² (Tool Use) | Domain Avg |
|-------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| LFM2.5‑1.2B‑JP‑202606 | 1.2B | 36.23 | 54.19 | 35.77 | 28.69 | 38.72 | 79.08 | 54.77 | 66.93 | 62.20 | 62.80 | 62.50 | 49.39 | 48.00 | 53.11 |
| LFM2.5‑1.2B‑Instruct | 1.2B | 31.42 | 47.61 | 28.42 | 31.72 | 34.79 | 40.44 | 36.67 | 38.56 | 50.20 | 50.00 | 50.10 | 28.66 | 46.29 | 39.68 |
| Qwen3‑1.7B (Instruct) | 1.7B | 30.78 | 47.67 | 33.33 | 26.26 | 34.51 | 40.29 | 36.61 | 38.45 | 46.00 | 56.40 | 51.20 | 47.56 | 52.45 | 44.83 |
| Granite‑4.0‑1B | 1.5B | 15.32 | 33.93 | 34.38 | 24.44 | 27.02 | 27.56 | 31.26 | 29.41 | 42.80 | 25.40 | 34.10 | 51.22 | 50.57 | 38.46 |
| Llama‑3.2‑1B‑Instruct | 1.2B | 15.91 | 33.97 | 22.52 | 32.32 | 26.18 | 24.10 | 21.78 | 22.94 | 25.20 | 11.40 | 18.30 | 17.68 | 21.06 | 21.23 |
| Gemma‑3‑1B‑it | 1.0B | 14.12 | 34.45 | 23.42 | 24.24 | 24.06 | 26.31 | 31.15 | 28.73 | 33.60 | 15.60 | 24.60 | 25.00 | 17.26 | 23.93 |
| sarashina2.2‑1b‑instruct‑v0.1 | 1.4B | 18.3 | 40.24 | 25.53 | 26.26 | 27.58 | 21.9 | 27.41 | 24.66 | 44.4 | 24.8 | 34.60 | 21.95 | 13.86 | 24.53 |
| TinySwallow‑1.5B‑Instruct | 1.5B | 21.51 | 47.98 | 31.17 | 29.29 | 32.49 | 36.55 | 34.25 | 35.40 | 47.2 | 22.4 | 34.80 | 26.83 | 11.7 | 28.24 |
| llm‑jp‑3.1‑1.8b‑instruct4 | 1.9B | 17.44 | 43.05 | 27.42 | 17.68 | 26.40 | 33.77 | 30.92 | 32.35 | 52.8 | 17.0 | 34.90 | 35.37 | 11.76 | 28.16 |
| RakutenAI‑2.0‑mini‑instruct | 1.5B | 11.46 | 31.84 | 29.67 | 22.22 | 23.80 | 28.06 | 24.66 | 26.36 | 24.8 | 11.4 | 18.10 | 28.6 | 11.85 | 21.74 |

*1 JFBench is evaluated using single-instruction prompts.*
*2 quickTestingOSSHandler is used for models that do not support function calling (sarashina2.2‑1b‑instruct‑v0.1, TinySwallow‑1.5B‑Instruct, llm‑jp‑3.1‑1.8b‑instruct4, and RakutenAI‑2.0‑mini‑instruct).*

## 🗒️ Model Details

| Model | Parameters | Description |
|-------|------------|-------------|
| [LFM2.5-1.2B-Base](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | 1.2B | Pre-trained base model for fine-tuning |
| [LFM2.5-1.2B-Instruct](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | 1.2B | General-purpose instruction-tuned model |
| [LFM2.5-1.2B-Thinking](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking) | 1.2B | General-purpose reasoning model |
| [**LFM2.5-1.2B-JP-202606**](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | 1.2B | Japanese-capable chat model |
| [LFM2.5-VL-1.6B](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | 1.6B | Vision-language model with fast inference |
| [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | 1.5B | Audio-language model for speech and text I/O |
| [LFM2.5-Audio-1.5B-JP](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-JP) | 1.5B | Japanese-capable audio model for speech and text I/O |

LFM2.5-1.2B-JP-202606 is a general-purpose text-only model with the following features:

- **Number of parameters**: 1.17B
- **Number of layers**: 16 (10 double-gated LIV convolution blocks + 6 GQA blocks)
- **Training budget**: 31.5T tokens
- **Context length**: 32,768 tokens
- **Vocabulary size**: 65,536
- **Knowledge cutoff**: Mid-2024
- **Languages**: English, Japanese
- **Generation parameters**:
  - `temperature: 0.1`
  - `top_k: 50`
  - `repetition_penalty: 1.05`

| Model | Description |
|-------|-------------|
| [LFM2.5-1.2B-JP-202606](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606) | Original model checkpoint in native format. Best for fine-tuning or inference with Transformers and vLLM. |
| [LFM2.5-1.2B-JP-202606-GGUF](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-GGUF) | Quantized format for llama.cpp and compatible tools. Optimized for CPU inference and local deployment with reduced memory usage. |
| [LFM2.5-1.2B-JP-202606-ONNX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-ONNX) | ONNX Runtime format for cross-platform deployment. Enables hardware-accelerated inference across diverse environments (cloud, edge, mobile). |
| [LFM2.5-1.2B-JP-202606-MLX](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-202606-MLX-8bit) | MLX format for Apple Silicon. Optimized for fast inference on Mac devices using the MLX framework. |

We recommend this model for agentic workflows, tool use, structured outputs, bilingual English–Japanese assistants, and on-device personal-assistant applications. It is not recommended for knowledge-intensive tasks. It performs best when given clear, explicit instructions that define the task, the expected behavior, and the output format.

### Chat Template

LFM2.5 uses a ChatML-like format. See the [Chat Template documentation](https://docs.liquid.ai/lfm/key-concepts/chat-template) for details. Example:

```
<|startoftext|><|im_start|>system
You are a helpful assistant trained by Liquid AI.<|im_end|>
<|im_start|>user
日本の首都は?<|im_end|>
<|im_start|>assistant
```

You can use [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_templating#using-applychattemplate) to format your messages automatically.

### Tool Use

LFM2.5 supports function calling as follows:

1. **Function definition**: We recommend providing the list of tools as a JSON object in the system prompt. You can also use the [`tokenizer.apply_chat_template()`](https://huggingface.co/docs/transformers/en/chat_extras#passing-tools) function with tools.
2. **Function call**: By default, LFM2.5 writes Pythonic function calls (a Python list between the `<|tool_call_start|>` and `<|tool_call_end|>` special tokens) as the assistant answer. You can override this behavior by asking the model to output JSON function calls in the system prompt.
3. **Function execution**: The function call is executed, and the result is returned as a "tool" role.
4. **Final answer**: LFM2.5 interprets the outcome of the function call to address the original user prompt in plain text.

See the [Tool Use documentation](https://docs.liquid.ai/lfm/key-concepts/tool-use) for the full guide. Example:

```
<|startoftext|><|im_start|>system
List of tools: [{"name": "get_candidate_status", "description": "採用プロセスにおける候補者の現在のステータスを取得します", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "候補者の一意の識別子"}}, "required": ["candidate_id"]}}]<|im_end|>
<|im_start|>user
候補者ID 12345 の現在のステータスは何ですか?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>候補者ID 12345 の現在のステータスを確認しています。<|im_end|>
<|im_start|>tool
[{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}]<|im_end|>
<|im_start|>assistant
ID 12345 の候補者は現在、Clinical Research Associate のポジションで「面接予定」の段階にあり、面接日は 2023年11月20日に設定されています。<|im_end|>
```

## 🏃 Inference

LFM2.5 is supported by many inference frameworks. See the [Inference documentation](https://docs.liquid.ai/lfm/inference/transformers) for the full list.

| Name | Description | Docs | Notebook |
|------|-------------|------|:--------:|
| [Transformers](https://github.com/huggingface/transformers) | Simple inference with direct access to model internals. | Link | Colab link |
| [vLLM](https://github.com/vllm-project/vllm) | High-throughput production deployments with GPU. | Link | Colab link |
| [llama.cpp](https://github.com/ggml-org/llama.cpp) | Cross-platform inference with CPU offloading. | Link | Colab link |
| [MLX](https://github.com/ml-explore/mlx) | Apple's machine learning framework optimized for Apple Silicon. | Link | — |
| [LM Studio](https://lmstudio.ai/) | Desktop application for running LLMs locally. | Link | — |

Here's a quick start example with Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Full Hub repository ID, including the LiquidAI organization prefix.
model_id = "LiquidAI/LFM2.5-1.2B-JP-202606"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "日本の首都は?"
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

# Sampling settings match the recommended generation parameters.
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```

## 🔧 Fine-Tuning

We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.

| Name | Description | Docs | Notebook |
|------|-------------|------|----------|
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for text completion. | Link | Colab link |
| CPT ([Unsloth](https://github.com/unslothai/unsloth)) | Continued Pre-Training using Unsloth for translation. | Link | Colab link |
| SFT ([Unsloth](https://github.com/unslothai/unsloth)) | Supervised Fine-Tuning with LoRA using Unsloth. | Link | Colab link |
| SFT ([TRL](https://github.com/huggingface/trl)) | Supervised Fine-Tuning with LoRA using TRL. | Link | Colab link |
| DPO ([TRL](https://github.com/huggingface/trl)) | Direct Preference Optimization with LoRA using TRL. | Link | Colab link |
| GRPO ([Unsloth](https://github.com/unslothai/unsloth)) | GRPO with LoRA using Unsloth. | Link | Colab link |
| GRPO ([TRL](https://github.com/huggingface/trl)) | GRPO with LoRA using TRL. | Link | Colab link |

## 📬 Contact

- Got questions or want to connect? [Join our Discord community](https://discord.com/invite/liquid-ai)
- If you are interested in custom solutions with edge deployment, please contact [our sales team](https://www.liquid.ai/contact).

## Citation

```bibtex
@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}
```
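
As a follow-up to the Tool Use section, the function-execution step has to be handled on the application side. The following is a minimal sketch, assuming the default Pythonic call format shown in the Tool Use example and keyword-only arguments with literal values. The names `parse_tool_calls`, `run_tool_calls`, the `TOOLS` registry, and the stub `get_candidate_status` are hypothetical and introduced here for illustration only; they are not part of any Liquid AI or Transformers API.

```python
import ast
import json
import re

# Matches the text between the tool-call special tokens.
TOOL_CALL_RE = re.compile(
    r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL
)


def parse_tool_calls(assistant_text):
    """Return a list of (function_name, kwargs) tuples found in the text."""
    calls = []
    for block in TOOL_CALL_RE.findall(assistant_text):
        # Parse the block as a Python expression without executing it.
        tree = ast.parse(block.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of function calls")
        for node in tree.body.elts:
            if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
                raise ValueError("expected plain function calls")
            if node.args or any(kw.arg is None for kw in node.keywords):
                raise ValueError("only keyword arguments are handled in this sketch")
            # literal_eval accepts only literals, so no model output is executed.
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls


def get_candidate_status(candidate_id):
    """Hypothetical stub; a real application would query its own backend."""
    return {"candidate_id": candidate_id, "status": "Interview Scheduled"}


# Hypothetical registry mapping tool names to local Python callables.
TOOLS = {"get_candidate_status": get_candidate_status}


def run_tool_calls(assistant_text):
    """Execute every parsed call and wrap the results as a "tool" message."""
    results = [TOOLS[name](**kwargs) for name, kwargs in parse_tool_calls(assistant_text)]
    return {"role": "tool", "content": json.dumps(results, ensure_ascii=False)}


reply = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]'
    "<|tool_call_end|>候補者ID 12345 の現在のステータスを確認しています。"
)
tool_message = run_tool_calls(reply)
```

The resulting `tool_message` can be appended to the conversation before generating again, so the model can produce its final plain-text answer. Parsing with `ast` rather than `eval` is a deliberate choice in this sketch: the model output is treated as data and never executed directly.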