---
language:
- ja
license: apache-2.0
library_name: transformers
base_model: llm-jp/llm-jp-4-8b-instruct
pipeline_tag: text-generation
tags:
- math
- distillation
- japanese
- elementary-education
- chuugaku-juken
- 算数
- sft
- qlora
---

(Japanese follows)

# llm-jp-4-8b-instruct-sansu

QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), distilled from Claude Sonnet 4.6 on **3,189 filtered Japanese elementary-school math (中学受験 算数) training examples** with step-by-step solutions. Target audience: 5th- and 6th-grade students (5–6年生) preparing for middle-school (中学校) entrance exams.

This model is best understood as a **style/format distillation artifact**. It tends to produce concise, numbered 算数-style explanations (つるかめ算, 面積図, 線分図, etc.) rather than the algebraic / LaTeX-heavy responses that the base model often defaults to.

> **Research artifact, not a general-purpose AI service.** This is a narrow-domain student model produced via knowledge distillation from Claude Sonnet 4.6 for non-competing research and educational use, consistent with Anthropic's Commercial Terms §D.4. It is **not** a substitute for Claude/GPT-style general assistants. See [Intended use](#intended-use-research--education) and [Out of scope](#out-of-scope) below before deploying.

> **Correctness warning:** Do not use this model as an answer oracle. A follow-up audit found multiple wrong final answers in both the Sonnet-generated teacher/reference data and this student model's outputs. The supported claim is improved explanation style/readability, not verified answer accuracy.

### Overview

This model is a QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), distilled from Claude Sonnet 4.6 on **3,189 filtered step-by-step solutions** to Japanese elementary-school math problems used in 中学受験 (private middle-school entrance exams).
The raw training split contained 3,213 examples; 24 extremely short records were filtered out before training. The target audience is serious 5th- and 6th-grade learners.

This should be treated as a **style and formatting distillation model**, not an answer-verification model. It tends to produce concise, numbered explanations using *elementary-school arithmetic methods* (つるかめ算 / "crane-and-turtle" method, 差集め算 / "difference-gathering" method, area diagrams, ratios, etc.) rather than algebraic equations with variables, which fall outside the Japanese elementary curriculum.

### Intended use (research & education)

- Generating step-by-step explanations for elementary-school math problems
- Creating practice / teaching material for 中学受験 students
- Tutoring support where explanations must stay within the elementary curriculum
- Academic and practical research into Japanese-language SLM distillation

### Out of scope

- **Use as a general-purpose AI assistant / chatbot** (this is a narrow-domain artifact)
- Automated answer checking or final-answer generation without independent verification
- Unsupervised educational content shown directly to children
- Middle/high-school math (equations, functions, trigonometry)
- Non-math Japanese tasks (reading comprehension, science, social studies)
- General Japanese chat (not evaluated)
- Use as the foundation of a product or service that competes with Claude, GPT-style assistants, or other general-purpose AI offerings

### Commercial use note

This model is released under Apache 2.0, but **part of the training corpus was generated by Claude Sonnet 4.6**, and use of those outputs is subject to Anthropic's [Commercial Terms §D.4](https://www.anthropic.com/legal/commercial-terms) (restrictions on building competing AI products). Anyone considering commercial deployment should review Anthropic's current terms independently.

### Limitations and known issues

1. **Correctness is out of scope for this release.** Always check the final answer against an official or human-verified source. Spot checks found that answers produced by Sonnet 4.6 are not always correct.
2. **Distillation ceiling**: this model is bounded by Claude Sonnet 4.6's elementary-math capability. If Sonnet makes a mistake, the student model can inherit it.
3. **The training "solutions" were generated by Sonnet, not authored by human 塾 (cram-school) teachers.** Phrasing may diverge from published model-answer guides.
4. **Not evaluated on non-math tasks.** Performance on reading comprehension, science, social studies, or general chat is unknown.

### How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ynakazat11/llm-jp-4-8b-instruct-sansu"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="bfloat16", device_map="auto"
)

# System prompt: act as a 中学受験 算数 teacher, explain step by step with
# arithmetic methods, and avoid algebra with unknowns such as x and y.
system = (
    "あなたは中学受験算数を教える先生です。"
    "問題文を読み、小学生にもわかるように、算数の手法(つるかめ算・差集め算・"
    "面積図・線分図・比など)を使って段階的に解説してください。"
    "文字式・方程式・代数(x, y などの未知数を立てる方法)は使わないでください。"
)
# Sample problem: trees planted every 9 m along a 270 m road, end to end.
user = "270mの道の端から端まで桜の木を植えます。木と木の間隔を9mにすると、木は何本植えられますか。"

messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True makes the temperature setting take effect; without it,
# generate() may decode greedily and ignore temperature, depending on the
# model's generation config.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

### License

Released under Apache 2.0. Please also review the license of the base model (`llm-jp/llm-jp-4-8b-instruct`).
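Because this card asks readers to verify final answers independently, the sample prompt from How to use can be checked with plain arithmetic before trusting any generated explanation. The sketch below is only an illustration and is not part of the released model or its audit. The helper name `trees_on_both_ends` is hypothetical, and it assumes the usual 植木算 reading of the problem, in which a tree stands at each end of the road.

```python
def trees_on_both_ends(road_length_m: int, spacing_m: int) -> int:
    """Count trees planted from one end of a road to the other (植木算).

    Hypothetical helper for illustration only. Assumes a tree at both ends
    and a spacing that divides the road length exactly.
    """
    if spacing_m <= 0 or road_length_m % spacing_m != 0:
        raise ValueError("spacing must be positive and divide the road length")
    intervals = road_length_m // spacing_m  # number of gaps between trees
    return intervals + 1  # with a tree at both ends: one more tree than gaps


# Sample prompt from How to use: 270 m road, trees every 9 m.
print(trees_on_both_ends(270, 9))  # 31
```

Comparing this value with the number stated at the end of the model's explanation is a manual step here; the card does not describe an automated answer extractor.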
### Citation

```
@misc{nakazato2026sansu,
  title  = {llm-jp-4-8b-instruct-sansu: A distilled SLM for Japanese elementary math instruction},
  author = {Yuki Nakazato},
  year   = {2026},
  url    = {https://huggingface.co/ynakazat11/llm-jp-4-8b-instruct-sansu},
  note   = {QLoRA distillation from Claude Sonnet 4.6 on Japanese 中学受験 math problems.},
}
```

## 🇯🇵 日本語

### モデル概要

中学受験算数(小学5-6年生レベル)の問題に対して、段階的で読みやすい解説形式を出すように、[`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct) をQLoRAで微調整したモデルです。

教師モデル(Claude Sonnet 4.6)が生成した3,189件の解説を蒸留学習しています。元の訓練候補は3,213問で、極端に短い問題・解説を除外した後の3,189件を学習に使いました。出力は方程式・代数を避け、**つるかめ算・差集め算・面積図・線分図・比** など算数の解法に寄せる方向で調整されています。

重要:このモデルは「正答率を保証するモデル」ではありません。訓練解説はSonnet生成であり、監査では教師データ・モデル出力の両方に誤答が見つかっています。

### 用途(研究・教育目的)

- 中学受験対策(5-6年生向け)の算数解説生成
- 算数の指導教材作成支援
- 小学校算数の問題に対する段階的解説
- 日本語SLMの蒸留手法に関する学術・実務研究

### 想定外の用途

- **汎用AIアシスタント/チャットボット用途**(本モデルは特定ドメイン専用です)
- 自動採点、正答生成、答え合わせの最終判断
- 子どもへ無監督で提示する教材生成
- 中学・高校以降の数学(方程式、関数、三角比など)の解説
- 国語、理科、社会など算数以外の科目
- 算数以外の日本語タスク全般(性能評価していません)
- Claude/GPT等のフロンティアAIサービスと競合するプロダクトの基盤としての使用

### 商用利用について

本モデルはApache 2.0ライセンスでリリースしていますが、**訓練データの一部はClaude Sonnet 4.6が生成しており**、Anthropic社の[Commercial Terms](https://www.anthropic.com/legal/commercial-terms)第D.4条(競合製品の禁止)の制約を受けます。本モデルを商用利用する際は、利用者ご自身でAnthropicの最新の利用規約を確認してください。

### 注意・既知の制約

1. **正答率は今回のスコープ外です。** 最終的な答えは必ず人間または公式解答で確認してください。Sonnet 4.6の回答に誤答が複数検知されています。
2. **蒸留の上限は教師モデル(Claude Sonnet 4.6)の能力に依存します。** Sonnet 4.6 が間違える問題では、本モデルも間違える可能性があります。
3. **訓練データの解説はSonnet 4.6が生成したものであり、塾講師の解説そのものではありません。** 模範解答書の解説とは表現が異なる場合があります。
4. **算数以外のタスクでは評価していません。** 国語の文章題、理科、社会、雑談などへの性能は不明です。

### 使用例

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ynakazat11/llm-jp-4-8b-instruct-sansu"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="bfloat16", device_map="auto"
)

# システムプロンプト:算数の手法で段階的に解説し、方程式・代数は使わない
system = (
    "あなたは中学受験算数を教える先生です。"
    "問題文を読み、小学生にもわかるように、算数の手法(つるかめ算・差集め算・"
    "面積図・線分図・比など)を使って段階的に解説してください。"
    "文字式・方程式・代数(x, y などの未知数を立てる方法)は使わないでください。"
)
user = "270mの道の端から端まで桜の木を植えます。木と木の間隔を9mにすると、木は何本植えられますか。"

messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True を指定すると temperature が有効になります
# (指定しない場合、生成設定によっては temperature が無視されます)
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)

# プロンプト部分を除き、新しく生成されたトークンのみをデコード
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

### ライセンス

本モデルはApache 2.0でリリースしています。ベースモデル(`llm-jp/llm-jp-4-8b-instruct`)のライセンスも併せてご確認ください。

# Uploaded finetuned model

- **Developed by:** ynakazat11
- **License:** apache-2.0
- **Finetuned from model:** llm-jp/llm-jp-4-8b-instruct

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.