---
language:
- ja
license: apache-2.0
library_name: transformers
base_model: llm-jp/llm-jp-4-8b-instruct
pipeline_tag: text-generation
tags:
- math
- distillation
- japanese
- elementary-education
- chuugaku-juken
- 算数
- sft
- qlora
---
(Japanese Follows)

# llm-jp-4-8b-instruct-sansu

QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), distilled from Claude Sonnet 4.6 on **3,189 filtered Japanese elementary-school math (中学受験 算数) training examples** with step-by-step solutions. Target audience: 5th- and 6th-graders (小学5–6年生) preparing for private middle-school entrance exams.

This model is best understood as a **style/format distillation artifact**. It tends to produce concise, numbered 算数-style explanations (つるかめ算, 面積図 / area diagrams, 線分図 / line-segment diagrams, etc.) rather than the algebraic, LaTeX-heavy responses that the base model often defaults to.

> **Research artifact, not a general-purpose AI service.** This is a narrow-domain student model produced via knowledge distillation from Claude Sonnet 4.6 for non-competing research and educational use, consistent with Anthropic's Commercial Terms §D.4. It is **not** a substitute for Claude/GPT-style general assistants. See [Intended use](#intended-use-research--education) and [Out of scope](#out-of-scope) below before deploying.

> **Correctness warning:** do not use this model as an answer oracle. A follow-up audit found multiple wrong final answers in both the Sonnet-generated teacher/reference data and this student model's outputs. The supported claim is improved explanation style/readability, not verified answer accuracy.

### Overview

This model is a QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), distilled from Claude Sonnet 4.6 on **3,189 filtered step-by-step solutions** to Japanese elementary-school math problems used in 中学受験 (private middle-school entrance exams). The raw training split contained 3,213 examples; 24 extremely short records were filtered out before training. Target audience: serious 5th- and 6th-grade learners.
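The length filter mentioned above could look roughly like the following. This is a minimal sketch, not the actual preprocessing used for this release: the field names `problem` and `solution` and the `MIN_CHARS` cutoff are assumptions, since the card only states that 24 extremely short records were dropped.

```python
# Hypothetical length filter; field names and threshold are assumptions.
MIN_CHARS = 20  # assumed cutoff, not taken from the real pipeline


def is_long_enough(record: dict, min_chars: int = MIN_CHARS) -> bool:
    """Keep a record only if both problem and solution have enough text."""
    return (
        len(record["problem"].strip()) >= min_chars
        and len(record["solution"].strip()) >= min_chars
    )


def filter_short_records(records: list) -> list:
    """Drop records whose problem or solution text is extremely short."""
    return [r for r in records if is_long_enough(r)]


if __name__ == "__main__":
    sample = [
        {
            "problem": "270mの道の端から端まで9mおきに木を植えます。木は何本必要ですか。",
            "solution": "間の数は 270 ÷ 9 = 30。両端に植えるので 30 + 1 = 31本。",
        },
        {"problem": "？", "solution": "3"},  # too short, dropped
    ]
    kept = filter_short_records(sample)
    print(len(sample), "->", len(kept))
```

A character-count cutoff is only one possible criterion; a token-count or line-count rule would work the same way.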

This should be treated as a **style and formatting distillation model**, not an answer-verification model. It tends to produce concise, numbered explanations using *elementary-school arithmetic methods* (つるかめ算 / "crane-and-turtle" method, 差集め算 / "difference-gathering" method, 面積図 / area diagrams, ratios, etc.) rather than algebraic equations with variables, which fall outside the Japanese elementary curriculum.
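One rough way to monitor the style target described above is to flag outputs that look algebraic. The sketch below is a heuristic under stated assumptions, not an evaluation used for this release: the regular expressions, the keyword list, and the function name `looks_algebraic` are all illustrative, and the check will miss or over-flag some cases.

```python
import re

# Assumed signals of an algebraic answer: LaTeX markup, a Latin variable
# (x, y, z) next to an operator, or explicit algebra vocabulary.
_LATEX = re.compile(r"\\(?:frac|times|div|cdot|begin|left|right)\b|\$[^$\n]+\$")
_VARIABLE = re.compile(
    r"(?<![A-Za-z])[xyzXYZ]\s*[=+\-×÷*/]|[=+\-×÷*/]\s*[xyzXYZ](?![A-Za-z])"
)
_KEYWORDS = ("方程式", "未知数")


def looks_algebraic(text: str) -> bool:
    """Return True if the explanation appears to use algebra or LaTeX."""
    if _LATEX.search(text) or _VARIABLE.search(text):
        return True
    return any(keyword in text for keyword in _KEYWORDS)


if __name__ == "__main__":
    print(looks_algebraic("りんごの数を x とおくと、x + 3 = 10 です。"))
    print(looks_algebraic("間の数は 270 ÷ 9 = 30。両端に植えるので 30 + 1 = 31本。"))
```

A check like this only measures surface style; it says nothing about whether the final answer is correct.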

### Intended use (research & education)

- Generating step-by-step explanations for elementary-school math problems
- Creating practice / teaching material for 中学受験 students
- Tutoring support where explanations must stay within the elementary curriculum
- Academic and practical research into Japanese-language SLM distillation

### Out of scope

- **Use as a general-purpose AI assistant / chatbot** (this is a narrow-domain artifact)
- Automated answer checking or final-answer generation without independent verification
- Unsupervised educational content shown directly to children
- Middle/high-school math (equations, functions, trigonometry)
- Non-math Japanese tasks (reading comprehension, science, social studies)
- General Japanese chat (not evaluated)
- Use as the foundation of a product or service that competes with Claude, GPT-style assistants, or other general-purpose AI offerings

### Commercial use note

This model is released under Apache 2.0, but **part of the training corpus was generated by Claude Sonnet 4.6**, and use of those outputs is subject to Anthropic's [Commercial Terms §D.4](https://www.anthropic.com/legal/commercial-terms) (restrictions on building competing AI products). Anyone considering commercial deployment should review Anthropic's current terms independently.

### Limitations and known issues

1. **Correctness is out of scope for this release.** Always check the final answer against an official or human-verified source. Spot checks showed that answers produced by Sonnet 4.6 are not always correct.
2. **Distillation ceiling**: this model is bounded by Claude Sonnet 4.6's elementary-math capability. If Sonnet makes a mistake, the student model can inherit it.
3. **The training "solutions" were generated by Sonnet, not authored by human 塾 teachers.** Phrasing may diverge from published model-answer guides.
4. **Not evaluated on non-math tasks.** Performance on reading comprehension, science, social studies, or general chat is unknown.
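Because final answers need independent verification, a downstream pipeline might compare the last number in a generated explanation with a human-verified answer key. The following is a minimal sketch under that assumption; `extract_final_number` and `matches_answer_key` are hypothetical helpers, and taking the last number in the text is a crude rule that fails when an explanation ends with a remark or a unit conversion.

```python
import re
from fractions import Fraction
from typing import Optional

# Non-negative integers, decimals, or simple fractions such as 3/4.
_NUMBER = re.compile(r"\d+(?:/[1-9]\d*|\.\d+)?")
# Map full-width digits and punctuation, common in Japanese text, to ASCII.
_FULLWIDTH = str.maketrans("０１２３４５６７８９．／", "0123456789./")


def extract_final_number(text: str) -> Optional[Fraction]:
    """Return the last number in the text as an exact Fraction, or None."""
    normalized = text.translate(_FULLWIDTH).replace(",", "")
    matches = _NUMBER.findall(normalized)
    return Fraction(matches[-1]) if matches else None


def matches_answer_key(model_output: str, verified_answer: str) -> bool:
    """Compare the model's last number with a human-verified answer."""
    predicted = extract_final_number(model_output)
    expected = extract_final_number(verified_answer)
    return predicted is not None and predicted == expected


if __name__ == "__main__":
    output = "間の数は 270 ÷ 9 = 30。両端に植えるので 30 + 1 = 31本。"
    print(matches_answer_key(output, "31本"))
```

Outputs that fail such a check, or for which no verified answer exists, should go to a human reviewer rather than being shown to students.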

### How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ynakazat11/llm-jp-4-8b-instruct-sansu"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="bfloat16", device_map="auto"
)

system = (
    "あなたは中学受験算数を教える先生です。"
    "問題文を読み、小学生にもわかるように、算数の手法（つるかめ算・差集め算・"
    "面積図・線分図・比など）を使って段階的に解説してください。"
    "文字式・方程式・代数（x, y などの未知数を立てる方法）は使わないでください。"
)
user = "270mの道の端から端まで桜の木を植えます。木と木の間隔を9mにすると、木は何本植えられますか。"

messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
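# do_sample=True lets temperature take effect; otherwise decoding is typically greedy.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)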
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
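For the sample prompt above, the expected answer can be worked out independently, which helps when checking the model's output by hand. The sketch below applies the usual 植木算 (tree-planting) reasoning for trees planted at both ends of a road; the function name `trees_with_both_ends` is hypothetical.

```python
def trees_with_both_ends(road_length_m: int, spacing_m: int) -> int:
    """Number of trees when trees stand at both ends of a straight road."""
    if spacing_m <= 0 or road_length_m % spacing_m != 0:
        raise ValueError("spacing must be positive and divide the road length evenly")
    intervals = road_length_m // spacing_m  # 270 ÷ 9 = 30 gaps between trees
    return intervals + 1  # one more tree than gaps, because both ends are planted


print(trees_with_both_ends(270, 9))  # 31
```

A generated explanation that ends with 30 rather than 31 has counted the gaps instead of the trees.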

### License

Released under Apache 2.0. Please also review the license of the base model (`llm-jp/llm-jp-4-8b-instruct`).

### Citation

```bibtex
@misc{nakazato2026sansu,
  title  = {llm-jp-4-8b-instruct-sansu: A distilled SLM for Japanese elementary math instruction},
  author = {Yuki Nakazato},
  year   = {2026},
  url    = {https://huggingface.co/ynakazat11/llm-jp-4-8b-instruct-sansu},
  note   = {QLoRA distillation from Claude Sonnet 4.6 on Japanese 中学受験 math problems.},
}
```

## 🇯🇵 日本語

### モデル概要

中学受験算数（小学5-6年生レベル）の問題に対して、段階的で読みやすい解説形式を出すように、[`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct) をQLoRAで微調整したモデルです。

教師モデル（Claude Sonnet 4.6）が生成した3,189件の解説を蒸留学習しています。元の訓練候補は3,213問で、極端に短い問題・解説を除外した後の3,189件を学習に使いました。出力は方程式・代数を避け、**つるかめ算・差集め算・面積図・線分図・比** など算数の解法に寄せる方向で調整されています。

重要：このモデルは「正答率を保証するモデル」ではありません。訓練解説はSonnet生成であり、監査では教師データ・モデル出力の両方に誤答が見つかっています。

### 用途（研究・教育目的）

- 中学受験対策（5-6年生向け）の算数解説生成
- 算数の指導教材作成支援
- 小学校算数の問題に対する段階的解説
- 日本語SLMの蒸留手法に関する学術・実務研究

### 想定外の用途

- **汎用AIアシスタント／チャットボット用途**（本モデルは特定ドメイン専用です）
- 自動採点、正答生成、答え合わせの最終判断
- 子どもへ無監督で提示する教材生成
- 中学・高校以降の数学（方程式、関数、三角比など）の解説
- 国語、理科、社会など算数以外の科目
- 算数以外の日本語タスク全般（性能評価していません）
- Claude／GPT等のフロンティアAIサービスと競合するプロダクトの基盤としての使用

### 商用利用について

本モデルはApache 2.0ライセンスでリリースしていますが、**訓練データの一部はClaude Sonnet 4.6が生成しており**、Anthropic社の[Commercial Terms](https://www.anthropic.com/legal/commercial-terms)第D.4条（競合製品の禁止）の制約を受けます。本モデルを商用利用する際は、利用者ご自身でAnthropicの最新の利用規約を確認してください。

### 注意・既知の制約

1. **正答率は今回のスコープ外です。** 最終的な答えは必ず人間または公式解答で確認してください。Sonnet 4.6の回答に誤答が複数検知されています。
2. **蒸留の上限は教師モデル（Claude Sonnet 4.6）の能力に依存します。** Sonnet 4.6 が間違える問題では、本モデルも間違える可能性があります。
3. **訓練データの解説はSonnet 4.6が生成したものであり、塾講師の解説そのものではありません。** 模範解答書の解説とは表現が異なる場合があります。
4. **算数以外のタスクには評価していません。** 国語の文章題、理科、社会、雑談などへの性能は不明です。

### 使用例

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ynakazat11/llm-jp-4-8b-instruct-sansu"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype="bfloat16", device_map="auto")

system = (
    "あなたは中学受験算数を教える先生です。"
    "問題文を読み、小学生にもわかるように、算数の手法（つるかめ算・差集め算・"
    "面積図・線分図・比など）を使って段階的に解説してください。"
    "文字式・方程式・代数（x, y などの未知数を立てる方法）は使わないでください。"
)
user = "270mの道の端から端まで桜の木を植えます。木と木の間隔を9mにすると、木は何本植えられますか。"

messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
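# do_sample=True lets temperature take effect; otherwise decoding is typically greedy.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)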
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

### ライセンス

本モデルはApache 2.0でリリースしています。ベースモデル（`llm-jp/llm-jp-4-8b-instruct`）のライセンスも併せてご確認ください。

# Uploaded finetuned model

- **Developed by:** ynakazat11
- **License:** apache-2.0
- **Finetuned from model:** llm-jp/llm-jp-4-8b-instruct

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)