---
language:
- ja
license: apache-2.0
library_name: transformers
base_model: llm-jp/llm-jp-4-8b-instruct
pipeline_tag: text-generation
tags:
- math
- distillation
- japanese
- elementary-education
- chuugaku-juken
- 算数
- sft
- qlora
---

(Japanese Follows)

# llm-jp-4-8b-instruct-sansu

A QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), distilled from Claude Sonnet 4.6 on **3,189 filtered Japanese elementary-school math (中学受験 算数) training examples** with step-by-step solutions. Target audience: 5th- and 6th-grade students preparing for junior high school entrance exams.

This model is best understood as a **style/format distillation artifact**. It tends to produce concise, numbered 算数-style explanations (つるかめ算, 面積図, 線分図, etc.) rather than the algebraic / LaTeX-heavy responses that the base model often defaults to.

> **Research artifact, not a general-purpose AI service.** This is a narrow-domain student model produced via knowledge distillation from Claude Sonnet 4.6 for non-competing research and educational use, consistent with Anthropic's Commercial Terms §D.4. It is **not** a substitute for Claude/GPT-style general assistants. See [Intended use](#intended-use--%E7%94%A8%E9%80%94) and [Out of scope](#out-of-scope--%E6%83%B3%E5%AE%9A%E5%A4%96%E3%81%AE%E7%94%A8%E9%80%94) below before deploying.

> **Correctness warning:** Do not use this model as an answer oracle. A follow-up audit found multiple wrong final answers in both the Sonnet-generated teacher/reference data and this student model's outputs. The supported claim is improved explanation style and readability, not verified answer accuracy.

### Overview

This model is a QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), distilled from Claude Sonnet 4.6 on **3,189 filtered step-by-step solutions** to Japanese elementary-school math problems used in 中学受験 (private middle-school entrance exams). The raw training split contained 3,213 examples; 24 extremely short records were filtered out before training, leaving 3,189. Target audience: serious 5th- and 6th-grade learners.

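The short-record filter mentioned above could look something like the following minimal sketch. The field names `problem` and `solution` and the `min_chars` threshold are assumptions made for illustration; this card does not state the actual dataset schema or cutoff.

```python
def drop_short_records(records, min_chars=20):
    """Keep records whose problem and solution text both reach min_chars.

    The keys `problem` / `solution` and the default threshold are
    hypothetical; the real schema and cutoff are not documented here.
    """
    return [
        rec for rec in records
        if len(rec["problem"].strip()) >= min_chars
        and len(rec["solution"].strip()) >= min_chars
    ]


# Two made-up records: one normal, one extremely short.
sample = [
    {
        "problem": "270mの道の端から端まで9mおきに木を植えます。木は何本必要ですか。",
        "solution": "間の数は 270 ÷ 9 = 30。両端に植えるので 30 + 1 = 31本。",
    },
    {"problem": "計算しなさい。", "solution": "5"},
]
kept = drop_short_records(sample)
print(len(sample) - len(kept), "record(s) dropped")
```

A character-count cutoff is only one possible definition of "extremely short"; a token-count cutoff would be an equally plausible reading.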
This should be treated as a **style and formatting distillation model**, not an answer-verification model. It tends to produce concise, numbered explanations using *elementary-school arithmetic methods* (つるかめ算 / "crane-and-turtle" method, 差集め算 / "difference-gathering" method, area diagrams, ratios, etc.) rather than algebraic equations with variables, which fall outside the Japanese elementary curriculum.

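To make the distinction concrete, here is a minimal sketch of つるかめ算 reasoning written as plain arithmetic steps with no unknowns. The function name and the example numbers are illustrative only and do not come from the training data.

```python
def tsurukame(heads, legs, crane_legs=2, turtle_legs=4):
    """Solve a crane-and-turtle problem by the 'assume all cranes' method."""
    # Step 1: pretend every animal is a crane and count the legs.
    legs_if_all_cranes = heads * crane_legs
    # Step 2: find how many legs are still missing from the real total.
    shortfall = legs - legs_if_all_cranes
    # Step 3: swapping one crane for one turtle adds (4 - 2) = 2 legs.
    legs_per_swap = turtle_legs - crane_legs
    if shortfall < 0 or shortfall % legs_per_swap != 0:
        raise ValueError("no whole-number answer for these totals")
    turtles = shortfall // legs_per_swap
    if turtles > heads:
        raise ValueError("more turtles than animals")
    return heads - turtles, turtles


# 10 animals with 28 legs in total:
# all cranes -> 20 legs, shortfall 8, 8 / 2 = 4 turtles, so 6 cranes.
cranes, turtles = tsurukame(heads=10, legs=28)
print(cranes, turtles)  # 6 4
```

An algebraic solution would instead set up x + y = 10 and 2x + 4y = 28, which is the style the system prompt asks the model to avoid.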
### Intended use (research & education)

- Generating step-by-step explanations for elementary-school math problems
- Creating practice / teaching material for 中学受験 students
- Tutoring support where explanations must stay within the elementary curriculum
- Academic and practical research into Japanese-language SLM distillation

### Out of scope

- **Use as a general-purpose AI assistant / chatbot** (this is a narrow-domain artifact)
- Automated answer checking or final-answer generation without independent verification
- Unsupervised educational content shown directly to children
- Middle/high-school math (equations, functions, trigonometry)
- Non-math Japanese tasks (reading comprehension, science, social studies)
- General Japanese chat (not evaluated)
- Use as the foundation of a product or service that competes with Claude, GPT-style assistants, or other general-purpose AI offerings

### Commercial use note

This model is released under Apache 2.0, but **part of the training corpus was generated by Claude Sonnet 4.6**, and use of those outputs is subject to Anthropic's [Commercial Terms §D.4](https://www.anthropic.com/legal/commercial-terms) (restrictions on building competing AI products). Anyone considering commercial deployment should review Anthropic's current terms independently.

### Limitations and known issues

1. **Correctness is out of scope for this release.** Always check the final answer against an official or human-verified source. Spot checks showed that answers produced by Sonnet 4.6 are not always correct.
2. **Distillation ceiling**: this model is bounded by Claude Sonnet 4.6's elementary-math capability. If Sonnet makes a mistake, the student model can inherit it.
3. **The training "solutions" were generated by Sonnet, not authored by human 塾 teachers.** Phrasing may diverge from published model-answer guides.
4. **Not evaluated on non-math tasks.** Performance on reading comprehension, science, social studies, or general chat is unknown.

### How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ynakazat11/llm-jp-4-8b-instruct-sansu"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="bfloat16", device_map="auto"
)

# System prompt (Japanese): act as a 中学受験 算数 teacher, explain step by step
# using arithmetic methods (つるかめ算, 差集め算, 面積図, 線分図, ratios), and do not
# use letter expressions, equations, or algebra with unknowns such as x and y.
system = (
    "あなたは中学受験算数を教える先生です。"
    "問題文を読み、小学生にもわかるように、算数の手法(つるかめ算・差集め算・"
    "面積図・線分図・比など)を使って段階的に解説してください。"
    "文字式・方程式・代数(x, y などの未知数を立てる方法)は使わないでください。"
)
# Problem: cherry trees are planted from end to end along a 270 m road
# at 9 m intervals. How many trees can be planted?
user = "270mの道の端から端まで桜の木を植えます。木と木の間隔を9mにすると、木は何本植えられますか。"

messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled, so set do_sample explicitly.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

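Because final answers need independent verification, a generated explanation for the sample problem can be compared against the answer worked out by hand: 270 ÷ 9 = 30 gaps, and planting at both ends gives 30 + 1 = 31 trees. The sketch below shows one rough way to do that. The helpers `last_number` and `expected_trees` and the two sample strings are hypothetical, and treating the last number in the text as the final answer is a heuristic that will not hold for every output (for example, it would misread a number written with a thousands separator).

```python
import re
import unicodedata


def last_number(text):
    """Return the last integer that appears in text, or None if there is none."""
    # NFKC folds full-width digits (e.g. '31') into ASCII digits.
    normalized = unicodedata.normalize("NFKC", text)
    matches = re.findall(r"\d+", normalized)
    return int(matches[-1]) if matches else None


def expected_trees(road_m, gap_m):
    """Trees planted at both ends of the road: number of gaps plus one."""
    return road_m // gap_m + 1


# Made-up model outputs: one correct, one that forgets the tree at the far end.
good = "1. 間の数は 270 ÷ 9 = 30\n2. 両端に植えるので 30 + 1 = 31\n答え:31本"
bad = "1. 270 ÷ 9 = 30\n答え:30本"

print(last_number(good) == expected_trees(270, 9))  # True
print(last_number(bad) == expected_trees(270, 9))   # False
```

A check like this only covers problems whose answer has already been verified by a person or an official key; it does not turn the model into an answer checker.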
### License

Released under Apache 2.0. Please also review the license of the base model (`llm-jp/llm-jp-4-8b-instruct`).

### Citation

```
@misc{nakazato2026sansu,
  title  = {llm-jp-4-8b-instruct-sansu: A distilled SLM for Japanese elementary math instruction},
  author = {Yuki Nakazato},
  year   = {2026},
  url    = {https://huggingface.co/ynakazat11/llm-jp-4-8b-instruct-sansu},
  note   = {QLoRA distillation from Claude Sonnet 4.6 on Japanese 中学受験 math problems.},
}
```

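## 🇯🇵 Japanese edition

### Model overview

This model is a QLoRA fine-tune of [`llm-jp/llm-jp-4-8b-instruct`](https://huggingface.co/llm-jp/llm-jp-4-8b-instruct), tuned to produce step-by-step, easy-to-read explanations for 中学受験 算数 problems (5th- to 6th-grade level).

It is distilled from 3,189 explanations generated by the teacher model (Claude Sonnet 4.6). The original training candidates comprised 3,213 problems; the 3,189 that remained after excluding extremely short problems and explanations were used for training. Outputs are tuned to avoid equations and algebra and to lean on arithmetic methods such as **つるかめ算, 差集め算, 面積図, 線分図, and ratios**.

Important: this model does not guarantee answer accuracy. The training explanations were generated by Sonnet, and an audit found wrong answers in both the teacher data and the model's outputs.
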
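### Intended use (research and education)

- Generating 算数 explanations for 中学受験 preparation (5th and 6th graders)
- Supporting the creation of 算数 teaching materials
- Step-by-step explanations of elementary-school arithmetic problems
- Academic and practical research on distillation methods for Japanese SLMs
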
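### Out-of-scope uses

- **General-purpose AI assistant / chatbot use** (this model is dedicated to a specific domain)
- Automated grading, generating correct answers, or acting as the final judge when checking answers
- Generating teaching material that is shown to children without supervision
- Explaining mathematics at junior-high or high-school level and beyond (equations, functions, trigonometric ratios, etc.)
- Subjects other than 算数, such as Japanese language, science, and social studies
- Japanese-language tasks other than 算数 in general (performance has not been evaluated)
- Use as the foundation of a product that competes with frontier AI services such as Claude or GPT
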
### On commercial use

This model is released under the Apache 2.0 license, but **part of the training data was generated by Claude Sonnet 4.6** and is therefore subject to Section D.4 (prohibition on competing products) of Anthropic's [Commercial Terms](https://www.anthropic.com/legal/commercial-terms). Anyone using this model commercially should check Anthropic's latest terms for themselves.

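### Cautions and known limitations

1. **Answer accuracy is out of scope for this release.** Always confirm the final answer with a human or an official answer key. Multiple wrong answers have been detected in Sonnet 4.6's responses.
2. **The ceiling of distillation depends on the capability of the teacher model (Claude Sonnet 4.6).** On problems that Sonnet 4.6 gets wrong, this model may get them wrong too.
3. **The explanations in the training data were generated by Sonnet 4.6 and are not explanations written by 塾 instructors.** Wording may differ from the explanations in published model-answer books.
4. **Not evaluated on tasks other than 算数.** Performance on Japanese-language (国語) reading problems, science, social studies, casual chat, and similar tasks is unknown.
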
### Usage example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ynakazat11/llm-jp-4-8b-instruct-sansu"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="bfloat16", device_map="auto"
)

# The system prompt and problem stay in Japanese because the model was tuned on Japanese text.
system = (
    "あなたは中学受験算数を教える先生です。"
    "問題文を読み、小学生にもわかるように、算数の手法(つるかめ算・差集め算・"
    "面積図・線分図・比など)を使って段階的に解説してください。"
    "文字式・方程式・代数(x, y などの未知数を立てる方法)は使わないでください。"
)
user = "270mの道の端から端まで桜の木を植えます。木と木の間隔を9mにすると、木は何本植えられますか。"

messages = [{"role": "system", "content": system}, {"role": "user", "content": user}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled, so set do_sample explicitly.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

### License

This model is released under Apache 2.0. Please also check the license of the base model (`llm-jp/llm-jp-4-8b-instruct`).

# Uploaded finetuned model

- **Developed by:** ynakazat11
- **License:** apache-2.0
- **Finetuned from model:** llm-jp/llm-jp-4-8b-instruct

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)