---
language:
- en
- zh
license: apache-2.0
library_name: transformers
widget:
- text: <s> [|User|] Hi 👋  </s>[|Assistant|]
model-index:
- name: MiniChat-1.5-3B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 46.5
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-1.5-3B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 68.28
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-1.5-3B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 46.67
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-1.5-3B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 50.71
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-1.5-3B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.04
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-1.5-3B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 24.18
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-1.5-3B
      name: Open LLM Leaderboard
---

## MiniChat-1.5-3B

📑 [arXiv](https://arxiv.org/abs/2311.07052) | 👻 [GitHub](https://github.com/GeneZC/MiniMA) | 🤗 [HuggingFace-MiniMA](https://huggingface.co/GeneZC/MiniMA-3B) | 🤗 [HuggingFace-MiniChat](https://huggingface.co/GeneZC/MiniChat-3B) | 🤗 [HuggingFace-MiniChat-1.5](https://huggingface.co/GeneZC/MiniChat-1.5-3B) | 🤖 [ModelScope-MiniMA](https://modelscope.cn/models/GeneZC/MiniMA-3B) | 🤖 [ModelScope-MiniChat](https://modelscope.cn/models/GeneZC/MiniChat-3B)

🆕 **Updates from MiniChat-3B**: 
- better data mixture;
- use of [NEFTune](https://arxiv.org/abs/2310.05914);
- use of [DPO](https://arxiv.org/abs/2305.18290).
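
As a rough illustration of the two techniques linked above, the sketch below follows the formulas in the NEFTune and DPO papers in plain Python. It is not the training code used for this model: the function names, `alpha=5.0`, and `beta=0.1` are illustrative assumptions, and the card does not state the actual hyperparameters.

```python
import math
import random
from typing import List


def neftune_noise(embeddings: List[List[float]], alpha: float, rng: random.Random) -> List[List[float]]:
    """Add NEFTune-style uniform noise, scaled by alpha / sqrt(seq_len * dim)."""
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [[x + rng.uniform(-1.0, 1.0) * scale for x in row] for row in embeddings]


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    """DPO loss for one preference pair; all inputs are sequence log-probabilities."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)) written as log(1 + exp(-beta * margin)).
    return math.log1p(math.exp(-beta * margin))


rng = random.Random(0)
clean = [[0.0] * 8 for _ in range(4)]          # toy 4-token, 8-dim embeddings
noisy = neftune_noise(clean, alpha=5.0, rng=rng)
max_shift = max(abs(v) for row in noisy for v in row)
bound = 5.0 / math.sqrt(4 * 8)                 # largest possible perturbation

neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)     # policy equals reference
better = dpo_loss(-1.0, -3.0, -2.0, -2.0)      # policy prefers the chosen reply
```

When the policy matches the reference, the loss equals `log(2)`. It decreases as the policy assigns relatively more probability to the chosen reply.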

❗ Use of this model must comply with the LLaMA2 license, since the model is derived from LLaMA2.

MiniChat-1.5-3B is a language model distilled and finetuned from an adapted version of LLaMA2-7B, following "Towards the Law of Capacity Gap in Distilling Language Models".

It outperforms a wide range of 3B competitors in GPT4 evaluation and even competes with several 7B chat models.

<img src="./teaser_b.jpg" alt="teaser_b" width="687" />

The following example code snippet shows how to use the model:

```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

# `conversation` is a local module, assumed here to be conversation.py from the GitHub repository linked above.
from conversation import get_default_conv_template

# MiniChat-1.5
tokenizer = AutoTokenizer.from_pretrained("GeneZC/MiniChat-1.5-3B", use_fast=False)
# GPU.
model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniChat-1.5-3B", use_cache=True, device_map="auto", torch_dtype=torch.float16).eval()
# CPU (float32, since float16 ops are not always supported on CPU).
# model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniChat-1.5-3B", use_cache=True, device_map="cpu", torch_dtype=torch.float32).eval()

conv = get_default_conv_template("minichat")

question = "Implement a program to find the common elements in two arrays without using any extra data structures."
conv.append_message(conv.roles[0], question)
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()
input_ids = tokenizer([prompt]).input_ids
output_ids = model.generate(
    torch.as_tensor(input_ids).to(model.device),  # follows the model's device, GPU or CPU
    do_sample=True,
    temperature=0.7,
    max_new_tokens=1024,
)
output_ids = output_ids[0][len(input_ids[0]):]
output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
# output: "def common_elements(arr1, arr2):\n    if len(arr1) == 0:\n        return []\n    if len(arr2) == 0:\n        return arr1\n\n    common_elements = []\n    for element in arr1:\n        if element in arr2:\n            common_elements.append(element)\n\n    return common_elements"
# A multi-turn conversation can be carried out by appending further messages to `conv`.
```
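
For readers who want to see roughly what such a prompt looks like as a string, here is a minimal sketch. The layout is inferred only from the widget example in this card's metadata (`<s> [|User|] Hi 👋 </s>[|Assistant|]`). The helper `build_prompt` is hypothetical, and the way assistant replies and later turns are joined is an assumption, so check it against `conversation.py` before relying on it.

```python
from typing import List, Optional, Tuple

# Markers taken from the widget text in the card metadata.
BOS, EOS = "<s>", "</s>"
USER, ASSISTANT = "[|User|]", "[|Assistant|]"


def build_prompt(turns: List[Tuple[str, Optional[str]]]) -> str:
    """Build a prompt from (user_message, assistant_reply) pairs.

    Hypothetical helper: the last pair may carry None as its reply,
    which leaves the prompt open for the model to complete.
    """
    parts = []
    for user_msg, reply in turns:
        parts.append(f"{BOS} {USER} {user_msg} {EOS}{ASSISTANT}")
        if reply is not None:
            # Assumed layout for a finished assistant turn.
            parts.append(f" {reply} {EOS}")
    return "".join(parts)


single = build_prompt([("Hi 👋", None)])
multi = build_prompt([
    ("Hi 👋", "Hello! How can I help?"),
    ("Tell me a joke.", None),
])
print(single)
print(multi)
```

Note that a tokenizer may add its own beginning-of-sequence token, so a hand-built string like this is best treated as an aid for inspection rather than a replacement for the official template.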

## Bibtex

```bibtex
@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}
```

## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_GeneZC__MiniChat-1.5-3B).

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |50.23|
|AI2 Reasoning Challenge (25-Shot)|46.50|
|HellaSwag (10-Shot)              |68.28|
|MMLU (5-Shot)                    |46.67|
|TruthfulQA (0-shot)              |50.71|
|Winogrande (5-shot)              |65.04|
|GSM8k (5-shot)                   |24.18|
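
The `Avg.` row appears to be the unweighted mean of the six benchmark scores. That is an inference from the numbers rather than something the card states, and it can be checked with a short sketch:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 46.50,
    "HellaSwag (10-Shot)": 68.28,
    "MMLU (5-Shot)": 46.67,
    "TruthfulQA (0-shot)": 50.71,
    "Winogrande (5-shot)": 65.04,
    "GSM8k (5-shot)": 24.18,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 50.23
```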