---
license: llama3.2
pipeline_tag: text-generation
---
**Llama3.2-Typhoon2-1B**: Thai Large Language Model (Base)

**Llama3.2-Typhoon2-1B** is a pretrained-only Thai 🇹🇭 large language model with 1 billion parameters, based on Llama3.2-1B.

For the technical report, please see our [arXiv paper](https://arxiv.org/abs/2412.13702).

*To acknowledge Meta's effort in creating the foundation model and to comply with the license, we explicitly include "llama-3.2" in the model name.*
## **Performance**
| Model | ThaiExam | ONET | IC | A-Level | TGAT | TPAT | M3Exam | Math | Science | Social | Thai |
|------------------------|-----------|----------|-----------|-----------|-----------|-----------|-----------|------------|------------|------------|------------|
| **Typhoon2 Llama3.2 1B Base** | **26.83%** | **19.75%** | 16.84% | 17.32% | **49.23%** | **31.03%** | **26.10%** | 21.71% | **25.60%** | **32.83%** | 24.27% |
| **Llama3.2 1B** | 25.38% | 18.51% | **20.00%** | **26.77%** | 32.30% | 29.31% | 25.30% | **23.52%** | 25.36% | 27.48% | **24.82%** |
## **Model Description**
- **Model type**: A 1B decoder-only model based on Llama architecture.
- **Requirement**: transformers 4.45.0 or newer.
- **Primary Language(s)**: Thai 🇹🇭 and English 🇬🇧
- **License**: [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)
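The model can be loaded with the standard `transformers` text-generation workflow. A minimal sketch, assuming the Hugging Face repo id `scb10x/llama3.2-typhoon2-1b` (inferred from this card's directory name, so treat it as an assumption) and `transformers` 4.45.0 or newer:

```python
# Minimal sketch: load the base model and complete a prompt.
# The repo id below is an assumption taken from this card's path.
MODEL_ID = "scb10x/llama3.2-typhoon2-1b"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Complete `prompt` with the base model (greedy decoding by default)."""
    # Import lazily so this module loads even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("ประเทศไทยมีกี่จังหวัด"))
```

`torch_dtype="auto"` lets `from_pretrained` pick the dtype stored in the checkpoint instead of defaulting to float32.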
## **Intended Uses & Limitations**
This model is a pretrained base model. As such, it may not follow human instructions without one/few-shot prompting or instruction fine-tuning. The model has no moderation mechanisms and may generate harmful or inappropriate responses.
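Because the model is pretrained-only, prompts usually embed a few worked examples so the model continues the pattern. A minimal sketch of a few-shot prompt builder; the Q/A format and the example pairs are illustrative, not part of this card:

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Format (question, answer) demonstration pairs plus a new question
    as a completion-style prompt ending in 'A:' for the model to continue."""
    blocks = [f"Q: {question}\nA: {answer}" for question, answer in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)
```

The trailing `A:` invites the model to emit only the answer; passing the result to a text-generation call with a stop condition on `"\n"` keeps the completion short.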
## **Follow us**
**https://twitter.com/opentyphoon**
## **Support**
**https://discord.gg/us5gAYmrxw**
## **Citation**
If you find Typhoon2 useful for your work, please cite it using:
```bibtex
@misc{typhoon2,
  title={Typhoon 2: A Family of Open Text and Multimodal Thai Large Language Models},
  author={Kunat Pipatanakul and Potsawee Manakul and Natapong Nitarach and Warit Sirichotedumrong and Surapon Nonesung and Teetouch Jaknamon and Parinthapat Pengpun and Pittawat Taveekitworachai and Adisai Na-Thalang and Sittipong Sripaisarnmongkol and Krisanapong Jirayoot and Kasima Tharnpipitchai},
  year={2024},
  eprint={2412.13702},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.13702},
}
```