Model: tartuNLP/Llama-2-7b-Ukrainian

| library_name | license | datasets | language | pipeline_tag | base_model |
|---|---|---|---|---|---|
| transformers | llama2 | CulturaX | uk, en | text-generation | Llama-2-7b |
Llama-2-7b-Ukrainian
Model Details
Model Description
Llama-2-7b-Ukrainian is a bilingual pre-trained model supporting Ukrainian and English. It was continually pre-trained from Llama-2-7b on 5B tokens, of which 75% are Ukrainian documents and 25% are English documents from CulturaX.
Paper: To Err Is Human, but Llamas Can Learn It Too
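A minimal usage sketch with the transformers library (the repo id `tartuNLP/Llama-2-7b-Ukrainian` is taken from this card; the bf16 dtype matches the training precision above, while the generation settings are illustrative assumptions, not values from the card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tartuNLP/Llama-2-7b-Ukrainian"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and generate a continuation.

    Downloads the weights on first call; requires a GPU (or enough RAM)
    for a 7B-parameter model.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the bf16 training precision
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

As a base (non-instruction-tuned) model, it is best prompted with text to continue rather than chat-style instructions.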
Training Hyperparameters
| Hyperparameter | Value |
|---|---|
| Training steps | 19080 |
| Batch size | 256 |
| Weight decay | 0.1 |
| Context length | 1024 |
| Learning rate | 2e-5, linear decay to 2e-6 |
| Precision | bf16 |
| Optimizer | AdamW |
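As a sanity check, the hyperparameters above are consistent with the stated 5B-token budget, assuming the batch size counts sequences of full context length:

```python
# Tokens seen during continued pre-training, assuming each batch holds
# 256 sequences of 1024 tokens each.
steps = 19080
batch_size = 256       # sequences per step
context_length = 1024  # tokens per sequence

total_tokens = steps * batch_size * context_length
print(total_tokens)  # 5001707520, i.e. ~5B tokens as stated above
```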
Citation
BibTeX:
@article{luhtaru2024err,
title={To Err Is Human, but Llamas Can Learn It Too},
author={Luhtaru, Agnes and Purason, Taido and Vainikko, Martin and Del, Maksym and Fishel, Mark},
journal={arXiv preprint arXiv:2403.05493},
year={2024}
}