Llama-2-7b-Ukrainian is a bilingual pre-trained model supporting Ukrainian and English.
It was obtained by continued pre-training of Llama-2-7b on 5B tokens,
consisting of 75% Ukrainian documents and 25% English documents from CulturaX.
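
To make the data mixture concrete, here is a minimal sketch of the per-language token budget it implies. It assumes the 75% / 25% split applies to the 5B-token budget rather than to raw document counts, which the description above does not spell out. The names `TOTAL_TOKENS`, `MIX`, and `token_budget` are hypothetical and are not taken from the training code.

```python
from fractions import Fraction

TOTAL_TOKENS = 5_000_000_000  # 5B tokens of continued pre-training

# Assumption: the shares are applied to tokens, not to document counts.
MIX = {"uk": Fraction(75, 100), "en": Fraction(25, 100)}


def token_budget(total, mix):
    """Split a total token budget across languages according to their shares."""
    if sum(mix.values()) != 1:
        raise ValueError("mixture shares must sum to 1")
    # Exact rational arithmetic avoids floating-point rounding in the split.
    return {lang: int(total * share) for lang, share in mix.items()}


budget = token_budget(TOTAL_TOKENS, MIX)
for lang, tokens in budget.items():
    print(f"{lang}: {tokens / 1e9:.2f}B tokens")
```

Under that assumption, the mixture comes to roughly 3.75B Ukrainian tokens and 1.25B English tokens.
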
@article{luhtaru2024err,
title={To Err Is Human, but Llamas Can Learn It Too},
author={Luhtaru, Agnes and Purason, Taido and Vainikko, Martin and Del, Maksym and Fishel, Mark},
journal={arXiv preprint arXiv:2403.05493},
year={2024}
}