We introduce mCoT, a 7B-parameter model for multilingual math reasoning that achieves strong reasoning consistency across languages.
Based on Mistral-7B-v0.1, mCoT is trained on mCoT-MATH, the first large-scale multilingual math CoT reasoning dataset containing around 6.3 million samples for 11 diverse languages.
# Template
template = "Question: \n{question} \nAnswer: \n{language}\n"

# Language prompt
bn = "আসুন ধাপে ধাপে চিন্তা করি।"
de = "Denken wir Schritt für Schritt."
en = "Let's think step by step."
es = "Pensemos paso a paso."
fr = "Réfléchissons étape par étape."
ja = "段階的に考えてみましょう。"
ru = "Давайте думать поэтапно."
sw = "Hebu fikiria hatua kwa hatua."
te = "అంచెలంచెలుగా ఆలోచిద్దాం."
th = "ลองคิดทีละขั้นตอน"
zh = "让我们一步步思考。"

# Math question
math_en = "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?"

# An example for the English question
prompt = template.format(question=math_en, language=en)
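The template and language prompts above can also be kept in a dictionary, so that the prompt is selected by language code. The following is a minimal sketch, not part of the released code: the names `TEMPLATE`, `LANGUAGE_PROMPTS`, and `build_prompt` are hypothetical helpers introduced here for illustration, and it assumes the question is written in the same language as the chosen prompt.

```python
# Hypothetical helper: selects the step-by-step cue by language code and
# fills the template from this README. Names are illustrative only.
TEMPLATE = "Question: \n{question} \nAnswer: \n{language}\n"

LANGUAGE_PROMPTS = {
    "bn": "আসুন ধাপে ধাপে চিন্তা করি।",
    "de": "Denken wir Schritt für Schritt.",
    "en": "Let's think step by step.",
    "es": "Pensemos paso a paso.",
    "fr": "Réfléchissons étape par étape.",
    "ja": "段階的に考えてみましょう。",
    "ru": "Давайте думать поэтапно.",
    "sw": "Hebu fikiria hatua kwa hatua.",
    "te": "అంచెలంచెలుగా ఆలోచిద్దాం.",
    "th": "ลองคิดทีละขั้นตอน",
    "zh": "让我们一步步思考。",
}


def build_prompt(question: str, lang: str) -> str:
    """Return the formatted prompt for a question in the given language."""
    if lang not in LANGUAGE_PROMPTS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return TEMPLATE.format(question=question, language=LANGUAGE_PROMPTS[lang])


math_en = (
    "A robe takes 2 bolts of blue fiber and half that much white fiber. "
    "How many bolts in total does it take?"
)

# Build the prompt for the English question.
prompt = build_prompt(math_en, "en")
print(prompt)
```

Passing an unknown code such as `"xx"` raises a `ValueError` instead of silently producing a prompt without a reasoning cue.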
Citation
If you use any content from this repository, please cite our paper:
@inproceedings{lai-etal-2024-mcot,
title = "mCoT: Multilingual Instruction Tuning for Reasoning Consistency
in Language Models",
author = "Lai, Huiyuan and Nissim, Malvina",
booktitle = "Proceedings of the 62nd Annual Meeting of the Association
for Computational Linguistics",
month = aug,
address = "Bangkok, Thailand",
year = "2024",
publisher = "Association for Computational Linguistics"
}