LJVM and AK acknowledge the support of the UKRI Frontier Grant EP/Y031350/1 (EQUATE).
This work was performed using joint resources provided by the Cambridge Service for Data Driven Discovery (CSD3; EP/T022159/1), the Isambard AI National AI Research Resource (AIRR; ST/AIRR/I-A-I/1023), and the Microsoft Research Grant.
LJVM would also like to thank Songbo Hu, Chen Cecilia Liu, Millicent Ochieng, and Felermino Ali for helpful and productive discussions on the project.
Citation
@misc{miranda2026polyglotteachersevaluatinglanguage,
  title={Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation},
  author={Lester James V. Miranda and Ivan Vulić and Anna Korhonen},
  year={2026},
  eprint={2604.11290},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2604.11290},
}