* add conversion for bge-m3; small fix in unigram tokenizer
* clean up and simplify XLMRoberta conversion