- Avoid output like : `[' K', '<0x64>', '<0x79>', 'ť', ' a', '<0x75>', 'to', 'bu', '<0x73>', '<0x75>', ... ]` with regular 500 BPE units. - Don't rewrite 1-char tokens in range [ 0x20 (space) .. 0x7E (tilde) ]
- Avoid output like : `[' K', '<0x64>', '<0x79>', 'ť', ' a', '<0x75>', 'to', 'bu', '<0x73>', '<0x75>', ... ]` with regular 500 BPE units. - Don't rewrite 1-char tokens in range [ 0x20 (space) .. 0x7E (tilde) ]