Align tokenizer with mistral-common (#45)

- Align tokenizer with mistral-common (53f216c52ce4534a38a71c21861acd514fa8a904)
- Defend the honour of the Hugging Face tokenizer (684c1751c210aa11e0b187c0eac1b7b2bd4d7967)
- Update to tokenizer v3 with the correct special tokens (106a1b0c338ddbd0e3e42dbeb63634bc85d6f71b)
- Re-add chat template (3256c7e7ea279386e0cdd18553202ed78c4d735b)

Co-authored-by: Matthew Carrigan <Rocketknight1@users.noreply.huggingface.co>
ai-modelscope
2024-07-31 08:03:57 +08:00
parent 634e800d57
commit 04ccb687cb
44 changed files with 13150 additions and 48 deletions

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:37f00374dea48658ee8f5d0f21895b9bc55cb0103939607c8185bfd1c6ca1f89
-size 587404
+oid sha256:9addc8bdce5988448ae81b729336f43a81262160ae8da760674badab9d4c7d33
+size 587591
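
The hunk above changes only a Git LFS pointer: a small text file that records the SHA-256 digest and byte size of the real object, which is stored elsewhere. As a minimal sketch of how such a pointer could be checked against a downloaded file, the following parses the three pointer fields and compares them with a blob. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, not part of Git LFS or of this repository, and the name of the file this pointer belongs to is not shown in the commit view.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(blob: bytes, pointer: dict) -> bool:
    """Return True if the blob has the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(blob) == pointer["size"]
        and hashlib.sha256(blob).hexdigest() == pointer["digest"]
    )


# The new pointer contents, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9addc8bdce5988448ae81b729336f43a81262160ae8da760674badab9d4c7d33
size 587591
"""

pointer = parse_lfs_pointer(NEW_POINTER)
```

To check a local copy, one would read the file's bytes and call `matches_pointer(data, pointer)`; a `False` result suggests the download is incomplete or belongs to a different revision.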