初始化项目,由ModelHub XC社区提供模型

Model: Columbia-NLP/LION-Gemma-2b-odpo-v1.0
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-28 23:56:19 +08:00
commit a1988ac48a
11 changed files with 468 additions and 0 deletions

36
.gitattributes vendored Normal file
View File

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

108
README.md Normal file
View File

@@ -0,0 +1,108 @@
---
library_name: transformers
tags: []
---
# Model Card for LION-Gemma-2b-odpo-v1.0
The LION-series are trained using an **empirically optimized pipeline** that consists of three stages: SFT, DPO, and online preference learning (online DPO). We find simple techniques such as sequence packing, loss masking in SFT, increasing the preference dataset size in DPO, and online DPO training can significantly improve the performance of language models. Our best models (the LION-series) **exceed the performance of the official instruct models** tuned with closed-source data and algorithms.
For training datasets, code, and evaluation scripts, please refer to our [paper](https://arxiv.org/abs/2407.06542) and [codebase](https://github.com/Columbia-NLP-Lab/LionAlignment).
## Model description
This model is finetuned from [`Columbia-NLP/LION-Gemma-2b-dpo-v1.0`](https://huggingface.co/Columbia-NLP/LION-Gemma-2b-dpo-v1.0) using online DPO from the LION pipeline.
- **Model type:** [`gemma-2b`](https://huggingface.co/google/gemma-2b)
- **Language(s) (NLP):** Primarily English
- **License:** Gemma Terms of Use
- **Finetuned from model:** [`Columbia-NLP/LION-Gemma-2b-dpo-v1.0`](https://huggingface.co/Columbia-NLP/LION-Gemma-2b-dpo-v1.0)
## Performance
| Model | Method | Size | Arena-Hard | AlpacaEval-2 | MT-Bench | OpenLLM |
|-------------|--------|------|------:|------:|---------:|-------:|
|[Gemma-2b](https://huggingface.co/google/gemma-2b) | - | 2B | - | - | - | 46.69 |
|[Gemma-2b-it](https://huggingface.co/google/gemma-2b-it) | SFT+RLHF | 2B | 3.4 | 5.44 | 5.63 | 42.75 |
|[Gemma-2b-zephyr](https://huggingface.co/wandb/gemma-2b-zephyr-dpo) | SFT+DPO | 2B | 0.9 | 2.65 | 4.13 | 46.92 |
|[LLaMA-2-7b-chat](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) | SFT | 7B | 4.6 | 5.35 | 6.22 | 53.16 |
|[Vicuna-7b-v1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5) | SFT | 7B | 2.5 | 7.62 | 6.57 | 52.06 |
|[LION-Gemma-2b-sft-v1.0 (ours)](https://huggingface.co/Columbia-NLP/LION-Gemma-2b-sft-v1.0) | SFT | 2B | 2.4 | 7.79 | 6.37 | 54.78 |
|[LION-Gemma-2b-dpo-v1.0 (ours)](https://huggingface.co/Columbia-NLP/LION-Gemma-2b-dpo-v1.0) | SFT+DPO | 2B | 4.6 | 8.75 | 6.58 | 55.35 |
|⮕ [LION-Gemma-2b-odpo-v1.0 (ours)](https://huggingface.co/Columbia-NLP/LION-Gemma-2b-odpo-v1.0) | SFT+DPO+ODPO | 2B | 5.0 | 9.57 | 6.75 | 55.98 |
## Intended uses
To ensure reproducibility, please use the following chat templates:
```python
import torch
from transformers import pipeline
pipe = pipeline(
"text-generation",
model="Columbia-NLP/LION-Gemma-2b-odpo-v1.0",
device_map="auto",
torch_dtype=torch.bfloat16,
)
messages = [
{
"role": "system",
"content": "",
},
{
"role": "user",
"content": "Write a short paragraph where every sentence start with the letter A."
},
]
outputs = pipe(
messages,
max_new_tokens=128,
do_sample=True,
temperature=0.7,
top_p=0.7,
stop_sequence="<|im_end|>",
)
print(outputs[0]["generated_text"][-1]["content"])
# Alice always aspired to acquire ample adventure.
# Astonishingly, amid abundant allurements, Alice allocated ample attention to each activity, ensuring an array of adventures.
# Albeit anxieties arose, Alice assuaged them with affirmations, ardently advancing ambitiously towards an array of adventures.
```
to inspect the chat template/manually do generation:
```python
tokenizer = AutoTokenizer.from_pretrained("Columbia-NLP/LION-Gemma-2b-odpo-v1.0")
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# tokenize prompt and use model.generate
```
### Training details
Please refer to our [paper](https://arxiv.org/abs/2407.06542) and [codebase](https://github.com/Columbia-NLP-Lab/LionAlignment).
## Citation Information
If you find this model useful in your work, please consider citing our paper:
```
@misc{yu2024lionsempiricallyoptimizedapproach,
title={LIONs: An Empirically Optimized Approach to Align Language Models},
author={Xiao Yu and Qingyang Wu and Yu Li and Zhou Yu},
year={2024},
eprint={2407.06542},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2407.06542},
}
```
## Acknowledgements
We thank the Columbia-NLP group and [articulate.ai](https://www.articulateai.com/) for providing OpenAI API credits and computational resources to conduct our experiments.

29
config.json Normal file
View File

@@ -0,0 +1,29 @@
{
"_name_or_path": "../model_checkpoints_coffee/oil/lion-gemma-v0.1/gemma-2b-lion-v0.6-165k-2epoch-online-60k-bigmix-beta0.05-epoch1-bsz64-zero1",
"architectures": [
"GemmaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 2,
"eos_token_id": 1,
"head_dim": 256,
"hidden_act": "gelu",
"hidden_activation": null,
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 16384,
"max_position_embeddings": 8192,
"model_type": "gemma",
"num_attention_heads": 8,
"num_hidden_layers": 18,
"num_key_value_heads": 1,
"pad_token_id": 0,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 10000.0,
"torch_dtype": "float32",
"transformers_version": "4.39.1",
"use_cache": false,
"vocab_size": 256000
}

8
generation_config.json Normal file
View File

@@ -0,0 +1,8 @@
{
"_from_model_config": true,
"bos_token_id": 2,
"eos_token_id": 1,
"pad_token_id": 0,
"transformers_version": "4.39.1",
"use_cache": false
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7e9bb09bd06490787ae2ef476a8d241a43b884503ec1a651819658e62b8c37e0
size 4911635192

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fcfbff053edb3288072d5eac12f4b179e4daeb60d5c230c9e8b479a0ce26a768
size 4978830584

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4bb7830d254efd1ed4ca015a94880aba249eb980123190d1aa182369bb7a7c1e
size 134242760

View File

@@ -0,0 +1,171 @@
{
"metadata": {
"total_size": 10024689664
},
"weight_map": {
"model.embed_tokens.weight": "model-00001-of-00003.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.norm.weight": "model-00003-of-00003.safetensors"
}
}

34
special_tokens_map.json Normal file
View File

@@ -0,0 +1,34 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>"
],
"bos_token": {
"content": "<bos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<eos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:22449cb9ef4bad0db7dd93b46ddff7ab7d6a654dd4f903e130ddb6361eac3af5
size 17477473

70
tokenizer_config.json Normal file
View File

@@ -0,0 +1,70 @@
{
"add_bos_token": false,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<eos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "<bos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"3": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"106": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"107": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>"
],
"bos_token": "<bos>",
"chat_template": "{% if messages[0]['role'] == 'user' or messages[0]['role'] == 'system' %}{{ bos_token }}{% endif %}{% for message in messages %}{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% elif messages[-1]['role'] == 'assistant' %}{{ eos_token }}{% endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "<eos>",
"legacy": null,
"model_max_length": 8192,
"pad_token": "<pad>",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "GemmaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false
}