Initialize project; model provided by the ModelHub XC community
Model: Edentns/DataVortexTL-1.1B-v0.1 Source: Original Platform
35
.gitattributes
vendored
Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
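As an illustration of how these rules apply to the files in this commit, the sketch below checks a few file names against a subset of the patterns using Python's `fnmatch`. It is an approximation, not Git's own matcher: it assumes that patterns without a slash are matched against the file's base name, and it leaves out `saved_model/**/*`. `LFS_PATTERNS` and `is_lfs_tracked` are names made up for this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Subset of the slash-free patterns from .gitattributes (assumption: this
# subset is enough for the illustration; saved_model/**/* is omitted).
LFS_PATTERNS = ["*.bin", "*.model", "*.pt", "*.safetensors", "*.zip", "*tfevents*"]


def is_lfs_tracked(path: str) -> bool:
    """Return True if the base name of `path` matches one of LFS_PATTERNS."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for path in ["model.safetensors", "tokenizer.model", "tokenizer.json", "config.json"]:
    print(path, is_lfs_tracked(path))
```

Under these assumptions the result lines up with the rest of the commit: `model.safetensors` and `tokenizer.model` are stored as Git LFS pointers, while the JSON files are stored as plain text.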
BIN
DataVortex.png
Normal file
Binary file not shown.
After: Size: 3.5 KiB
132
README.md
Normal file
@@ -0,0 +1,132 @@
---
tags:
  - text-generation
license: cc-by-nc-sa-4.0
language:
  - ko
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
pipeline_tag: text-generation
datasets:
  - beomi/KoAlpaca-v1.1a
  - jojo0217/korean_rlhf_dataset
  - kyujinpy/OpenOrca-KO
  - nlpai-lab/kullm-v2
widget:
  - text: >
      <|system|>

      You are a chatbot who answers User's questions.

      <|user|>

      대한민국의 수도는 어디야?

      <|assistant|>
---

# **DataVortexTL-1.1B-v0.1**

<img src="./DataVortex.png" alt="DataVortex" style="height: 8em;">

## Our Team

| Research & Engineering | Product Management |
| :--------------------: | :----------------: |
| Kwangseok Yang         | Seunghyun Choi     |
| Jeongwon Choi          | Hyoseok Choi       |

## **Model Details**

### **Base Model**

[TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)

### **Trained On**

- **OS**: Ubuntu 20.04
- **GPU**: 1 × H100 80GB
- **transformers**: v4.36.2

### **Dataset**

- [beomi/KoAlpaca-v1.1a](https://huggingface.co/datasets/beomi/KoAlpaca-v1.1a)
- [jojo0217/korean_rlhf_dataset](https://huggingface.co/datasets/jojo0217/korean_rlhf_dataset)
- [kyujinpy/OpenOrca-KO](https://huggingface.co/datasets/kyujinpy/OpenOrca-KO)
- [nlpai-lab/kullm-v2](https://huggingface.co/datasets/nlpai-lab/kullm-v2)

### **Instruction format**

It follows the **TinyLlama** chat format.

Example:

```python
text = """\
<|system|>
당신은 사람들이 정보를 찾을 수 있도록 도와주는 인공지능 비서입니다.</s>
<|user|>
대한민국의 수도는 어디야?</s>
<|assistant|>
대한민국의 수도는 서울입니다.</s>
<|user|>
서울 인구는 총 몇 명이야?</s>
"""
```
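
The same string can also be assembled from a list of messages. The following is a minimal sketch, not the authors' code: `build_prompt` is a hypothetical helper, and it assumes each turn is rendered as a role header line, the message content, the `</s>` end-of-sequence token, and a newline, as in the example above.

```python
EOS = "</s>"  # end-of-sequence token, as it appears in the example above


def build_prompt(messages, add_generation_prompt=False):
    """Render chat messages in the TinyLlama-style format (illustrative helper)."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("system", "user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        # One turn: "<|role|>", newline, content, EOS, newline.
        parts.append(f"<|{role}|>\n{message['content']}{EOS}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its answer.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "당신은 사람들이 정보를 찾을 수 있도록 도와주는 인공지능 비서입니다."},
    {"role": "user", "content": "대한민국의 수도는 어디야?"},
    {"role": "assistant", "content": "대한민국의 수도는 서울입니다."},
    {"role": "user", "content": "서울 인구는 총 몇 명이야?"},
]

prompt = build_prompt(messages)
print(prompt)
```

Passing `add_generation_prompt=True` appends an open `<|assistant|>` turn, which is usually what you want when asking the model for the next reply.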

## **Model Benchmark**

### **[Ko LM Eval Harness](https://github.com/Beomi/ko-lm-evaluation-harness)**

| Task             |         0-shot |         5-shot |        10-shot |      50-shot |
| :--------------- | -------------: | -------------: | -------------: | -----------: |
| kobest_boolq     |       0.334282 |       0.516446 |       0.500478 |     0.498941 |
| kobest_copa      |       0.515061 |       0.504321 |       0.492927 |      0.50809 |
| kobest_hellaswag |        0.36253 |       0.357733 |       0.355873 |     0.376502 |
| kobest_sentineg  |       0.481146 |       0.657411 |       0.687417 |     0.635703 |
| **Average**      | **0.42325475** | **0.50897775** | **0.50917375** | **0.504809** |

### **[Ko-LLM-Leaderboard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard)**

| Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
| ------: | -----: | -----------: | ------: | ------------: | --------------: |
|    31.5 |  25.26 |        33.53 |   24.56 |         43.34 |           30.81 |
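
The reported averages appear to be unweighted means of the individual scores. The short check below recomputes them from the numbers in the two tables above; the variable names are chosen for this example only.

```python
from statistics import mean

# Per-task scores copied from the Ko LM Eval Harness table
# (order: boolq, copa, hellaswag, sentineg).
harness_scores = {
    "0-shot": [0.334282, 0.515061, 0.36253, 0.481146],
    "5-shot": [0.516446, 0.504321, 0.357733, 0.657411],
    "10-shot": [0.500478, 0.492927, 0.355873, 0.687417],
    "50-shot": [0.498941, 0.50809, 0.376502, 0.635703],
}
harness_averages = {shots: mean(scores) for shots, scores in harness_scores.items()}

# Scores copied from the Ko-LLM-Leaderboard row
# (order: Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, Ko-CommonGen V2).
leaderboard_average = mean([25.26, 33.53, 24.56, 43.34, 30.81])

for shots, average in harness_averages.items():
    print(f"{shots}: {average:.8f}")
print(f"leaderboard: {leaderboard_average:.2f}")
```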

## **Implementation Code**

The tokenizer ships with a `chat_template` that implements this instruction format.
You can use the code below.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("Edentns/DataVortexTL-1.1B-v0.1")
tokenizer = AutoTokenizer.from_pretrained("Edentns/DataVortexTL-1.1B-v0.1")

# Korean conversation: a system prompt ("You are an AI assistant that helps people
# find information."), a question about the capital of South Korea, the answer
# (Seoul), and a follow-up question about Seoul's population.
messages = [
    {"role": "system", "content": "당신은 사람들이 정보를 찾을 수 있도록 도와주는 인공지능 비서입니다."},
    {"role": "user", "content": "대한민국의 수도는 어디야?"},
    {"role": "assistant", "content": "대한민국의 수도는 서울입니다."},
    {"role": "user", "content": "서울 인구는 총 몇 명이야?"}
]

# Render the conversation with the bundled chat template and tokenize it.
# add_generation_prompt=True appends the <|assistant|> header so the model
# answers the last user turn.
encodeds = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

model_inputs = encodeds.to(device)
model.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

## **License**

The model is licensed under the [cc-by-nc-sa-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license, which allows others to copy, modify, and share the work non-commercially, as long as they give appropriate credit and distribute any derivative works under the same license.

<div align="center">
    <a href="https://edentns.com/">
        <img src="./Logo.png" alt="Logo" style="height: 3em;">
    </a>
</div>
28
config.json
Normal file
@@ -0,0 +1,28 @@
{
  "_name_or_path": "Edentns/DataVortexTL-1.1B-v0.1",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 5632,
  "max_position_embeddings": 2048,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 22,
  "num_key_value_heads": 4,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.36.2",
  "use_cache": true,
  "vocab_size": 32000
}
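To relate these values to the "1.1B" in the model name, the sketch below estimates the parameter count from the config. It assumes the usual `LlamaForCausalLM` weight layout (bias-free q/k/v/o projections, a gate/up/down MLP, two RMSNorm weights per layer plus a final norm, and a separate `lm_head` because `tie_word_embeddings` is false). `count_llama_params` is a hypothetical helper written for this example, not part of any library.

```python
config = {
    "hidden_size": 2048,
    "intermediate_size": 5632,
    "num_attention_heads": 32,
    "num_hidden_layers": 22,
    "num_key_value_heads": 4,
    "tie_word_embeddings": False,
    "vocab_size": 32000,
}


def count_llama_params(cfg):
    """Estimate the parameter count under the layout assumptions stated above."""
    hidden = cfg["hidden_size"]
    head_dim = hidden // cfg["num_attention_heads"]   # 64 for this config
    kv_dim = head_dim * cfg["num_key_value_heads"]    # grouped-query attention: 256
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q_proj, o_proj, k_proj, v_proj
    mlp = 3 * hidden * cfg["intermediate_size"]            # gate_proj, up_proj, down_proj
    norms = 2 * hidden                                     # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = hidden
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


total_params = count_llama_params(config)
bf16_bytes = total_params * 2  # bfloat16 stores 2 bytes per parameter
print(total_params, bf16_bytes)
```

Under these assumptions the estimate is 1,100,048,384 parameters, or 2,200,096,768 bytes in bfloat16. That is about 23 KB less than the 2,200,119,864-byte size recorded for `model.safetensors` in this commit, a gap that would be consistent with file metadata.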
7
generation_config.json
Normal file
@@ -0,0 +1,7 @@
{
  "bos_token_id": 1,
  "eos_token_id": 2,
  "max_length": 2048,
  "pad_token_id": 0,
  "transformers_version": "4.36.2"
}
3
model.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d1e85fb779e33786c72ae49ce2d557849ca75e3f510ddbe0a6d62b3e993554d2
size 2200119864
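The three lines above are a Git LFS pointer, not the weights themselves: each line is a key followed by a space and a value. A minimal sketch of reading such a pointer is shown below; `parse_lfs_pointer` is a hypothetical helper for illustration and does no validation beyond splitting the fields.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its fields (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d1e85fb779e33786c72ae49ce2d557849ca75e3f510ddbe0a6d62b3e993554d2
size 2200119864
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```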
24
special_tokens_map.json
Normal file
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "</s>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
93391
tokenizer.json
Normal file
File diff suppressed because it is too large
3
tokenizer.model
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
42
tokenizer_config.json
Normal file
@@ -0,0 +1,42 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "chat_template": "{% for message in messages %}\n{% if message['role'] == 'user' %}\n{{ '<|user|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'system' %}\n{{ '<|system|>\n' + message['content'] + eos_token }}\n{% elif message['role'] == 'assistant' %}\n{{ '<|assistant|>\n' + message['content'] + eos_token }}\n{% endif %}\n{% if loop.last and add_generation_prompt %}\n{{ '<|assistant|>' }}\n{% endif %}\n{% endfor %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": false,
  "model_max_length": 2048,
  "pad_token": "</s>",
  "padding_side": "right",
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}