Initialize project; model provided by the ModelHub XC community
Model: bobofrut/ladybird-base-7B-v8 Source: Original Platform
35
.gitattributes
vendored
Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
87
README.md
Normal file
@@ -0,0 +1,87 @@
---
license: apache-2.0
language:
- en
tags:
- mistral
- text-generation-inference
- conversational
- finetuned
---

# Ladybird-base-7B-v8

Welcome to the repository of Ladybird-base-7B-v8, a fine-tuned 7B Large Language Model (LLM) that grew out of my research and learning in Artificial Intelligence (AI), with a particular focus on LLMs. It marks a significant milestone in my effort to understand and contribute to AI technologies.

## About the Creator

As an avid learner and researcher of AI, I set out not only to understand Large Language Models but also to contribute to the field. Building and fine-tuning my own models let me engage deeply with the intricacies of AI, culminating in Ladybird-base-7B-v8. This project reflects my dedication to learning and my interest in pushing the boundaries of what AI models can achieve.

## Model Overview

Ladybird-base-7B-v8 is based on the Mistral architecture, which handles complex language understanding and generation tasks efficiently. The architecture includes several design choices that improve performance:

- **Grouped-Query Attention**: Several query heads share each key/value head (32 query heads over 8 key/value heads, per this repository's `config.json`), which shrinks the key/value cache and speeds up inference while maintaining model quality.
- **Sliding-Window Attention**: Each token attends only to a fixed window of preceding tokens (`sliding_window` is 4096 in `config.json`), which keeps attention cost manageable on long inputs; stacked layers let information propagate beyond a single window.
- **Byte-fallback BPE Tokenizer**: Combines Byte-Pair Encoding (BPE) with a fallback that splits characters missing from the vocabulary into raw bytes, so any input text can still be encoded.

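The first two design choices can be made concrete with a small sketch. This is an illustrative pure-Python toy, not the model's actual implementation: the helper names `kv_head_for_query_head` and `sliding_window_mask` are hypothetical, the head counts are taken from this repository's `config.json`, and the window convention (a token sees itself plus the preceding `window - 1` positions) is an assumption, since implementations can differ by one at that boundary.

```python
NUM_ATTENTION_HEADS = 32  # "num_attention_heads" in config.json
NUM_KEY_VALUE_HEADS = 8   # "num_key_value_heads" in config.json


def kv_head_for_query_head(q_head, num_q=NUM_ATTENTION_HEADS, num_kv=NUM_KEY_VALUE_HEADS):
    """Return the key/value head shared by a given query head (hypothetical helper)."""
    group_size = num_q // num_kv  # number of query heads per key/value head
    return q_head // group_size


def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True where position i may attend to position j.

    Assumed convention: causal (j <= i) and at most `window` positions wide.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Query heads 0-3 share key/value head 0, heads 4-7 share head 1, and so on.
print([kv_head_for_query_head(h) for h in range(8)])

# Tiny mask for illustration; the real window in config.json is 4096.
for row in sliding_window_mask(seq_len=6, window=3):
    print("".join("x" if allowed else "." for allowed in row))
```

With 32 query heads and 8 key/value heads, each key/value head serves a group of four query heads, so the key/value cache holds a quarter of the heads that full multi-head attention would need.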
## Instruction Format

To get the most out of the instruction fine-tuning of Ladybird-base-7B-v8, build prompts with the tokenizer's [ChatML](https://huggingface.co/docs/transformers/main/en/chat_templating)-style chat template rather than concatenating strings by hand. This keeps prompts in the format the model expects, which leads to more accurate and context-aware responses. Here's how to construct your prompts:

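```python
from transformers import pipeline

# Load the model as a text-generation pipeline
pipe = pipeline("text-generation", model="bobofrut/ladybird-base-7B-v8")

msg = [
    # The persona instruction belongs in the system turn, before the user question
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

# Render the messages into one prompt string that ends with an open assistant turn
prompt = pipe.tokenizer.apply_chat_template(msg, tokenize=False, add_generation_prompt=True)
```
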
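To see what `apply_chat_template` produces without downloading the model, the template stored in this repository's `tokenizer_config.json` can be mirrored in plain Python. This is only a sketch: `render_prompt` is a hypothetical helper that re-implements the Jinja template by hand, the example messages are made up, and the real tokenizer remains the source of truth.

```python
BOS_TOKEN = "<s>"   # "bos_token" in tokenizer_config.json
EOS_TOKEN = "</s>"  # "eos_token" in tokenizer_config.json


def render_prompt(messages, add_generation_prompt=True):
    """Hand-written mirror of the chat_template in tokenizer_config.json."""
    parts = []
    for message in messages:
        # Each turn: <s>{role}\n{content}</s>\n
        parts.append(BOS_TOKEN + message["role"] + "\n" + message["content"] + EOS_TOKEN + "\n")
    if add_generation_prompt:
        # Leave an assistant turn open for the model to complete
        parts.append(BOS_TOKEN + "assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Hello!"},
]
print(render_prompt(messages))
```

Note that this template delimits turns with `<s>` and `</s>` rather than the `<|im_start|>` and `<|im_end|>` markers of standard ChatML, which is one more reason to rely on the tokenizer's template instead of formatting prompts manually.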
## Eval results

|             Tasks             |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|-------------------------------|-------|----------------|------|-----------|-----:|---|-----:|
|winogrande                     |      1|none            |None  |acc        |0.8272|±  |0.0106|
|truthfulqa_mc2                 |      2|none            |0     |acc        |0.7736|±  |0.0139|
|truthfulqa_mc1                 |      2|none            |0     |acc        |0.6242|±  |0.0170|
|stem                           |N/A    |none            |None  |acc        |0.5109|±  |0.0085|
| - abstract_algebra            |      0|none            |None  |acc        |0.2900|±  |0.0456|
| - anatomy                     |      0|none            |None  |acc        |0.5852|±  |0.0426|
| - astronomy                   |      0|none            |None  |acc        |0.6908|±  |0.0376|
| - college_biology             |      0|none            |None  |acc        |0.6875|±  |0.0388|
| - college_chemistry           |      0|none            |None  |acc        |0.4000|±  |0.0492|
| - college_computer_science    |      0|none            |None  |acc        |0.5300|±  |0.0502|
| - college_mathematics         |      0|none            |None  |acc        |0.2600|±  |0.0441|
| - college_physics             |      0|none            |None  |acc        |0.4314|±  |0.0493|
| - computer_security           |      0|none            |None  |acc        |0.7100|±  |0.0456|
| - conceptual_physics          |      0|none            |None  |acc        |0.5702|±  |0.0324|
| - electrical_engineering      |      0|none            |None  |acc        |0.5586|±  |0.0414|
| - elementary_mathematics      |      0|none            |None  |acc        |0.4259|±  |0.0255|
| - high_school_biology         |      0|none            |None  |acc        |0.7710|±  |0.0239|
| - high_school_chemistry       |      0|none            |None  |acc        |0.4483|±  |0.0350|
| - high_school_computer_science|      0|none            |None  |acc        |0.7000|±  |0.0461|
| - high_school_mathematics     |      0|none            |None  |acc        |0.3259|±  |0.0286|
| - high_school_physics         |      0|none            |None  |acc        |0.3179|±  |0.0380|
| - high_school_statistics      |      0|none            |None  |acc        |0.4491|±  |0.0339|
| - machine_learning            |      0|none            |None  |acc        |0.5000|±  |0.0475|
|hellaswag                      |      1|none            |None  |acc        |0.7010|±  |0.0046|
|                               |       |none            |None  |acc_norm   |0.8763|±  |0.0033|
|gsm8k                          |      3|strict-match    |5     |exact_match|0.7650|±  |0.0117|
|                               |       |flexible-extract|5     |exact_match|0.7695|±  |0.0116|
|arc_challenge                  |      1|none            |None  |acc        |0.6749|±  |0.0137|
|                               |       |none            |None  |acc_norm   |0.6800|±  |0.0136|

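The `Stderr` column can be turned into an approximate confidence interval, which helps when comparing these scores with other models. The following is a minimal sketch that assumes a normal approximation (value ± 1.96 × stderr for roughly 95% coverage); `confidence_interval` is a hypothetical helper, and the numbers are copied from the table above.

```python
def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) under a normal approximation; z=1.96 is roughly 95%."""
    margin = z * stderr
    return (value - margin, value + margin)


# (value, stderr) pairs copied from the evaluation table
results = {
    "gsm8k (strict-match)": (0.7650, 0.0117),
    "winogrande": (0.8272, 0.0106),
    "truthfulqa_mc2": (0.7736, 0.0139),
}

for task, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task}: {value:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```

Two scores whose intervals overlap heavily should not be read as a meaningful difference; the small subtasks under `stem` have standard errors near 0.05, so their intervals are wide.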
### Contact

---
jackiewicz.raf@gmail.com

---

25
config.json
Normal file
@@ -0,0 +1,25 @@
{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.38.2",
  "use_cache": true,
  "vocab_size": 32000
}
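The values in `config.json` are enough to estimate the model's parameter count. The sketch below assumes the usual Mistral layer layout (bias-free query, key, value, and output projections; a gated MLP with three weight matrices; two RMSNorm weight vectors per layer plus a final norm); `count_parameters` is a hypothetical helper, not part of any library.

```python
# Values copied from config.json
config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "vocab_size": 32000,
}


def count_parameters(cfg):
    """Estimate the parameter count under the assumed Mistral layer layout."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # smaller than h because of grouped-query attention
    attention = 2 * h * h + 2 * h * kv_dim          # q_proj and o_proj, plus k_proj and v_proj
    mlp = 3 * h * cfg["intermediate_size"]          # gate, up and down projections
    norms = 2 * h                                   # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * cfg["vocab_size"] * h          # untied input embedding and output head
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embeddings + final_norm


total = count_parameters(config)
print(f"{total:,} parameters")
print(f"{total * 2:,} bytes at 2 bytes per bfloat16 weight")
```

Under these assumptions the estimate is 7,241,732,096 parameters, or 14,483,464,192 bytes in bfloat16. That is within about 34 KB of the combined size of the two `.safetensors` shards recorded in this commit (9,825,524,456 + 4,657,973,592 bytes), and the small remainder would be consistent with file headers.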
3
model-00001-of-00002.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4cd9bab6e57d62cf77fc6ac5a2c1a12320facf22aaee99aac0fbcf8844168b7d
size 9825524456
3
model-00002-of-00002.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5a077459594d21fce62b4a5c2b98c822a0c61d838da485ff45163918668a54bd
size 4657973592
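Each Git LFS pointer records the SHA-256 digest and byte size of the real weight shard, so a downloaded file can be checked locally. The following is a minimal sketch using only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the path passed to `matches_pointer` is wherever the shard was downloaded.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so multi-gigabyte shards do not need to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents for model-00002-of-00002.safetensors, copied from this commit
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:5a077459594d21fce62b4a5c2b98c822a0c61d838da485ff45163918668a54bd
size 4657973592
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model-00002-of-00002.safetensors", pointer)` on a local copy would then report whether the download is intact.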
1
model.safetensors.index.json
Normal file
File diff suppressed because one or more lines are too long
35
special_tokens_map.json
Normal file
@@ -0,0 +1,35 @@
{
  "additional_special_tokens": [
    "<unk>",
    "<s>",
    "</s>"
  ],
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
91122
tokenizer.json
Normal file
File diff suppressed because it is too large
49
tokenizer_config.json
Normal file
@@ -0,0 +1,49 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<unk>",
    "<s>",
    "</s>"
  ],
  "bos_token": "<s>",
  "chat_template": "{% for message in messages %}{{bos_token + message['role'] + '\n' + message['content'] + eos_token + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ bos_token + 'assistant\n' }}{% endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 8192,
  "pad_token": "</s>",
  "padding_side": "left",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "split_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": true
}
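The `padding_side` and `pad_token` settings matter for batched generation: prompts are padded on the left so that every sequence ends at the same position, which is where new tokens get appended. The following plain-Python sketch illustrates that behavior, using token id 2 for `</s>` as listed under `added_tokens_decoder`; `left_pad` is a hypothetical helper, and the other token ids are made up for illustration.

```python
PAD_TOKEN_ID = 2  # "</s>" doubles as the pad token in tokenizer_config.json


def left_pad(batch, pad_id=PAD_TOKEN_ID):
    """Left-pad lists of token ids to equal length and build attention masks."""
    width = max(len(ids) for ids in batch)
    input_ids, attention_mask = [], []
    for ids in batch:
        padding = width - len(ids)
        input_ids.append([pad_id] * padding + list(ids))
        # 0 marks padding positions the model should ignore, 1 marks real tokens
        attention_mask.append([0] * padding + [1] * len(ids))
    return input_ids, attention_mask


ids, mask = left_pad([[1, 100, 200], [1, 300]])
print(ids)
print(mask)
```

Because the pad token is the same as the end-of-sequence token, the attention mask is what tells the model which positions are padding, so it should always be passed along with the padded ids.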