Initialize project; model provided by the ModelHub XC community
Model: alexandreteles/bonito-v1-gguf Source: Original Platform
42
.gitattributes
vendored
Normal file
@@ -0,0 +1,42 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
bonito-v1_q8_0.gguf filter=lfs diff=lfs merge=lfs -text
bonito-v1_f16.gguf filter=lfs diff=lfs merge=lfs -text
bonito-v1_q6_k.gguf filter=lfs diff=lfs merge=lfs -text
bonito-v1_q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
bonito-v1_q5_k_s.gguf filter=lfs diff=lfs merge=lfs -text
bonito-v1_q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
bonito-v1_iq4_nl.gguf filter=lfs diff=lfs merge=lfs -text
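Each `.gitattributes` line pairs a path pattern with a list of attributes; `filter=lfs diff=lfs merge=lfs -text` routes matching files through Git LFS and unsets the `text` attribute. As a minimal sketch, one such line can be split into its pattern and attribute map. The helper name `parse_gitattributes_line` is hypothetical, and the parsing is simplified: quoted patterns, macros, and the `!attr` form are not handled.

```python
def parse_gitattributes_line(line: str) -> tuple:
    """Split one .gitattributes line into (pattern, attributes).

    Simplified sketch: assumes whitespace-separated tokens, no quoting.
    """
    pattern, *tokens = line.split()
    attrs = {}
    for token in tokens:
        if token.startswith("-"):
            # "-text" unsets the attribute
            attrs[token[1:]] = False
        elif "=" in token:
            # "filter=lfs" assigns a string value
            key, value = token.split("=", 1)
            attrs[key] = value
        else:
            # a bare name sets the attribute
            attrs[token] = True
    return pattern, attrs


pattern, attrs = parse_gitattributes_line(
    "bonito-v1_q8_0.gguf filter=lfs diff=lfs merge=lfs -text"
)
print(pattern, attrs)
```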
98
README.md
Normal file
@@ -0,0 +1,98 @@
---
datasets:
- BatsResearch/ctga-v1
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
tags:
- data generation
license: apache-2.0
---

# Bonito-v1 GGUF

You can find the original model at [BatsResearch/bonito-v1](https://huggingface.co/BatsResearch/bonito-v1)

## Variations

| Name | Quant method | Bits | Size |
| ---- | ---- | ---- | ---- |
| [bonito-v1_iq4_nl.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_iq4_nl.gguf) | IQ4_NL | 4 | 4.16 GB |
| [bonito-v1_q4_k_m.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_q4_k_m.gguf) | Q4_K_M | 4 | 4.37 GB |
| [bonito-v1_q5_k_s.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_q5_k_s.gguf) | Q5_K_S | 5 | 5.00 GB |
| [bonito-v1_q5_k_m.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_q5_k_m.gguf) | Q5_K_M | 5 | 5.13 GB |
| [bonito-v1_q6_k.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_q6_k.gguf) | Q6_K | 6 | 5.94 GB |
| [bonito-v1_q8_0.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_q8_0.gguf) | Q8_0 | 8 | 7.70 GB |
| [bonito-v1_f16.gguf](https://huggingface.co/alexandreteles/bonito-v1-gguf/blob/main/bonito-v1_f16.gguf) | FP16 | 16 | 14.5 GB |

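The variants trade file size against quantization precision. As a rough sketch, the helper below picks the largest file that fits a given byte budget, using the exact sizes recorded in this commit's Git LFS pointer files. The name `pick_variant` is hypothetical, and the budget covers the file only; a runtime will likely need additional memory for context and buffers.

```python
from typing import Optional

# File sizes in bytes, copied from the Git LFS pointer files in this commit.
VARIANT_SIZES = {
    "bonito-v1_iq4_nl.gguf": 4155053696,
    "bonito-v1_q4_k_m.gguf": 4368438912,
    "bonito-v1_q5_k_s.gguf": 4997715584,
    "bonito-v1_q5_k_m.gguf": 5131409024,
    "bonito-v1_q6_k.gguf": 5942064768,
    "bonito-v1_q8_0.gguf": 7695857280,
    "bonito-v1_f16.gguf": 14484731456,
}


def pick_variant(budget_bytes: int) -> Optional[str]:
    """Return the largest variant whose file fits in budget_bytes, else None."""
    fitting = {name: size for name, size in VARIANT_SIZES.items()
               if size <= budget_bytes}
    return max(fitting, key=fitting.get) if fitting else None


print(pick_variant(6 * 10**9))  # bonito-v1_q6_k.gguf
```

The chosen file can then be fetched with, for example, `huggingface_hub.hf_hub_download(repo_id="alexandreteles/bonito-v1-gguf", filename=...)`.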
## Model Card for bonito

<!-- Provide a quick summary of what the model is/does. -->

Bonito is an open-source model for conditional task generation: the task of converting unannotated text into task-specific training datasets for instruction tuning.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

Bonito can be used to create synthetic instruction tuning datasets to adapt large language models to users' specialized, private data.
In our [paper](https://github.com/BatsResearch/bonito), we show that Bonito can be used to adapt both pretrained and instruction-tuned models to tasks without any annotations.

- **Developed by:** Nihal V. Nayak, Yiyang Nan, Avi Trost, and Stephen H. Bach
- **Model type:** MistralForCausalLM
- **Language(s) (NLP):** English
- **License:** TBD
- **Finetuned from model:** `mistralai/Mistral-7B-v0.1`

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** [https://github.com/BatsResearch/bonito](https://github.com/BatsResearch/bonito)
- **Paper:** Arxiv link

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
To easily generate synthetic instruction tuning datasets, we recommend using the [bonito](https://github.com/BatsResearch/bonito) package, which is built on the `transformers` and `vllm` libraries.

```python
from bonito import Bonito, SamplingParams
from datasets import load_dataset

# Initialize the Bonito model
bonito = Bonito()

# Load dataset with unannotated text
unannotated_text = load_dataset(
    "BatsResearch/bonito-experiment",
    "unannotated_contract_nli"
)["train"].select(range(10))

# Generate synthetic instruction tuning dataset
sampling_params = SamplingParams(max_tokens=256, top_p=0.95, temperature=0.5, n=1)
synthetic_dataset = bonito.generate_tasks(
    unannotated_text,
    context_col="input",
    task_type="nli",
    sampling_params=sampling_params
)
```

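The `bonito` package example targets the original weights. For the GGUF files in this repository, a llama.cpp-based runtime is the more likely route. The sketch below assumes `llama-cpp-python` is installed and a variant has been downloaded locally. The prompt layout (`<|tasktype|>`, `<|context|>`, `<|task|>`) is an assumption based on the upstream package and is not stated in this README, so check it against the upstream repository before relying on it. The function names are hypothetical.

```python
def build_prompt(task_type: str, context: str) -> str:
    """Assumed Bonito prompt layout; verify against the upstream repository."""
    return (
        "<|tasktype|>\n" + task_type.strip()
        + "\n<|context|>\n" + context.strip()
        + "\n<|task|>\n"
    )


def generate_task(model_path: str, task_type: str, context: str) -> str:
    """Run one generation with llama-cpp-python (imported lazily)."""
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=model_path, n_ctx=4096)
    # Sampling values mirror the SamplingParams used in the README example.
    out = llm(
        build_prompt(task_type, context),
        max_tokens=256,
        top_p=0.95,
        temperature=0.5,
    )
    return out["choices"][0]["text"]


if __name__ == "__main__":
    print(build_prompt("natural language inference", "Some contract text."))
```

Calling `generate_task("bonito-v1_q4_k_m.gguf", "natural language inference", text)` would then return the raw generated task text for one passage.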
### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

Our model is trained to generate the following task types: summarization, sentiment analysis, multiple-choice question answering, extractive question answering, topic classification, natural language inference, question generation, text generation, question answering without choices, paraphrase identification, sentence completion, yes-no question answering, word sense disambiguation, paraphrase generation, textual entailment, and coreference resolution.
The model might not produce accurate synthetic tasks beyond these task types.
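Because accuracy is only expected for the listed task types, it can help to reject other requests before generating. The sketch below encodes the sixteen names from the list above; `is_supported_task` is a hypothetical helper. The `bonito` package may expect short identifiers instead (the README example passes `task_type="nli"`), and that mapping is not given here.

```python
# Task types copied from the Out-of-Scope Use section of the README.
SUPPORTED_TASK_TYPES = frozenset({
    "summarization",
    "sentiment analysis",
    "multiple-choice question answering",
    "extractive question answering",
    "topic classification",
    "natural language inference",
    "question generation",
    "text generation",
    "question answering without choices",
    "paraphrase identification",
    "sentence completion",
    "yes-no question answering",
    "word sense disambiguation",
    "paraphrase generation",
    "textual entailment",
    "coreference resolution",
})


def is_supported_task(task_type: str) -> bool:
    """Case-insensitive membership check against the documented task types."""
    return task_type.strip().lower() in SUPPORTED_TASK_TYPES


print(is_supported_task("Natural Language Inference"))  # True
print(is_supported_task("machine translation"))         # False
```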
5
added_tokens.json
Normal file
@@ -0,0 +1,5 @@
{
  "</s>": 2,
  "<s>": 1,
  "<unk>": 0
}
3
bonito-v1_f16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b7fee70d4f7a71e4ad20350996af5fbece4ce8b45883c4f0478174c44bc56e6c
size 14484731456
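The `.gguf` entries in this commit are Git LFS pointer files rather than model weights: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading a pointer and checking a downloaded file against it follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and only these three keys are handled.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the three-line Git LFS pointer format into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream a file and compare its size and SHA-256 with the pointer."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and digest.hexdigest() == pointer["oid"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b7fee70d4f7a71e4ad20350996af5fbece4ce8b45883c4f0478174c44bc56e6c\n"
    "size 14484731456\n"
)
print(round(pointer["size"] / 1e9, 2))  # 14.48
```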
3
bonito-v1_iq4_nl.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:664d99dfb50803ae680a2bf8f853a35f4d6b659954a88298b099d1eb702ea77a
size 4155053696
3
bonito-v1_q4_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:40cbacae0bddf81c8ab51245028a82b3ba58835c1a9feee82e4db87677439695
size 4368438912
3
bonito-v1_q5_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7f14c326226fa3ab5c9b0a86771aca28096892cb9ac8b775387e749c20002df5
size 5131409024
3
bonito-v1_q5_k_s.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:76230d2604423c50f15e0268e9a1dea6f8215bec3533c86ffb1c12c091f231e6
size 4997715584
3
bonito-v1_q6_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ed7c2e90e9be02bbeffebea0d83e59cce509d7c3cd0fe0cd72ed54b57e1dc9f8
size 5942064768
3
bonito-v1_q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d457be242e703ede4debcdb42c0efa8dcef59931ac6bb34aed2a329ecb8d1bc5
size 7695857280
25
config.json
Normal file
@@ -0,0 +1,25 @@
{
  "_name_or_path": "mistralai/Mistral-7B-v0.1",
  "architectures": [
    "MistralForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 32000
}
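The dimensions in `config.json` determine the parameter count, which in turn explains the size of the FP16 file. The sketch below assumes the usual Mistral/Llama decoder layout: bias-free attention and MLP projections, grouped-query attention, and two RMSNorm vectors per layer. That layout is an assumption, not something this config states. `count_parameters` is a hypothetical helper.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 8,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_parameters(cfg: dict) -> int:
    """Count weights for an assumed bias-free Mistral-style decoder."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # grouped-query K/V width
    attn = 2 * h * h + 2 * h * kv_dim               # q_proj, o_proj, k_proj, v_proj
    mlp = 3 * h * cfg["intermediate_size"]          # gate, up, and down projections
    norms = 2 * h                                   # two RMSNorm vectors per layer
    per_layer = attn + mlp + norms
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embed
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embed + lm_head + final_norm


n_params = count_parameters(config)
print(n_params)      # 7241732096
print(2 * n_params)  # 14483464192 bytes at 16 bits per weight
```

At two bytes per weight this comes to about 14.48 GB, close to the 14484731456-byte `bonito-v1_f16.gguf`; the small remainder is plausibly file metadata and tensors stored at higher precision.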
6
generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "transformers_version": "4.34.0"
}
10
special_tokens_map.json
Normal file
@@ -0,0 +1,10 @@
{
  "additional_special_tokens": [
    "<unk>",
    "<s>",
    "</s>"
  ],
  "bos_token": "<s>",
  "eos_token": "</s>",
  "unk_token": "<unk>"
}
91122
tokenizer.json
Normal file
File diff suppressed because it is too large
BIN
tokenizer.model
(Stored with Git LFS)
Normal file
Binary file not shown.
44
tokenizer_config.json
Normal file
@@ -0,0 +1,44 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [
    "<unk>",
    "<s>",
    "</s>"
  ],
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": true
}
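The `model_max_length` value in `tokenizer_config.json` equals `int(1e30)`, which appears to act as a "no limit configured" placeholder rather than a real context length. A usable limit therefore has to come from elsewhere, for example `max_position_embeddings` in `config.json`. The sketch below shows one way to resolve it; `effective_max_length` is a hypothetical helper, and treating the placeholder this way is an assumption.

```python
# Placeholder written when no tokenizer length limit is configured (assumption).
SENTINEL = int(1e30)

# Values copied from tokenizer_config.json and config.json in this commit.
tokenizer_config = {
    "model_max_length": 1000000000000000019884624838656,
    "pad_token": None,
}
model_config = {"max_position_embeddings": 32768}


def effective_max_length(tok_cfg: dict, model_cfg: dict) -> int:
    """Fall back to the model's position limit when the tokenizer has none."""
    limit = tok_cfg["model_max_length"]
    if limit >= SENTINEL:
        return model_cfg["max_position_embeddings"]
    return min(limit, model_cfg["max_position_embeddings"])


print(tokenizer_config["model_max_length"] == SENTINEL)       # True
print(effective_max_length(tokenizer_config, model_config))   # 32768
```

Note also that `pad_token` is `null`, so batched tokenization with padding would need a pad token to be chosen first.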