Initialize project; model provided by the ModelHub XC community
Model: MayaPH/FinOPT-Lincoln Source: Original Platform
34
.gitattributes
vendored
Normal file
@@ -0,0 +1,34 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
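Every entry above has the same shape: a path pattern followed by attribute settings, where `filter=lfs diff=lfs merge=lfs` routes matching files through Git LFS and `-text` explicitly unsets text conversion. A minimal sketch of how such a line decomposes (the helper name is ours, purely illustrative):

```python
def parse_gitattributes_line(line: str):
    """Split a .gitattributes line into (pattern, {attribute: value}).

    `k=v` entries map key to value, a bare `-attr` means the attribute
    is explicitly unset (False), and a bare `attr` means set (True).
    """
    pattern, *attrs = line.split()
    parsed = {}
    for attr in attrs:
        if "=" in attr:
            key, _, value = attr.partition("=")
            parsed[key] = value
        elif attr.startswith("-"):
            parsed[attr[1:]] = False
        else:
            parsed[attr] = True
    return pattern, parsed

pattern, attrs = parse_gitattributes_line(
    "*.safetensors filter=lfs diff=lfs merge=lfs -text"
)
# pattern == "*.safetensors"; attrs["filter"] == "lfs"; attrs["text"] is False
```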
88
README.md
Normal file
@@ -0,0 +1,88 @@
---
license: cc-by-sa-4.0
pipeline_tag: text-generation
---

# 🤗 FinOPT-Lincoln

Released June 1, 2023

## Model Description

FinOPT-Lincoln is a language model based on the OPT-350M architecture, fine-tuned on a financial question-answering dataset. The model aims to provide accurate and informative responses to financial questions.

## FinOPT Series

The FinOPT series of language models comes in several model sizes. See the Hugging Face Hub [link](https://huggingface.co/models?search=mayaph/finopt) for the other FinOPT checkpoints.

| Model Name | Parameter Size |
|---------------------|----------------|
| [FinOPT-Franklin](https://huggingface.co/MayaPH/FinOPT-Franklin) | 1.3B |
| <b>FinOPT-Lincoln</b> | <b>350M</b> |
| [FinOPT-Washington](https://huggingface.co/MayaPH/FinOPT-Washington) | 125M |

## Intended Use

FinOPT-Lincoln is designed to help users obtain relevant and reliable information about financial topics. It can be used for question-answering tasks in the financial domain, including banking queries, investment advice, and general financial inquiries.

The model is intended for individuals seeking information about financial topics, as well as for developers and researchers working on natural language processing (NLP) tasks in the financial domain.

## Usage

To use FinOPT-Lincoln, you are required to provide attribution in accordance with the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. Please include the following attribution notice when utilizing FinOPT-Lincoln in your work:

```python
# This code uses FinOPT-Lincoln, a language model developed by MayaPH.
# The model is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
# For more information, visit: https://creativecommons.org/licenses/by-sa/4.0/

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MayaPH/FinOPT-Lincoln")
model = AutoModelForCausalLM.from_pretrained("MayaPH/FinOPT-Lincoln")
```

Please ensure that you include the attribution notice in your code or any other form of usage to comply with the license terms.

## Limitations and Caveats

While FinOPT-Lincoln has been fine-tuned on a financial question-answering dataset, note the following limitations and caveats:

1. **Domain-Specific Focus:** The model's training data consists primarily of financial questions and answers. It may not perform as well on questions outside the financial domain.

2. **Potential Bias:** The model may reflect biases present in the training data. Carefully evaluate and interpret the model's responses, particularly on sensitive topics such as investment advice or financial recommendations.

3. **Confidence and Verification:** The model generates responses based on patterns learned from the training data, but it has no inherent fact-checking capability. Verify the information it provides against reliable sources before making any financial decisions.

## Training Data

FinOPT-Lincoln was trained on a financial question-answering dataset consisting of questions and answers on various financial topics. The dataset was collected from online sources and financial forums, then manually curated.

## Ethical Considerations

When using FinOPT-Lincoln, keep the following ethical considerations in mind:

1. **Privacy and Security:** Avoid sharing sensitive personal or financial information while interacting with the model. The model has no privacy safeguards, so exercise caution when discussing personal or confidential matters.

2. **Fairness and Bias:** The model's responses may reflect biases present in the training data. Be aware of potential biases and evaluate responses critically and fairly.

3. **Transparency:** The model operates as a predictive text generator based on patterns learned from the training data. Its inner workings and the specific training data used are proprietary and not publicly available.

4. **User Responsibility:** Users should take responsibility for their own financial decisions and not rely solely on the information provided by the model. Consult financial professionals or reliable sources for specific financial advice or recommendations.

## Further Information

For additional information or inquiries about FinOPT-Lincoln, please contact the Maya Philippines iOps Team via jasper.catapang@maya.ph.

## Disclaimer

FinOPT-Lincoln is an AI language model trained by Maya Philippines. It is provided "as is" without warranty of any kind, express or implied. The model developers and Maya Philippines shall not be liable for any direct or indirect damages arising from the use of this model.

## Acknowledgments

The development of FinOPT-Lincoln was made possible by Maya Philippines and the curation and creation of the financial question-answering dataset.

## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MayaPH__FinOPT-Lincoln).

| Metric | Value |
|-----------------------|-------|
| Avg. | 25.2 |
| ARC (25-shot) | 26.71 |
| HellaSwag (10-shot) | 25.6 |
| MMLU (5-shot) | 23.0 |
| TruthfulQA (0-shot) | 50.59 |
| Winogrande (5-shot) | 49.72 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 0.76 |
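The reported average can be sanity-checked directly: it is the arithmetic mean of the seven benchmark scores in the table.

```python
# Benchmark scores copied from the leaderboard table above.
scores = {
    "ARC (25-shot)": 26.71,
    "HellaSwag (10-shot)": 25.6,
    "MMLU (5-shot)": 23.0,
    "TruthfulQA (0-shot)": 50.59,
    "Winogrande (5-shot)": 49.72,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 0.76,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 1))  # → 25.2, matching the "Avg." row
```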
31
config.json
Normal file
@@ -0,0 +1,31 @@
{
  "_name_or_path": "FinOPT-Lincoln",
  "_remove_final_layer_norm": false,
  "activation_dropout": 0.0,
  "activation_function": "relu",
  "architectures": [
    "OPTForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 2,
  "do_layer_norm_before": false,
  "dropout": 0.1,
  "enable_bias": true,
  "eos_token_id": 2,
  "ffn_dim": 4096,
  "hidden_size": 1024,
  "init_std": 0.02,
  "layer_norm_elementwise_affine": true,
  "layerdrop": 0.0,
  "max_position_embeddings": 2048,
  "model_type": "opt",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "prefix": "</s>",
  "torch_dtype": "float32",
  "transformers_version": "4.29.2",
  "use_cache": false,
  "vocab_size": 50272,
  "word_embed_proj_dim": 512
}
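The config above pins down the OPT-350M geometry, so the parameter count can be estimated from it. The breakdown below follows the standard OPT layout (tied embeddings, `project_in`/`project_out` between the 512-wide embedding and 1024-wide hidden stream, biased linear layers); treat it as a back-of-the-envelope estimate, not exact bookkeeping:

```python
# Values copied from config.json above.
vocab_size = 50272   # vocab_size
embed_dim = 512      # word_embed_proj_dim
hidden = 1024        # hidden_size
ffn = 4096           # ffn_dim
layers = 24          # num_hidden_layers
max_pos = 2048       # max_position_embeddings

token_embed = vocab_size * embed_dim        # input embedding, tied with the LM head
pos_embed = (max_pos + 2) * hidden          # OPT offsets position ids by 2
proj = 2 * embed_dim * hidden               # project_in + project_out
attn = 4 * (hidden * hidden + hidden)       # q, k, v, out projections with biases
mlp = hidden * ffn + ffn + ffn * hidden + hidden  # fc1 + fc2 with biases
norms = 2 * 2 * hidden                      # two LayerNorms (weight + bias) per layer
per_layer = attn + mlp + norms
total = token_embed + pos_embed + proj + layers * per_layer

print(f"{total / 1e6:.0f}M parameters")        # ≈ 331M
print(f"{total * 4 / 1e9:.2f} GB in float32")  # ≈ 1.32 GB, matching the checkpoint size
```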
7
generation_config.json
Normal file
@@ -0,0 +1,7 @@
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1,
  "transformers_version": "4.29.2"
}
50001
merges.txt
Normal file
File diff suppressed because it is too large
3
model.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7268012e3ae84bd01310ab14b698101b871f08a60283e7a75529ec1f27364416
size 1324830880
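The weights themselves live in Git LFS; the diff only records the three-line pointer file (version, content hash, byte size). The size field offers a quick consistency check: dividing the float32 byte count by 4 recovers roughly the parameter count (slightly over, since the safetensors format includes a small metadata header). A sketch parsing the pointer above:

```python
# The Git LFS pointer file shown in the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7268012e3ae84bd01310ab14b698101b871f08a60283e7a75529ec1f27364416
size 1324830880
"""

# Each pointer line is "key value"; split on the first space.
fields = dict(line.split(" ", 1) for line in pointer.splitlines())
size = int(fields["size"])
print(size // 4)  # → 331207720 float32 values, ≈ the model's ~331M parameters
```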
3
pytorch_model.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d99d032fd2ec9e7469b3c7ac63c813ee2e009e5ab838a969f8b45d2a507fed34
size 1324917277
30
special_tokens_map.json
Normal file
@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
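Note that `bos_token`, `eos_token`, and `unk_token` all share the same content, `</s>`, while only `pad_token` differs; this is the usual OPT tokenizer convention. A quick check over the map above (contents condensed to their `content` strings):

```python
# Special-token contents from special_tokens_map.json above.
special = {
    "bos_token": "</s>",
    "eos_token": "</s>",
    "pad_token": "<pad>",
    "unk_token": "</s>",
}
shared = sorted(name for name, tok in special.items() if tok == "</s>")
print(shared)  # → ['bos_token', 'eos_token', 'unk_token']
```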
100364
tokenizer.json
Normal file
File diff suppressed because it is too large
40
tokenizer_config.json
Normal file
@@ -0,0 +1,40 @@
{
  "add_bos_token": true,
  "add_prefix_space": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": true,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "errors": "replace",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": {
    "__type": "AddedToken",
    "content": "<pad>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
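The eye-catching `model_max_length` is not a real limit: it is the sentinel transformers stores when a tokenizer has no intrinsic length cap (`int(1e30)`); the effective context length comes from `max_position_embeddings` (2048) in config.json. The odd trailing digits are simply float64 rounding:

```python
# 10**30 is not exactly representable as a float64, so converting the
# float literal 1e30 back to an integer yields the nearest representable
# value — exactly the number stored in tokenizer_config.json above.
sentinel = int(1e30)
print(sentinel)  # → 1000000000000000019884624838656
```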
1
vocab.json
Normal file
File diff suppressed because one or more lines are too long