Initialize project; model provided by the ModelHub XC community

Model: FINGU-AI/FinguAI-Chat-v1 Source: Original Platform
2026-05-08 14:26:09 +08:00
commit cc1e1cedd5
10 changed files with 303164 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,35 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,87 @@
 ---
 license: apache-2.0
 language:
 - en
 - ko
 - ja
 library_name: transformers
 tags:
 - finance
 ---
 ## FINGU-AI/FinguAI-Chat-v1
 ### Overview
 The FINGU-AI/FinguAI-Chat-v1 model offers a specialized curriculum tailored to English, Korean, and Japanese speakers interested in finance, investment, and legal frameworks. It aims to enhance language proficiency while providing insights into global financial markets and regulatory landscapes.
 ### Key Features
 - **Global Perspective**: Explores diverse financial markets and regulations across English, Korean, and Japanese contexts.
 - **Language Proficiency**: Enhances language skills in English, Korean, and Japanese for effective communication in finance and legal domains.
 - **Career Advancement**: Equips learners with knowledge and skills for roles in investment banking, corporate finance, asset management, and regulatory compliance.
 ### Model Information
 - **Model Name**: FINGU-AI/FinguAI-Chat-v1
 - **Description**: A chat model trained on multiple languages, including English, Korean, and Japanese.
 - **Checkpoint**: FINGU-AI/FinguAI-Chat-v1
 - **Author**: Grinda AI Inc.
 - **License**: Apache-2.0
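For a rough sense of the model's size, the parameter count can be estimated from the values in the `config.json` shipped with this commit. The sketch below is an estimate under stated assumptions, not an official figure: it assumes the usual Qwen2 layer layout (biased query/key/value projections, an unbiased output projection, a gated MLP with three weight matrices, and two RMSNorm weights per layer). The helper name `estimate_parameters` is hypothetical.

```python
# Values copied from config.json in this commit.
config = {
    "hidden_size": 1024,
    "intermediate_size": 2816,
    "num_hidden_layers": 24,
    "vocab_size": 151936,
    "tie_word_embeddings": True,
}


def estimate_parameters(cfg):
    """Estimate the parameter count (hypothetical helper, assumed Qwen2 layout)."""
    h = cfg["hidden_size"]
    ffn = cfg["intermediate_size"]
    embed = cfg["vocab_size"] * h
    # Assumption: q/k/v are h x h with bias (num_key_value_heads equals
    # num_attention_heads in this config); the output projection has no bias.
    attention = 3 * (h * h + h) + h * h
    # Assumption: gated MLP with gate, up, and down projections, no bias.
    mlp = 3 * h * ffn
    # Assumption: two RMSNorm weight vectors per layer.
    norms = 2 * h
    total = embed + cfg["num_hidden_layers"] * (attention + mlp + norms)
    total += h  # final RMSNorm
    if not cfg["tie_word_embeddings"]:
        total += embed  # separate output head when embeddings are not tied
    return total


total_params = estimate_parameters(config)
bf16_bytes = total_params * 2  # bfloat16 stores two bytes per parameter
print(f"parameters: {total_params:,}")
print(f"bfloat16 size: {bf16_bytes:,} bytes")
```

Under these assumptions the estimate is about 464M parameters, or roughly 928 MB in bfloat16, which is close to the 928008104-byte `model.safetensors` listed in this commit. The small remainder is presumably file header metadata.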
 ### Training Details
 - **Fine-Tuning**: The model was fine-tuned from the base model Qwen/Qwen1.5-0.5B-Chat through supervised fine-tuning using the TRL library and Transformers.
 - **Dataset**: The fine-tuning dataset consisted of 9042 training samples, with 3000 samples each in Korean, English, and Japanese.
 ### How to Use
 To use the FINGU-AI/FinguAI-Chat-v1 model, load it with the Hugging Face Transformers library. The following Python snippet loads the model and streams a generated response:
 ```python
 #!pip install 'transformers>=4.39.0'
 #!pip install -U flash-attn
 #!pip install -q -U git+https://github.com/huggingface/accelerate.
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
 model_id = 'FINGU-AI/FinguAI-Chat-v1'
 # Load the weights in bfloat16 with FlashAttention-2 (needs a CUDA GPU and the flash-attn package).
 model = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16)
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 # Print tokens to stdout as they are generated.
 streamer = TextStreamer(tokenizer)
 model.to('cuda')
 messages = [
    {"role": "system", "content": "You are a finance specialist. Help the user and provide accurate information."},
    {"role": "user", "content": "What is the best approach to prevent losses?"},
 ]
 # Render the conversation with the tokenizer's chat template and append the assistant prompt.
 tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
 generation_params = {
    'max_new_tokens': 1000,
    'use_cache': True,
    'do_sample': True,
    'temperature': 0.7,
    'top_p': 0.9,
    'top_k': 50,
    'eos_token_id': tokenizer.eos_token_id,
 }
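 # Stream the response while generating; `outputs` holds the prompt plus the generated token ids.
 outputs = model.generate(tokenized_chat, **generation_params, streamer=streamer)
 decoded_outputs = tokenizer.batch_decode(outputs)
 # Sample output: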
 '''
 To avoid losses, it's essential to maintain discipline, set realistic goals, and adhere to predetermined rules for trading.
 Diversification is key as it spreads investments across different sectors and asset classes to reduce overall risk.
 Regularly reviewing and rebalancing positions can also ensure alignment with investment objectives. Additionally,
 staying informed about market trends and economic indicators can provide opportunities for long-term capital preservation.
 It's also important to stay patient and avoid emotional decision-making, as emotions often cloud judgment.
 If you encounter significant losses, consider using stop-loss orders to limit your losses.
 Staying disciplined and focusing on long-term objectives can help protect your investment portfolio from permanent damage.
 '''
 ```
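The `apply_chat_template` call above formats the conversation with the `chat_template` stored in `tokenizer_config.json`. As an illustration of what that template produces, here is a minimal pure-Python sketch of the same logic. The function name `render_chatml` is hypothetical and not part of Transformers, and the tokenizer's own template remains the authoritative version.

```python
DEFAULT_SYSTEM = "<|im_start|>system\nYou are a helpful assistant<|im_end|>\n"


def render_chatml(messages, add_generation_prompt=True):
    """Mirror the chat_template from tokenizer_config.json (illustrative only)."""
    parts = []
    # The template injects a default system turn when none is supplied.
    if messages and messages[0]["role"] != "system":
        parts.append(DEFAULT_SYSTEM)
    last = len(messages) - 1
    for i, message in enumerate(messages):
        parts.append("<|im_start|>" + message["role"] + "\n" + message["content"])
        # Every turn is closed, except the final one when no generation
        # prompt is requested.
        if i != last or add_generation_prompt:
            parts.append("<|im_end|>\n")
    # Open an assistant turn so the model continues from there.
    if add_generation_prompt and messages and messages[-1]["role"] != "assistant":
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([{"role": "user", "content": "What is diversification?"}])
print(prompt)
```

Because the example passes no system message, the rendered prompt starts with the default system turn, followed by the user turn and an open assistant turn.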
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,5 @@
 {
  "<|endoftext|>": 151643,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644
 }
--- a/config.json
+++ b/config.json
@@ -0,0 +1,28 @@
 {
  "_name_or_path": "Qwen/Qwen1.5-0.5B-Chat",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 2816,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_key_value_heads": 16,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.39.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
 }
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,12 @@
 {
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.1,
  "top_p": 0.8,
  "transformers_version": "4.39.0"
 }
--- a/merges.txt
+++ b/merges.txt
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:2bf48624d592e8206804d93524a78908d4262519c1a8bd4e1b753375b4ab0a2f
 size 928008104
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,14 @@
 {
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<|endoftext|>"
 }
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,43 @@
 {
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "bos_token": null,
  "chat_template": "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content']}}{% if (loop.last and add_generation_prompt) or not loop.last %}{{ '<|im_end|>' + '\n'}}{% endif %}{% endfor %}{% if add_generation_prompt and messages[-1]['role'] != 'assistant' %}{{ '<|im_start|>assistant\n' }}{% endif %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "model_max_length": 32768,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
 }
--- a/vocab.json
+++ b/vocab.json