Initialize project; model provided by the ModelHub XC community

Model: tarun7r/Finance-Llama-8B
Source: Original Platform
ModelHub XC
2026-05-10 22:25:55 +08:00
commit 1053c5178f
13 changed files with 2855 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
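
As a rough illustration of what these rules do, the sketch below checks file names against a few of the patterns above using Python's `fnmatch`. It is only an approximation: Git's own attribute matching has additional rules (for example for `**` and directory patterns), and the helper name `is_lfs_tracked` and the chosen pattern subset are assumptions made for this example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A small subset of the patterns listed in .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.gz", "*tfevents*", "tokenizer.json"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: match the file's base name against each pattern."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)


for candidate in [
    "pytorch_model-00001-of-00004.safetensors",
    "tokenizer.json",
    "config.json",
]:
    print(candidate, is_lfs_tracked(candidate))
```

Under these patterns the weight shards and `tokenizer.json` would be stored through Git LFS, while small text files such as `config.json` stay as regular Git blobs.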

324
README.md Normal file

@@ -0,0 +1,324 @@
---
license: apache-2.0
tags:
- text-generation-inference
- finance
- economics
datasets:
- Josephgflowers/Finance-Instruct-500k
language:
- en
base_model:
- unsloth/Meta-Llama-3.1-8B
pipeline_tag: text-generation
library_name: transformers
---
# Model Card for Finance-Llama-8B
This model is a fine-tuned version of `unsloth/Meta-Llama-3.1-8B` on the `Josephgflowers/Finance-Instruct-500k` dataset. It's designed for financial tasks, reasoning, and multi-turn conversations.
## Key Features
* **Extensive Coverage:** Trained on over 500,000 entries spanning financial QA, reasoning, sentiment analysis, topic classification, multilingual NER, and conversational AI.📚
* **Multi-Turn Conversations:** Capable of rich dialogues emphasizing contextual understanding and reasoning.
* **Diverse Data Sources:** Includes entries from Cinder, Sujet-Finance-Instruct-177k, Phinance Dataset, BAAI/IndustryInstruction_Finance-Economics, Josephgflowers/Financial-NER-NLP, and many other high-quality datasets.
* **Financial Specialization:** Tailored for financial reasoning, question answering, entity recognition, sentiment analysis, and more.
## Dataset Details 💾
### Finance-Instruct-500k Dataset
**Overview**
Finance-Instruct-500k is a comprehensive and meticulously curated dataset designed to train advanced language models for financial tasks, reasoning, and multi-turn conversations. Combining data from numerous high-quality financial datasets, this corpus provides over 500,000 entries, offering unparalleled depth and versatility for finance-related instruction tuning and fine-tuning.
The dataset includes content tailored for financial reasoning, question answering, entity recognition, sentiment analysis, address parsing, and multilingual natural language processing (NLP). Its diverse and deduplicated entries make it suitable for a wide range of financial AI applications, including domain-specific assistants, conversational agents, and information extraction systems.
**Key Features of the Dataset**
* **Extensive Coverage:** Over 500,000 entries spanning financial QA, reasoning, sentiment analysis, topic classification, multilingual NER, and conversational AI.🌍
* **Multi-Turn Conversations:** Rich dialogues emphasizing contextual understanding and reasoning.🗣️
* **Diverse Data Sources:** Includes entries from Cinder, Sujet-Finance-Instruct-177k, Phinance Dataset, BAAI/IndustryInstruction_Finance-Economics, Josephgflowers/Financial-NER-NLP, and many other high-quality datasets. 📖
## CFA Level 1 Mock Exam Results
The CFA (Chartered Financial Analyst) exam is widely recognized as one of the most challenging professional certifications in the financial industry, typically requiring over 1000 hours of study across all three levels. The evaluation concept for the CFA Level 1 mock exam was inspired by the work on [mukaj/Llama-3.1-Hawkish-8B](https://huggingface.co/mukaj/Llama-3.1-Hawkish-8B). The table below compares Finance-Llama-8B with other models on a sample Level 1 CFA mock exam. The same prompt was used for all models. The results are approximate and were checked across multiple mock exam papers for consistency.
<table>
<thead>
<tr>
<th>CFA Level 1</th>
<th>GPT-4o-mini (%)</th>
<th>Finance-Llama-8B (%)</th>
<th>Meta-Llama Instruct 8B (%)</th>
<th>Meta-Llama Instruct 70B (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Ethical and Professional Standards</td>
<td>80</td>
<td>76</td>
<td>56</td>
<td>68</td>
</tr>
<tr>
<td>Quantitative Methods</td>
<td>74</td>
<td>73</td>
<td>64</td>
<td>85</td>
</tr>
<tr>
<td>Economics</td>
<td>69</td>
<td>74</td>
<td>59</td>
<td>59</td>
</tr>
<tr>
<td>Financial Reporting</td>
<td>81</td>
<td>77</td>
<td>67</td>
<td>71</td>
</tr>
<tr>
<td>Corporate Finance</td>
<td>82</td>
<td>71</td>
<td>51</td>
<td>80</td>
</tr>
<tr>
<td>Equity Investments</td>
<td>53</td>
<td>67</td>
<td>43</td>
<td>66</td>
</tr>
<tr>
<td>Fixed Income</td>
<td>80</td>
<td>72</td>
<td>29</td>
<td>51</td>
</tr>
<tr>
<td>Derivatives</td>
<td>54</td>
<td>72</td>
<td>34</td>
<td>35</td>
</tr>
<tr>
<td>Alternative Investments</td>
<td>100</td>
<td>89</td>
<td>74</td>
<td>100</td>
</tr>
<tr>
<td>Portfolio Management</td>
<td>85</td>
<td>75</td>
<td>52</td>
<td>100</td>
</tr>
<tr>
<td><b>Weighted Average</b></td>
<td><b>75</b></td>
<td><b>73</b></td>
<td>53</td>
<td><b>70</b></td>
</tr>
<tr>
<td><b>Result</b></td>
<td><b>PASS</b></td>
<td><b>PASS</b></td>
<td><b>FAIL</b></td>
<td><b>PASS</b></td>
</tr>
</tbody>
</table>
The mock exams vary in difficulty, reflecting the rigorous nature of the CFA Level 1 exam. Passing scores for these mock exams typically range from 64% to 72%, with an average of around 67%. This average is higher than the 12-year average Minimum Passing Score (MPS) of 65% for the CFA exams, so the mock pass mark is, if anything, slightly stricter than the historical threshold. For more detailed insights, visit [300hours.com/cfa-passing-score](https://300hours.com/cfa-passing-score/).
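
To make the "Weighted Average" row concrete, here is a minimal sketch of how such an average is computed from the per-topic scores. The topic weights used for the table are not stated in this model card, so the example assumes equal weights purely for illustration; the function name `weighted_average` is likewise hypothetical.

```python
def weighted_average(scores: dict, weights: dict) -> float:
    """Weighted mean of per-topic scores; weights need not sum to 1."""
    total_weight = sum(weights[topic] for topic in scores)
    return sum(scores[topic] * weights[topic] for topic in scores) / total_weight


# Finance-Llama-8B column from the table above.
finance_llama_scores = {
    "Ethical and Professional Standards": 76,
    "Quantitative Methods": 73,
    "Economics": 74,
    "Financial Reporting": 77,
    "Corporate Finance": 71,
    "Equity Investments": 67,
    "Fixed Income": 72,
    "Derivatives": 72,
    "Alternative Investments": 89,
    "Portfolio Management": 75,
}

# ASSUMPTION: equal weights, because the real topic weights are not given here.
equal_weights = {topic: 1.0 for topic in finance_llama_scores}

print(round(weighted_average(finance_llama_scores, equal_weights), 1))
```

With equal weights this gives 74.6 rather than the table's 73, presumably because the table applies exam topic weights that are not listed here.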
## Usage
### Ollama
This model can be run with Ollama. Pre-built GGUF versions (FP16 and Q4_K_M) are available at:
[ollama.com/martain7r/finance-llama-8b](https://ollama.com/martain7r/finance-llama-8b)
To run the FP16 version:
```bash
ollama run martain7r/finance-llama-8b:fp16
```
To run the Q4_K_M quantized version (smaller and faster, with a slight trade-off in quality):
```bash
ollama run martain7r/finance-llama-8b:q4_k_m
```
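Once a model has been pulled, it can also be queried programmatically. The sketch below builds a request for Ollama's local REST endpoint (`/api/generate` on port 11434, the default) using only the standard library. The helper names `build_request` and `generate` and the example prompt are assumptions for illustration, and the call only works while an Ollama server is running locally.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
DEFAULT_MODEL = "martain7r/finance-llama-8b:q4_k_m"


def build_request(prompt: str, model: str = DEFAULT_MODEL) -> urllib.request.Request:
    """Build a non-streaming generation request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = DEFAULT_MODEL) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as response:
        return json.loads(response.read())["response"]
```

For example, `generate("Explain the difference between a bond's coupon rate and its yield.")` would return the model's answer as a string.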
### Transformers Pipeline
This model can be used with the `transformers` library pipeline for text generation.
First, make sure you have the `transformers` and `torch` libraries installed, along with `accelerate`, which `transformers` needs for the `device_map` argument used below:
````bash
pip install transformers torch accelerate
````
````python
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
import torch

# Alternative memory-efficient loading options without bitsandbytes
model_id = "tarun7r/Finance-Llama-8B"
print("Loading model with memory optimizations...")

# Option 1: Use FP16 (half precision) - reduces memory by ~50%
try:
    print("Trying FP16 loading...")
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # Half precision
        device_map="auto",          # Automatic device placement
        low_cpu_mem_usage=True,     # Efficient CPU memory usage during loading
        trust_remote_code=True
    )
    print("✓ Model loaded with FP16")
except Exception as e:
    print(f"FP16 loading failed: {e}")
    # Option 2: CPU offloading - some layers on GPU, some on CPU
    try:
        print("Trying CPU offloading...")
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="balanced",  # Balance between GPU and CPU
            low_cpu_mem_usage=True,
            trust_remote_code=True
        )
        print("✓ Model loaded with CPU offloading")
    except Exception as e:
        print(f"CPU offloading failed: {e}")
        # Option 3: Full CPU loading as fallback
        print("Loading on CPU...")
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="cpu",
            low_cpu_mem_usage=True,
            trust_remote_code=True
        )
        print("✓ Model loaded on CPU")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Create pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)
print("✓ Pipeline created successfully!")

# Your existing prompt code
finance_prompt_template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
"""

# Update the system prompt to provide a more detailed description of the chatbot's role
messages = [
    {"role": "system", "content": "You are a highly knowledgeable finance chatbot. Your purpose is to provide accurate, insightful, and actionable financial advice to users, tailored to their specific needs and contexts."},
    {"role": "user", "content": "What strategies can an individual investor use to diversify their portfolio effectively in a volatile market?"},
]

# Update the generator call to use the messages
prompt = "\n".join([f"{msg['role'].capitalize()}: {msg['content']}" for msg in messages])

print("\n--- Generating Response ---")
try:
    outputs = generator(
        prompt,
        #max_new_tokens=250,  # Reduced for memory efficiency
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
        # Memory efficient generation settings
        num_beams=1,  # No beam search to save memory
        early_stopping=True,
        use_cache=True
    )
    # Extract response
    generated_text = outputs[0]['generated_text']
    response_start = generated_text.rfind("### Response:")
    if response_start != -1:
        response = generated_text[response_start + len("### Response:"):].strip()
        print("\n--- Response ---")
        print(response)
    else:
        print(generated_text)
    # Clean up GPU memory after generation
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
except Exception as e:
    print(f"Generation error: {e}")
````
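The script above defines `finance_prompt_template` but builds its prompt by joining the chat messages, so the `### Response:` marker it later searches for appears only if the model happens to emit it. A minimal sketch of the alternative is shown here, filling the instruction-style template directly. The model card does not state which prompt format was used during fine-tuning, so treat the template choice as an assumption; `build_prompt` and `extract_response` are hypothetical helper names.

```python
FINANCE_PROMPT_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
"""


def build_prompt(instruction: str, context: str = "") -> str:
    """Fill the instruction and (optional) input slots of the template."""
    return FINANCE_PROMPT_TEMPLATE.format(instruction, context)


def extract_response(generated_text: str) -> str:
    """Return the text after the last '### Response:' marker, if present."""
    marker = "### Response:"
    start = generated_text.rfind(marker)
    if start == -1:
        return generated_text.strip()
    return generated_text[start + len(marker):].strip()


prompt = build_prompt(
    "What strategies can an individual investor use to diversify their portfolio?",
    "The investor is concerned about a volatile market.",
)
print(prompt)
```

The resulting `prompt` can be passed to `generator(...)` in place of the joined messages, and `extract_response` applied to `outputs[0]["generated_text"]`.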
**Citation 📌**
````
@misc{tarun7r/Finance-Llama-8B,
author = {tarun7r},
title = {tarun7r/Finance-Llama-8B: A Llama 3.1 8B Model Fine-tuned on Josephgflowers/Finance-Instruct-500k},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {\url{https://huggingface.co/tarun7r/Finance-Llama-8B}}
}
````
## Disclaimer & Intended Uses
### Model & License
This model is an experimental research implementation based on Meta's Llama 3.1 architecture and is governed by the Llama 3.1 Community License terms, with additional restrictions as outlined below. It is designed for academic and research purposes to explore the influence of financial data in training language models. Users are advised that this model is experimental and should be used at their own risk, with full responsibility for any implementation or application. **This model is not a financial advisor and should not be used for financial decision-making.**
### Liability & Responsibility
The creators of this model:
- Accept no responsibility for any use of the model, including any financial losses or damages incurred.
- Provide no warranties or guarantees regarding its performance, accuracy, or reliability.
- Make no claims about the suitability of the model for any specific purpose.
### Intellectual Property & Attribution
- All findings and opinions expressed are solely those of the authors.
- This model is not endorsed by or affiliated with Meta, the CFA Institute, or any other institutions.
- All trademarks and intellectual property rights belong to their respective owners.

38
config.json Normal file

@@ -0,0 +1,38 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128001,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pad_token_id": 128004,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 8.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.51.3",
"unsloth_fixed": true,
"unsloth_version": "2025.5.7",
"use_cache": true,
"vocab_size": 128256
}
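
The values in this config are enough to estimate the model's parameter count. The sketch below assumes the standard Llama layout (bias-free attention and MLP projections, two RMSNorm weights per layer, one final norm, and an untied output head, consistent with `attention_bias`, `mlp_bias`, and `tie_word_embeddings` being `false`); the function name `count_parameters` is hypothetical.

```python
# Relevant fields copied from config.json above.
config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 128256,
    "tie_word_embeddings": False,
}


def count_parameters(cfg: dict) -> int:
    """Estimate the parameter count of a bias-free Llama-style decoder."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    attention = h * q_dim + 2 * h * kv_dim + q_dim * h  # q, k, v, o projections
    mlp = 3 * h * cfg["intermediate_size"]              # gate, up, down projections
    norms = 2 * h                                       # input + post-attention norms
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * h
    final_norm = h
    return embeddings + lm_head + cfg["num_hidden_layers"] * per_layer + final_norm


total = count_parameters(config)
print(f"parameters: {total:,}")
print(f"bytes at float16: {total * 2:,}")
```

This yields 8,030,261,248 parameters, or 16,060,522,496 bytes at two bytes per weight, which equals the `total_size` recorded in the weight index metadata further down in this commit.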

11
generation_config.json Normal file

@@ -0,0 +1,11 @@
{
"_from_model_config": true,
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": 128001,
"max_length": 131072,
"pad_token_id": 128004,
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.51.3"
}
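
To illustrate what the sampling defaults above (`do_sample`, `temperature` of 0.6, `top_p` of 0.9) mean, here is a minimal plain-Python sketch of temperature scaling followed by nucleus (top-p) filtering on a toy set of logits. It is not the `transformers` implementation, whose details differ; the function name `nucleus_probabilities` and the example logits are assumptions for illustration.

```python
import math


def nucleus_probabilities(logits, temperature=0.6, top_p=0.9):
    """Return {token_index: probability} after temperature and top-p filtering."""
    # Temperature below 1 sharpens the distribution before the softmax.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [value / total for value in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens; sampling would draw from these.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


print(nucleus_probabilities([2.0, 1.0, 0.0, -5.0]))
```

For these toy logits only the two most likely tokens survive the filter, so sampling would never pick the low-probability tail.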


@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 16060522496
},
"weight_map": {
"model.embed_tokens.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00004.safetensors",
"model.layers.10.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.10.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.11.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.12.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.13.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.14.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.15.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.16.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.17.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.18.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.19.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.20.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.20.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.20.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.input_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.mlp.down_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.mlp.gate_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.mlp.up_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.post_attention_layernorm.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.self_attn.k_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.self_attn.o_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "pytorch_model-00002-of-00004.safetensors",
"model.layers.20.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.20.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.20.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.20.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.21.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.22.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.23.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.24.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.25.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.26.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.27.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.28.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.29.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.input_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.mlp.down_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.post_attention_layernorm.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.30.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.31.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.31.mlp.up_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.31.self_attn.o_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"model.layers.31.self_attn.v_proj.weight": "pytorch_model-00003-of-00004.safetensors",
"lm_head.weight": "pytorch_model-00004-of-00004.safetensors",
"model.layers.31.input_layernorm.weight": "pytorch_model-00004-of-00004.safetensors",
"model.layers.31.mlp.down_proj.weight": "pytorch_model-00004-of-00004.safetensors",
"model.layers.31.post_attention_layernorm.weight": "pytorch_model-00004-of-00004.safetensors",
"model.norm.weight": "pytorch_model-00004-of-00004.safetensors"
}
}
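The weight map above assigns every tensor name to one shard file, and a layer can straddle two shards (layer 31 has entries in both the third and fourth files). A minimal sketch of grouping tensor names by shard follows; the helper name `tensors_per_shard` and the four-entry sample are illustrative assumptions, not part of this repository.

```python
from collections import defaultdict


def tensors_per_shard(weight_map):
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


# A few entries copied from the weight map above (not the full map).
sample_weight_map = {
    "model.layers.31.mlp.gate_proj.weight": "pytorch_model-00003-of-00004.safetensors",
    "model.layers.31.mlp.down_proj.weight": "pytorch_model-00004-of-00004.safetensors",
    "lm_head.weight": "pytorch_model-00004-of-00004.safetensors",
    "model.norm.weight": "pytorch_model-00004-of-00004.safetensors",
}

grouped = tensors_per_shard(sample_weight_map)
for shard_file in sorted(grouped):
    print(shard_file, len(grouped[shard_file]))
```

To inspect the real index, the same function could be given the `weight_map` object loaded from the index JSON with `json.load`.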

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dd958bceca742b63988e37e4a5d3d1a021ea45836ba893ae1ca1108942d31422
size 4976698592

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:45f4aeb19a1b9bc93161a0c887e5793c7528d29d5744fd6e7a3bdbdf8a6b1692
size 4999802616

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:89cb5929bf924cd3936409a97a0e1071eace8b4976a7355cfe821068eea8d966
size 4915916080

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b665ba94a043aaa20672dd38661974ae23af59db44e4c306fd0e0fce8d44557c
size 1168138808
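Each of the four hunks above is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the object size in bytes. The sketch below parses one pointer and sums the four sizes shown above; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS API.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# The fourth pointer from the diff above, copied verbatim.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:b665ba94a043aaa20672dd38661974ae23af59db44e4c306fd0e0fce8d44557c
size 1168138808
"""
pointer = parse_lfs_pointer(pointer_text)

# Sizes of the four pointers above, in the order they appear.
shard_sizes = [4976698592, 4999802616, 4915916080, 1168138808]
total_bytes = sum(shard_sizes)

print(pointer["hash_algo"], pointer["size"])
print(total_bytes)  # 16060556096 bytes, roughly 16.06 GB
```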

44
safe.py Normal file
@@ -0,0 +1,44 @@
import json
import os

from safetensors import safe_open


def generate_safetensors_index(model_path="."):
    """Generate model.safetensors.index.json from existing safetensors files."""
    # Load the existing bin index as a reference for the metadata
    with open(os.path.join(model_path, "pytorch_model.bin.index.json"), "r") as f:
        bin_index = json.load(f)

    # Initialize the safetensors index structure
    safetensors_index = {
        "metadata": bin_index.get("metadata", {}),
        "weight_map": {},
    }

    # Shard files to scan for tensor names
    safetensors_files = [
        "pytorch_model-00001-of-00004.safetensors",
        "pytorch_model-00002-of-00004.safetensors",
        "pytorch_model-00003-of-00004.safetensors",
        "pytorch_model-00004-of-00004.safetensors",
    ]

    for safetensor_file in safetensors_files:
        try:
            # Only the header is read here, so tensor data is not loaded into memory
            with safe_open(os.path.join(model_path, safetensor_file), framework="pt") as f:
                for tensor_name in f.keys():
                    safetensors_index["weight_map"][tensor_name] = safetensor_file
            print(f"✓ Processed {safetensor_file}")
        except Exception as e:
            print(f"✗ Error processing {safetensor_file}: {e}")

    # Save the index file next to the shards
    with open(os.path.join(model_path, "model.safetensors.index.json"), "w") as f:
        json.dump(safetensors_index, f, indent=2)

    print(f"✓ Generated model.safetensors.index.json with {len(safetensors_index['weight_map'])} tensors")
    return safetensors_index


# Run the function
if __name__ == "__main__":
    # Change this path to your model directory if needed
    generate_safetensors_index("./Finance-Llama-8B")
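Because the script catches per-shard errors and keeps going, a failed shard would leave its tensors out of the generated index without stopping the run. One possible sanity check, sketched here with toy data, is to compare the tensor names in the generated weight map against those in the bin index. `diff_weight_maps` is a hypothetical helper, and the sketch assumes both indexes name their tensors identically.

```python
def diff_weight_maps(bin_weight_map, safetensors_weight_map):
    """Return (missing, unexpected) tensor names relative to the bin index."""
    bin_names = set(bin_weight_map)
    st_names = set(safetensors_weight_map)
    missing = sorted(bin_names - st_names)      # in the bin index only
    unexpected = sorted(st_names - bin_names)   # in the safetensors index only
    return missing, unexpected


# Toy maps for illustration; real ones would come from the two index JSON files.
toy_bin_map = {
    "lm_head.weight": "pytorch_model-00004-of-00004.bin",
    "model.norm.weight": "pytorch_model-00004-of-00004.bin",
}
toy_safetensors_map = {
    "lm_head.weight": "pytorch_model-00004-of-00004.safetensors",
}

missing, unexpected = diff_weight_maps(toy_bin_map, toy_safetensors_map)
print("missing:", missing)
print("unexpected:", unexpected)
```

Two empty lists would indicate that both indexes cover the same set of tensor names.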

23
special_tokens_map.json Normal file
@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|finetune_right_pad_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
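The map above defines a dedicated padding token, `<|finetune_right_pad_id|>`, separate from the end-of-text token. A minimal sketch of pulling the token strings out of a map like this follows; `token_contents` is a hypothetical helper, and the inline JSON keeps only the `content` fields from the file above.

```python
import json


def token_contents(special_tokens_map):
    """Map each special-token role to its string, for dict or plain-string entries."""
    contents = {}
    for role, entry in special_tokens_map.items():
        contents[role] = entry["content"] if isinstance(entry, dict) else entry
    return contents


# Trimmed copy of the file above: only the "content" field of each entry.
special_tokens_json = """
{
  "bos_token": {"content": "<|begin_of_text|>"},
  "eos_token": {"content": "<|end_of_text|>"},
  "pad_token": {"content": "<|finetune_right_pad_id|>"}
}
"""

tokens = token_contents(json.loads(special_tokens_json))
for role, content in tokens.items():
    print(role, content)
```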

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

2066
tokenizer_config.json Normal file

File diff suppressed because it is too large