Initialize project; model provided by the ModelHub XC community

Model: artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5
Source: Original Platform
2026-06-03 02:01:21 +08:00
commit d980854fb7
17 changed files with 494079 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,115 @@
+---
+tags:
+  - llama
+  - instruct
+  - finetune
+  - chatml
+  - gpt4
+  - synthetic data
+  - distillation
+model-index:
+  - name: Meta-Llama-3.1-8B-openhermes-2.5
+    results: []
+license: apache-2.0
+language:
+  - en
+library_name: transformers
+datasets:
+- teknium/OpenHermes-2.5
+---
+
+# Model Card for Meta-Llama-3.1-8B-openhermes-2.5
+
+This model is a fine-tuned version of Meta-Llama-3.1-8B on the OpenHermes-2.5 dataset.
+
+## Model Details
+
+### Model Description
+
+This model fine-tunes meta-llama/Meta-Llama-3.1-8B on the teknium/OpenHermes-2.5 dataset. It is designed for instruction following and general language tasks.
+
+- **Developed by:** artificialguybr
+- **Model type:** Causal Language Model
+- **Language(s):** English
+- **License:** apache-2.0
+- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B
+
+---
+### 🌐 Website
+You can find more of my models, projects, and information on my official website:
+- **[artificialguy.com](https://artificialguy.com/)**
+
+
+### 🚀 Prompt Hub
+Need high-quality prompts for image models and LLMs? Explore **[findgoodprompt.com](https://findgoodprompt.com)**.
+### 💖 Support My Work
+If you find this model useful, please consider supporting my work. It helps me cover server costs and dedicate more time to new open-source projects.
+- **Patreon:** [Support on Patreon](https://www.patreon.com/user?u=81570187)
+- **Ko-fi:** [Buy me a Ko-fi](https://ko-fi.com/artificialguybr)
+- **Buy Me a Coffee:** [Buy me a Coffee](https://buymeacoffee.com/jvkape)
+### Model Sources
+
+- **Repository:** https://huggingface.co/artificialguybr/Meta-Llama-3.1-8B-openhermes-2.5
+
+## Uses
+
+This model is suited to natural language processing tasks that involve instruction following and general language understanding.
+
+### Direct Use
+
+The model can be used for tasks such as text generation, question answering, and other language-related applications.
+
+### Out-of-Scope Use
+
+The model should not be used for generating harmful or biased content. Users should be aware of potential biases in the training data.
+
+## Training Details
+
+### Training Data
+
+The model was fine-tuned on the teknium/OpenHermes-2.5 dataset.
+
+### Training Procedure
+
+#### Training Hyperparameters
+
+- **Training regime:** BF16 mixed precision
+- **Optimizer:** AdamW
+- **Learning rate:** Started at 2.49316296439037e-06 (decaying)
+- **Batch size:** Not specified (gradient accumulation steps: 8)
+- **Training steps:** 13,368
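+- **Evaluation strategy:** Steps (interval logged as 0.16666666666666666; 🤗 Transformers interprets an evaluation interval below 1 as a fraction of total training steps, i.e. about once every sixth of the run)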
+- **Gradient checkpointing:** Enabled
+- **Weight decay:** 0
+
+#### Hardware and Software
+
+- **Hardware:** NVIDIA A100-SXM4-80GB (1 GPU)
+- **Software Framework:** 🤗 Transformers, Axolotl
+
+## Evaluation
+
+### Metrics
+
+- **Loss:** 0.6727465987205505 (evaluation)
+- **Perplexity:** Not provided
+
+### Results
+
+- **Evaluation runtime:** 2,676.4173 seconds
+- **Samples per second:** 18.711
+- **Steps per second:** 18.711
+
+## Model Architecture
+
+- **Model Type:** LlamaForCausalLM
+- **Hidden size:** 4,096
+- **Intermediate size:** 14,336
+- **Number of attention heads:** 32 (with 8 key-value heads, per the Llama 3.1 8B base architecture)
+- **Number of layers:** 32 (per the Llama 3.1 8B base architecture)
+- **Activation function:** SiLU
+- **Vocabulary size:** 128,256
+
+## Limitations and Biases
+
+More information is needed about specific limitations and biases of this model.
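
The Metrics section of the README reports an evaluation loss but lists perplexity as not provided. If that loss is the mean per-token cross-entropy in nats (a common convention for causal language model training, but an assumption here, since the README does not say how the loss was computed), perplexity is the exponential of the loss. A minimal sketch of that arithmetic, with the helper name `perplexity_from_loss` chosen only for illustration:

```python
import math

# Evaluation loss as reported in the README's Metrics section.
eval_loss = 0.6727465987205505


def perplexity_from_loss(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


# Assumption: eval_loss is a per-token mean cross-entropy measured in nats.
eval_perplexity = perplexity_from_loss(eval_loss)
print(f"perplexity approx. {eval_perplexity:.2f}")
```

Under that assumption the reported loss corresponds to a perplexity of about 1.96 on the evaluation split. If the loss was averaged differently (for example per sequence rather than per token), this conversion would not apply.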