Initialize project; model provided by the ModelHub XC community
Model: ReBatch/Llama-3-8B-dutch Source: Original Platform
70
README.md
Normal file
@@ -0,0 +1,70 @@
---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B
tags:
- ORPO
- llama 3 8B
- conversational
datasets:
- BramVanroy/ultra_feedback_dutch
model-index:
- name: ReBatch/Llama-3-8B-dutch
  results: []
language:
- nl
pipeline_tag: text-generation
---

<p align="center" style="margin:0;padding:0">
  <img src="llama3-8b-dutch-banner.jpeg" alt="Llama 3 Dutch banner" width="400" height="400"/>
</p>

<div style="margin:auto; text-align:center">
  <h1 style="margin-bottom: 0">Llama 3 8B - Dutch</h1>
  <em>A conversational model for Dutch, based on Llama 3 8B</em>
  <p><em><a href="https://huggingface.co/spaces/ReBatch/Llama-3-Dutch">Try chatting with the model!</a></em></p>
</div>

This model is a version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned with [QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes) and [ORPO](https://huggingface.co/docs/trl/main/en/orpo_trainer) on the synthetic feedback dataset [BramVanroy/ultra_feedback_dutch](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch).

## Model description

This is a Dutch chat model built on Llama 3 8B and aligned with [ORPO](https://huggingface.co/docs/trl/main/en/orpo_trainer) on the feedback dataset [BramVanroy/ultra_feedback_dutch](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch).

## Intended uses & limitations

Although the model was aligned on gpt-4-turbo output, which is subject to strong content filters, it can still generate wrong, misleading, and potentially even offensive content. Use at your own risk.

## Training procedure

The model was trained in bfloat16 with QLoRA and Flash Attention 2 on a single H100 80GB SXM5 GPU for around 24 hours on RunPod.

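As a rough illustration of the setup described above, the following sketch shows how a QLoRA configuration with the adapter values listed under "Training hyperparameters" (`r`, `lora_alpha`, `lora_dropout`) might be expressed with the `peft` and `transformers` libraries. It is not the authors' training script: the `task_type` value, the helper name `build_configs`, and the use of `BitsAndBytesConfig` with default 4-bit settings are assumptions made for this example.

```python
# Hypothetical reconstruction of a QLoRA setup; not the authors' actual script.

# Adapter values taken from the "Training hyperparameters" list.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "task_type": "CAUSAL_LM",  # assumption: not stated in the model card
}

# QLoRA keeps the frozen base weights in 4-bit precision.
QUANT_KWARGS = {
    "load_in_4bit": True,
}


def build_configs():
    """Build the quantization and adapter configs.

    Imports are done lazily so the constants above can be inspected
    without torch, transformers, or peft installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # the card states bfloat16 training
        **QUANT_KWARGS,
    )
    lora_config = LoraConfig(**LORA_KWARGS)
    return quant_config, lora_config
```

The quantization config would typically be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=quant_config)` and the adapter config to the trainer; which target modules and quantization type were actually used is not stated in this card.
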
## Evaluation Results

The model was evaluated using [ScandEval](https://scandeval.com/dutch-nlg/).

Results were mixed across benchmarks: the model improved slightly on some and scored lower on others. It was trained on only 200,000 samples for a single epoch, so we are curious to see whether its performance could be improved by training with more data or additional epochs.

| Model | conll_nl | dutch_social | scala_nl | squad_nl | wiki_lingua_nl | mmlu_nl | hellaswag_nl |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|
| meta-llama/Meta-Llama-3-8B-Instruct | 68.72 | 14.67 | 32.91 | 45.36 | 67.62 | 36.18 | 33.91 |
| ReBatch/Llama-3-8B-dutch | 58.85 | 11.14 | 15.58 | 59.96 | 64.51 | 36.27 | 28.34 |
| meta-llama/Meta-Llama-3-8B | 62.26 | 10.45 | 30.3 | 62.99 | 65.17 | 36.38 | 28.33 |

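To make the comparison in the table easier to read, the short script below computes the per-benchmark score difference between this model and the base model `meta-llama/Meta-Llama-3-8B`. The numbers are copied from the table; choosing the base model (rather than the Instruct variant) as the reference is an assumption of this sketch, and the variable names are illustrative.

```python
# Scores copied from the evaluation table above.
benchmarks = [
    "conll_nl", "dutch_social", "scala_nl", "squad_nl",
    "wiki_lingua_nl", "mmlu_nl", "hellaswag_nl",
]
finetuned = [58.85, 11.14, 15.58, 59.96, 64.51, 36.27, 28.34]  # ReBatch/Llama-3-8B-dutch
base = [62.26, 10.45, 30.3, 62.99, 65.17, 36.38, 28.33]        # meta-llama/Meta-Llama-3-8B

# Difference (fine-tuned minus base), rounded to the table's precision.
deltas = {
    name: round(ft - b, 2)
    for name, ft, b in zip(benchmarks, finetuned, base)
}

# Benchmarks on which the fine-tuned model scores higher than the base model.
improved = [name for name, delta in deltas.items() if delta > 0]

for name, delta in deltas.items():
    print(f"{name:>15}: {delta:+.2f}")
print("improved over base:", improved)
```

Under these numbers the fine-tuned model is ahead of the base model only on `dutch_social` and `hellaswag_nl`, and by small margins; the largest drop is on `scala_nl`.
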
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 2
- eval_batch_size: 2
- num_devices: 1
- gradient_accumulation_steps: 4
- optimizer: paged_adamw_8bit
- lr_scheduler_type: linear
- warmup_steps: 10
- num_epochs: 1.0
- r: 16
- lora_alpha: 32
- lora_dropout: 0.05

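A few derived quantities follow from the values listed above. The sketch below works them out, assuming the run covered exactly the 200,000 samples mentioned in the evaluation section and that no samples were dropped or packed; the variable names are illustrative and the step count is an estimate, not a logged figure.

```python
# Values from the hyperparameter list above.
train_batch_size = 2
gradient_accumulation_steps = 4
num_devices = 1
warmup_steps = 10
r = 16
lora_alpha = 32

# Assumption: the 200,000 samples mentioned under "Evaluation Results" is exact.
num_samples = 200_000

# Samples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Approximate number of optimizer updates in the single epoch.
optimizer_steps = num_samples // effective_batch_size

# Share of training spent in linear warmup.
warmup_fraction = warmup_steps / optimizer_steps

# Standard LoRA scaling factor applied to the low-rank update.
lora_scale = lora_alpha / r

print("effective batch size:", effective_batch_size)
print("optimizer steps (approx.):", optimizer_steps)
print("warmup fraction:", warmup_fraction)
print("LoRA scale (alpha / r):", lora_scale)
```

With these inputs the effective batch size is 8 and the epoch corresponds to roughly 25,000 optimizer steps, so the 10 warmup steps are a very small part of the schedule.
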