Initialize project; model provided by the ModelHub XC community

Model: ReBatch/Llama-3-8B-dutch Source: Original Platform
2026-05-15 09:52:51 +08:00
commit a12b1169ed
13 changed files with 413076 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,70 @@
+---
+license: llama3
+base_model: meta-llama/Meta-Llama-3-8B
+tags:
+- ORPO
+- llama 3 8B
+- conversational
+datasets:
+- BramVanroy/ultra_feedback_dutch
+model-index:
+- name: ReBatch/Llama-3-8B-dutch
+  results: []
+language:
+- nl
+pipeline_tag: text-generation
+---
+
+<p align="center" style="margin:0;padding:0">
+  <img src="llama3-8b-dutch-banner.jpeg" alt="Llama 3 dutch banner" width="400" height="400"/>
+</p>
+
+<div style="margin:auto; text-align:center">
+<h1 style="margin-bottom: 0">Llama 3 8B - Dutch</h1>
+<em>A conversational model for Dutch, based on Llama 3 8B</em>
+<p><em><a href="https://huggingface.co/spaces/ReBatch/Llama-3-Dutch">Try chatting with the model!</a></em></p>
+</div>
+
+This model is a [QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes) and [ORPO](https://huggingface.co/docs/trl/main/en/orpo_trainer) fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B), trained on the synthetic feedback dataset [BramVanroy/ultra_feedback_dutch](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch).
+
+
+## Model description
+This is a Dutch chat model built on Llama 3 8B and fine-tuned with [ORPO](https://huggingface.co/docs/trl/main/en/orpo_trainer) on the feedback dataset [BramVanroy/ultra_feedback_dutch](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch).
+
+
+
+## Intended uses & limitations
+Although the model has been aligned with gpt-4-turbo output, which has strong content filters, it can still generate wrong, misleading, and potentially even offensive content. Use at your own risk.
+
+
+## Training procedure
+
+The model was trained in bfloat16 using QLoRA and Flash Attention 2 on a single H100 80GB SXM5 GPU for around 24 hours on RunPod.
+
+## Evaluation Results
+
+The model was evaluated using [scandeval](https://scandeval.com/dutch-nlg/).
+
+Results were mixed: the model improved slightly on some benchmarks and scored lower on others. Note that it was trained on only 200,000 samples for a single epoch. We are curious to see whether training with more data or additional epochs would improve its performance.
+
+| Model | conll_nl | dutch_social | scala_nl | squad_nl | wiki_lingua_nl | mmlu_nl | hellaswag_nl |
+|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| meta-llama/Meta-Llama-3-8B-Instruct | 68.72 | 14.67 | 32.91 | 45.36 | 67.62 | 36.18 | 33.91 |
+| ReBatch/Llama-3-8B-dutch | 58.85 | 11.14 | 15.58 | 59.96 | 64.51 | 36.27 | 28.34 |
+| meta-llama/Meta-Llama-3-8B | 62.26 | 10.45 | 30.3 | 62.99 | 65.17 | 36.38 | 28.33 |
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 8e-06
+- train_batch_size: 2
+- eval_batch_size: 2
+- num_devices: 1
+- gradient_accumulation_steps: 4
+- optimizer: paged_adamw_8bit
+- lr_scheduler_type: linear
+- warmup_steps: 10
+- num_epochs: 1.0
+- r: 16
+- lora_alpha: 32
+- lora_dropout: 0.05
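
To put the hyperparameters above in context, the sketch below derives the quantities they imply: the effective batch size, the approximate number of optimizer steps, and the LoRA scaling factor. It assumes the usual Hugging Face Trainer convention (per-device batch size times accumulation steps times devices) and the standard LoRA scaling of `lora_alpha / r`. The figure of 200,000 samples is taken from the evaluation notes and is probably rounded, so the step count is an estimate. The helper names are hypothetical and not part of the original training script.

```python
import math

# Values copied from the hyperparameter list in the README.
hparams = {
    "learning_rate": 8e-06,
    "train_batch_size": 2,
    "num_devices": 1,
    "gradient_accumulation_steps": 4,
    "num_epochs": 1.0,
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
}

# Assumption: sample count as stated in the README ("200,000 samples").
NUM_SAMPLES = 200_000


def effective_batch_size(hp: dict) -> int:
    """Samples consumed per optimizer step (assumed Trainer convention)."""
    return (
        hp["train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * hp["num_devices"]
    )


def optimizer_steps(hp: dict, num_samples: int) -> int:
    """Approximate number of optimizer updates over the whole run."""
    total_samples = num_samples * hp["num_epochs"]
    return math.ceil(total_samples / effective_batch_size(hp))


def lora_scaling(hp: dict) -> float:
    """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
    return hp["lora_alpha"] / hp["r"]


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(hparams))
    print("optimizer steps:", optimizer_steps(hparams, NUM_SAMPLES))
    print("LoRA scaling:", lora_scaling(hparams))
```

Under these assumptions, each optimizer update sees 8 samples, so a single epoch over roughly 200,000 samples corresponds to about 25,000 updates, of which the 10 warmup steps are a very small fraction.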