Initialize project; model provided by the ModelHub XC community

Model: Nelis5174473/GovLLM-7B-ultra Source: Original Platform
2026-05-27 06:22:18 +08:00
commit d79a842c98
15 changed files with 91664 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,82 @@
+---
+library_name: transformers
+tags:
+- government
+- conversational
+- question-answering
+- dutch
+- geitje
+license: apache-2.0
+datasets:
+- Nelis5174473/Dutch-QA-Pairs-Rijksoverheid
+language:
+- nl
+pipeline_tag: text-generation
+---
+
+<p align="center" style="margin:0;padding:0">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/65e04544f59f66e0e072dc5c/b-OsZLNJtPHMwzbgwmGlV.png" alt="GovLLM Ultra banner" width="800" style="margin-left:auto; margin-right:auto; display:block"/>
+</p>
+
+<div style="margin:auto; text-align:center">
+<h1 style="margin-bottom: 0">GovLLM-7B-ultra</h1>
+<em>A question answering model about the Dutch Government.</em>
+</div>
+
+## Model description
+
+This model is a fine-tuned version of the Dutch conversational model [BramVanroy/GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra), trained on a [Dutch question-answer pair dataset](https://huggingface.co/datasets/Nelis5174473/Dutch-QA-Pairs-Rijksoverheid) about the Dutch Government. It is a Dutch question-answering model ultimately based on Mistral and fine-tuned with SFT and LoRA. Training for 3 epochs took almost 2 hours on an Nvidia A100 (40 GB VRAM).
+
+## Usage with Inference Endpoints (Dedicated)
+
+```python
+import requests
+
+API_URL = "https://your-own-endpoint.us-east-1.aws.endpoints.huggingface.cloud"
+headers = {"Authorization": "Bearer hf_your_own_token"}
+
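+def query(payload):
+    """POST the payload to the endpoint and return the decoded JSON response."""
+    response = requests.post(API_URL, headers=headers, json=payload)
+    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
+    return response.json()
+
+# Example question: "Does the government give subsidies to companies?"
+output = query({
+    "inputs": "Geeft de overheid subsidie aan bedrijven?"
+})
+
+# Print the generated answer
+print(output[0]["generated_text"])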
+```
+
+## Training hyperparameters
+
+The following hyperparameters were used during training:
+- block_size: 1024
+- model_max_length: 2048
+- padding: right
+- mixed_precision: fp16
+- learning rate (lr): 0.00003
+- epochs: 3
+- batch_size: 2
+- optimizer: adamw_torch
+- scheduler: linear
+- quantization: int8
+- peft: true
+- lora_r: 16
+- lora_alpha: 16
+- lora_dropout: 0.05
+
+### Training results
+
+|  Epoch |  Loss    |  Grad norm | Learning rate |  Step    |
+|:------:|---------:|:----------:|:-------------:|:--------:|
+|  0.14  |  1.3183  |  0.6038    |  1.3888e-05   |  25/540  |
+|  0.42  |  1.0220  |  0.4180    |  2.8765e-05   |  75/540  |
+|  0.69  |  0.9251  |  0.4119    |  2.5679e-05   |  125/540 |
+|  0.97  |  0.9260  |  0.4682    |  2.2592e-05   |  175/540 |
+|  1.25  |  0.8586  |  0.5338    |  1.9506e-05   |  225/540 |
+|  1.53  |  0.8767  |  0.6359    |  1.6420e-05   |  275/540 |
+|  1.80  |  0.8721  |  0.6137    |  1.3333e-05   |  325/540 |
+|  2.08  |  0.8469  |  0.7310    |  1.0247e-05   |  375/540 |
+|  2.36  |  0.8324  |  0.7945    |  7.1605e-06   |  425/540 |
+|  2.64  |  0.8170  |  0.8522    |  4.0741e-06   |  475/540 |
+|  2.91  |  0.8185  |  0.8562    |  9.8765e-07   |  525/540 |
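
For readers who want to sanity-check the learning-rate column of the training-results table in the README, the following is a minimal sketch of a linear warmup-then-decay schedule. It is not the training code itself. The peak rate (0.00003) and the 540 total steps come from the README; the 54 warmup steps (10% of 540) and the one-step offset between the logged step and the scheduler step are assumptions inferred from the logged values, and `linear_schedule_lr` is a hypothetical helper name.

```python
def linear_schedule_lr(step, peak_lr=3e-5, warmup_steps=54, total_steps=540):
    """Linear warmup to peak_lr, then linear decay to zero at total_steps.

    warmup_steps=54 is an assumption; the README does not state a warmup length.
    """
    if step < warmup_steps:
        # Warmup phase: rate grows linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: rate falls linearly from peak_lr to 0.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


# Assumption: a value logged at step N reflects scheduler step N - 1.
for logged_step in (75, 175, 375):
    lr = linear_schedule_lr(logged_step - 1)
    print(f"step {logged_step}: lr = {lr:.4e}")
```

Under these assumptions the printed values agree with the table rows for steps 75, 175, and 375 to within the last printed digit, which suggests (but does not prove) that the run used a schedule of this shape.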