Initialize project; model provided by the ModelHub XC community

Model: dbmdz/german-gpt2 Source: Original Platform
2026-06-09 04:32:13 +08:00
commit e1f914d572
20 changed files with 150647 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,10 @@
+*.bin.* filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tar.gz filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+model.safetensors filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,90 @@
+---
+language: de
+
+widget:
+- text: "Heute ist sehr schönes Wetter in"
+
+license: mit
+---
+
+# German GPT-2 model
+
+In this repository we release (yet another) GPT-2 model that was trained on various German texts.
+
+The model is meant to be an entry point for fine-tuning on other texts, and it is definitely not as good or "dangerous" as the English GPT-3 model. We do not plan extensive PR or staged releases for this model 😉
+
+**Note**: The model was initially released under an anonymous alias (`anonymous-german-nlp/german-gpt2`) so we now "de-anonymize" it.
+
+More details about GPT-2 can be found in the great [Hugging Face](https://huggingface.co/transformers/model_doc/gpt2.html) documentation.
+
+# Changelog
+
+16.08.2021: Public release of re-trained version of our German GPT-2 model with better results.
+
+15.11.2020: Initial release. Please use the tag `v1.0` for [this older version](https://huggingface.co/dbmdz/german-gpt2/tree/v1.0).
+
+# Training corpora
+
+We use pretty much the same corpora as were used for training the DBMDZ BERT model, which can be found in [this repository](https://github.com/dbmdz/berts).
+
+Thanks to the awesome Hugging Face team, it is possible to create a byte-level BPE vocab with their [Tokenizers](https://github.com/huggingface/tokenizers) library.
+
+With this library we created a 50K byte-level BPE vocab based on the training corpora.
+
+After creating the vocab, we trained the German GPT-2 model on a v3-8 TPU over the complete training corpus for 20 epochs. All hyperparameters
+can be found in the official JAX/FLAX documentation [here](https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/README.md)
+from Transformers.
+
+# Using the model
+
+The model itself can be used in this way:
+
+```python
+# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead class
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")
+
+model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2")
+```
+
+However, text generation is a bit more interesting, so here's an example that shows how to use the great Transformers *Pipelines* for generating text:
+
+```python
+from transformers import pipeline
+
+pipe = pipeline('text-generation', model="dbmdz/german-gpt2",
+                 tokenizer="dbmdz/german-gpt2")
+
+text = pipe("Der Sinn des Lebens ist es", max_length=100)[0]["generated_text"]
+
+print(text)
+```
+
+This could output this beautiful text:
+
+```
+Der Sinn des Lebens ist es, im Geist zu verweilen, aber nicht in der Welt zu sein, sondern ganz im Geist zu leben.
+Die Menschen beginnen, sich nicht nach der Natur und nach der Welt zu richten, sondern nach der Seele,'
+```
+
+# License
+
+All models are licensed under [MIT](LICENSE).
+
+# Hugging Face model hub
+
+All models are available on the [Hugging Face model hub](https://huggingface.co/dbmdz).
+
+# Contact (Bugs, Feedback, Contribution and more)
+
+For questions about our GPT-2 models just open an issue
+[here](https://github.com/stefan-it/german-gpt/issues/new) 🤗
+
+# Acknowledgments
+
+Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
+Thanks for providing access to the TFRC ❤️
+
+Thanks to the generous support from the [Hugging Face](https://huggingface.co/) team,
+it is possible to download the model from their S3 storage 🤗
+
--- a/config.json
+++ b/config.json
@@ -0,0 +1,38 @@
+{
+  "_name_or_path": "./",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.0,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.0,
+  "eos_token_id": 50256,
+  "gradient_checkpointing": false,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "resid_pdrop": 0.0,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.10.0.dev0",
+  "use_cache": true,
+  "vocab_size": 50265
+}
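As a rough sanity check on the `config.json` above, the parameter count can be estimated from `n_embd`, `n_layer`, `n_positions`, and `vocab_size`. The sketch below assumes the standard GPT-2 layout: tied input/output embeddings, a 4x MLP expansion (since `n_inner` is null), and biases on every linear layer. The helper name `gpt2_param_count` is made up for illustration and is not part of this repository.

```python
def gpt2_param_count(n_embd: int, n_layer: int, n_positions: int, vocab_size: int) -> int:
    """Estimate trainable parameters for a standard GPT-2 layout (assumed, see lead-in)."""
    n_inner = 4 * n_embd  # default MLP width when n_inner is null
    # Token embeddings (tied with the LM head) plus learned position embeddings
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each LayerNorm has a weight and a bias vector
    layer_norm = 2 * n_embd
    # Fused QKV projection plus the attention output projection, both with biases
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two-layer feed-forward block, both with biases
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    per_layer = 2 * layer_norm + attention + mlp
    # Final LayerNorm after the last block
    return embeddings + n_layer * per_layer + layer_norm


# Values copied from config.json in this commit
params = gpt2_param_count(n_embd=768, n_layer=12, n_positions=1024, vocab_size=50265)
float32_bytes = params * 4  # torch_dtype is float32
print(params, float32_bytes)
```

Under these assumptions the estimate is 124,445,952 parameters, or 497,783,808 bytes in float32, which is within a few kilobytes of the 497788696-byte `flax_model.msgpack` pointer in this commit. The PyTorch checkpoints are somewhat larger, presumably because they also store non-parameter buffers.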
--- a/configuration.json
+++ b/configuration.json
@@ -0,0 +1 @@
+{"framework": "pytorch", "task": "text-generation", "allow_remote": true}
--- a/flax_model.msgpack
+++ b/flax_model.msgpack
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
+size 497788696
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:161248341459a25ea9e358f31bbc767339ecc59a859424ceb0061e2402d9f7f8
+size 510384174
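The weight files in this commit are stored as Git LFS pointers (a `version` line, an `oid` with a hash algorithm and digest, and a `size` in bytes). A minimal sketch of checking a downloaded file against such a pointer follows; the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this note, and only the Python standard library is used.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and hash recorded in the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large checkpoints do not need to fit in memory
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer content copied from model.safetensors in this commit
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:161248341459a25ea9e358f31bbc767339ecc59a859424ceb0061e2402d9f7f8
size 510384174
"""
pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer(Path("model.safetensors"), pointer)` on a locally downloaded copy (the path here is an assumption) would then confirm that the download is complete and uncorrupted.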
--- a/onnx/added_tokens.json
+++ b/onnx/added_tokens.json
@@ -0,0 +1,8 @@
+{
+  "</s>": 2,
+  "<mask>": 4,
+  "<pad>": 1,
+  "<s>": 0,
+  "<unk>": 3,
+  "<|endoftext|>": 50265
+}
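The IDs in `onnx/added_tokens.json` can be compared against the values in `config.json` from this same commit (`vocab_size` 50265, `eos_token_id` 50256). The sketch below only checks the numbers visible in this diff; the helper `out_of_range` is hypothetical, and whether any mismatch matters in practice depends on the tokenizer files whose contents are not shown here.

```python
# Values copied from config.json and onnx/added_tokens.json in this commit
VOCAB_SIZE = 50265
EOS_TOKEN_ID = 50256
ADDED_TOKENS = {
    "</s>": 2,
    "<mask>": 4,
    "<pad>": 1,
    "<s>": 0,
    "<unk>": 3,
    "<|endoftext|>": 50265,
}


def out_of_range(added: dict, vocab_size: int) -> dict:
    """Return the added tokens whose ID falls outside the valid range 0..vocab_size-1."""
    return {token: idx for token, idx in added.items() if not 0 <= idx < vocab_size}


problems = out_of_range(ADDED_TOKENS, VOCAB_SIZE)
eos_matches = ADDED_TOKENS["<|endoftext|>"] == EOS_TOKEN_ID
print(problems, eos_matches)
```

With the values above, `<|endoftext|>` is reported at ID 50265, one past the last embedding row, and it does not equal the configured `eos_token_id`. This may be worth verifying against the actual tokenizer before relying on the ONNX export.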
--- a/onnx/config.json
+++ b/onnx/config.json
@@ -0,0 +1,39 @@
+{
+  "_name_or_path": "dbmdz/german-gpt2",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.0,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.0,
+  "eos_token_id": 50256,
+  "gradient_checkpointing": false,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.0,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "transformers_version": "4.34.0",
+  "use_cache": true,
+  "vocab_size": 50265
+}
--- a/onnx/decoder_model.onnx
+++ b/onnx/decoder_model.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bcc909a0656c80266871411ce1f0750d683e6bc9aa822ae47ffcb014cf704dc8
+size 653714566
--- a/onnx/decoder_model_merged.onnx
+++ b/onnx/decoder_model_merged.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:23181b6bf46b7a826bd35b4b6a9e6c051b349b621046518cca2dbe8124e36144
+size 655237634
--- a/onnx/decoder_with_past_model.onnx
+++ b/onnx/decoder_with_past_model.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7b62383abe397f9084d1b1a4e294101076a94f9ec7f153d6e9f69c18843c75da
+size 653721372
--- a/onnx/generation_config.json
+++ b/onnx/generation_config.json
@@ -0,0 +1,6 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.34.0"
+}
--- a/onnx/merges.txt
+++ b/onnx/merges.txt
--- a/onnx/special_tokens_map.json
+++ b/onnx/special_tokens_map.json
@@ -0,0 +1,5 @@
+{
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
+}
--- a/onnx/tokenizer.json
+++ b/onnx/tokenizer.json
--- a/onnx/tokenizer_config.json
+++ b/onnx/tokenizer_config.json
@@ -0,0 +1,60 @@
+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "<unk>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "4": {
+      "content": "<mask>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50265": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "additional_special_tokens": [],
+  "bos_token": "<|endoftext|>",
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "<|endoftext|>",
+  "model_max_length": 1000000000000000019884624838656,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}
--- a/onnx/vocab.json
+++ b/onnx/vocab.json
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3831e1a426b431acdb930bbba251d6595b4ef4d2a7c903db0a60f85a92ab070a
+size 510428686
--- a/tf_model.h5
+++ b/tf_model.h5
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:848c3169b5f2bcbb966462d0c390802a767c0ce52dc355b7cd5db9d507e64c76
+size 497959496
--- a/tokenizer.json
+++ b/tokenizer.json