Initialize project; model provided by the ModelHub XC community

Model: dbmdz/german-gpt2 Source: Original Platform
2026-06-09 04:32:13 +08:00
commit e1f914d572
20 changed files with 150647 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,10 @@
 *.bin.* filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tar.gz filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 model.safetensors filter=lfs diff=lfs merge=lfs -text
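To see which files in this commit the patterns above route through Git LFS, the rules can be approximated in Python. This is only a sketch: it assumes that comparing each slash-free pattern to the file's basename is close enough to Git's own attribute matching for these simple patterns, and `is_lfs_tracked` is a hypothetical helper name, not part of Git or Git LFS.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite",
    "*.tar.gz", "*.ot", "*.onnx", "*.msgpack", "model.safetensors",
]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: slash-free patterns are compared to the basename."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


# A few paths taken from this commit.
for path in ["pytorch_model.bin", "onnx/decoder_model.onnx", "config.json", "README.md"]:
    print(path, is_lfs_tracked(path))
```

Under these assumptions the weight files (`*.bin`, `*.h5`, `*.msgpack`, `*.onnx`, `model.safetensors`) are stored as LFS pointers, while the JSON and Markdown files are committed as plain text.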
--- a/README.md
+++ b/README.md
@@ -0,0 +1,90 @@
 ---
 language: de
 widget:
 - text: "Heute ist sehr schönes Wetter in"
 license: mit
 ---
 # German GPT-2 model
 In this repository we release (yet another) GPT-2 model that was trained on various German texts.
 The model is meant to be an entry point for fine-tuning on other texts, and it is definitely not as good or "dangerous" as the English GPT-3 model. We do not plan extensive PR or staged releases for this model 😉
 **Note**: The model was initially released under an anonymous alias (`anonymous-german-nlp/german-gpt2`) so we now "de-anonymize" it.
 More details about GPT-2 can be found in the great [Hugging Face](https://huggingface.co/transformers/model_doc/gpt2.html) documentation.
 # Changelog
 16.08.2021: Public release of re-trained version of our German GPT-2 model with better results.
 15.11.2020: Initial release. Please use the tag `v1.0` for [this older version](https://huggingface.co/dbmdz/german-gpt2/tree/v1.0).
 # Training corpora
 We use largely the same corpora that were used to train the DBMDZ BERT models, which can be found in [this repository](https://github.com/dbmdz/berts).
 Thanks to the Hugging Face team and their [Tokenizers](https://github.com/huggingface/tokenizers) library, we were able to create a 50K byte-level BPE vocab based on the training corpora.
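As a rough illustration of what "byte-level BPE" means, the toy sketch below learns merges directly over UTF-8 bytes. It is not the Tokenizers implementation and not the procedure used for this model: it skips pre-tokenization into words, special tokens, and the 50K vocabulary target, and the function name `learn_bpe_merges` is hypothetical.

```python
from collections import Counter


def learn_bpe_merges(text: str, num_merges: int):
    """Greedily merge the most frequent adjacent pair of byte sequences."""
    # Start from single UTF-8 bytes, so any input string is representable.
    seq = [bytes([b]) for b in text.encode("utf-8")]
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing occurs often enough to be worth merging
        merges.append((left, right))
        # Rewrite the sequence, replacing each occurrence of the pair.
        merged, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(seq[i])
                i += 1
        seq = merged
    return merges, seq


merges, tokens = learn_bpe_merges("Heute ist sehr schönes Wetter", num_merges=5)
print(merges)
print(tokens)
```

Because the base alphabet is the 256 possible byte values, characters such as `ö` (two bytes in UTF-8) never fall outside the vocabulary; frequent byte pairs are simply merged into longer tokens.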
 After creating the vocab, we trained the German GPT-2 model on a v3-8 TPU over the complete training corpus for 20 epochs. All hyperparameters
 can be found in the official JAX/Flax language-modeling documentation [here](https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/README.md)
 from Transformers.
 # Using the model
 The model itself can be used in this way:
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 # AutoModelWithLMHead is deprecated; AutoModelForCausalLM is the class for GPT-2 style models
 tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")
 model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2")
 ```
 However, text generation is a bit more interesting, so here's an example that shows how to use the great Transformers *Pipelines* for generating text:
 ```python
 from transformers import pipeline
 pipe = pipeline('text-generation', model="dbmdz/german-gpt2",
                 tokenizer="dbmdz/german-gpt2")
 text = pipe("Der Sinn des Lebens ist es", max_length=100)[0]["generated_text"]
 print(text)
 ```
 This could output this beautiful text:
 ```
 Der Sinn des Lebens ist es, im Geist zu verweilen, aber nicht in der Welt zu sein, sondern ganz im Geist zu leben.
 Die Menschen beginnen, sich nicht nach der Natur und nach der Welt zu richten, sondern nach der Seele,'
 ```
 # License
 All models are licensed under [MIT](LICENSE).
 # Hugging Face model hub
 All models are available on the [Hugging Face model hub](https://huggingface.co/dbmdz).
 # Contact (Bugs, Feedback, Contribution and more)
 For questions about our GPT-2 models just open an issue
 [here](https://github.com/stefan-it/german-gpt/issues/new) 🤗
 # Acknowledgments
 Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
 Thanks for providing access to the TFRC ❤️
 Thanks to the generous support from the [Hugging Face](https://huggingface.co/) team,
 it is possible to download the model from their S3 storage 🤗
--- a/config.json
+++ b/config.json
@@ -0,0 +1,38 @@
 {
  "_name_or_path": "./",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "resid_pdrop": 0.0,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "torch_dtype": "float32",
  "transformers_version": "4.10.0.dev0",
  "use_cache": true,
  "vocab_size": 50265
 }
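The values in this config are enough to estimate the model size. The sketch below counts trainable parameters, assuming the standard GPT-2 layout in Transformers (fused QKV projection, a 4x MLP when `n_inner` is null, output embedding tied to the token embedding). Non-trainable buffers and file-format overhead are not counted, and `count_gpt2_parameters` is a hypothetical helper name.

```python
# Values copied from config.json above.
config = {
    "vocab_size": 50265,
    "n_positions": 1024,
    "n_embd": 768,
    "n_layer": 12,
    "n_inner": None,
}


def count_gpt2_parameters(cfg: dict) -> int:
    d = cfg["n_embd"]
    inner = cfg["n_inner"] if cfg["n_inner"] is not None else 4 * d
    # Token and position embeddings (the LM head is assumed to reuse the token table).
    embeddings = cfg["vocab_size"] * d + cfg["n_positions"] * d
    # Fused query/key/value projection plus the attention output projection.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Feed-forward up-projection and down-projection.
    mlp = (d * inner + inner) + (inner * d + d)
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d)
    final_norm = 2 * d
    return embeddings + cfg["n_layer"] * (attention + mlp + layer_norms) + final_norm


total = count_gpt2_parameters(config)
print(f"{total:,} parameters, about {total * 4 / 1e6:.0f} MB in float32")
```

Under these assumptions the count comes to 124,445,952 parameters, or 497,783,808 bytes in float32, which is within a few kilobytes of the 497788696-byte size recorded in the `flax_model.msgpack` pointer in this commit.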
--- a/configuration.json
+++ b/configuration.json
@@ -0,0 +1 @@
 {"framework": "pytorch", "task": "text-generation", "allow_remote": true}
--- a/flax_model.msgpack
+++ b/flax_model.msgpack
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
 size 497788696
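The three lines above are a Git LFS pointer rather than the weights themselves; each line is a key followed by a value. A minimal sketch of reading such a pointer follows, assuming exactly the `version`, `oid`, and `size` keys shown in this diff (`parse_lfs_pointer` is a hypothetical helper, not part of Git LFS).

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = dict(
        line.strip().split(" ", 1) for line in text.splitlines() if line.strip()
    )
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the flax_model.msgpack hunk above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
size 497788696
"""
info = parse_lfs_pointer(pointer)
print(info["algorithm"], info["size"])
```

The same helper applies to the other pointer hunks in this commit (`model.safetensors`, `pytorch_model.bin`, `tf_model.h5`, and the files under `onnx/`).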
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:161248341459a25ea9e358f31bbc767339ecc59a859424ceb0061e2402d9f7f8
 size 510384174
--- a/onnx/added_tokens.json
+++ b/onnx/added_tokens.json
@@ -0,0 +1,8 @@
 {
  "</s>": 2,
  "<mask>": 4,
  "<pad>": 1,
  "<s>": 0,
  "<unk>": 3,
  "<|endoftext|>": 50265
 }
--- a/onnx/config.json
+++ b/onnx/config.json
@@ -0,0 +1,39 @@
 {
  "_name_or_path": "dbmdz/german-gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.0,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 50265
 }
--- a/onnx/decoder_model.onnx
+++ b/onnx/decoder_model.onnx
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:bcc909a0656c80266871411ce1f0750d683e6bc9aa822ae47ffcb014cf704dc8
 size 653714566
--- a/onnx/decoder_model_merged.onnx
+++ b/onnx/decoder_model_merged.onnx
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:23181b6bf46b7a826bd35b4b6a9e6c051b349b621046518cca2dbe8124e36144
 size 655237634
--- a/onnx/decoder_with_past_model.onnx
+++ b/onnx/decoder_with_past_model.onnx
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:7b62383abe397f9084d1b1a4e294101076a94f9ec7f153d6e9f69c18843c75da
 size 653721372
--- a/onnx/generation_config.json
+++ b/onnx/generation_config.json
@@ -0,0 +1,6 @@
 {
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.34.0"
 }
--- a/onnx/merges.txt
+++ b/onnx/merges.txt
--- a/onnx/special_tokens_map.json
+++ b/onnx/special_tokens_map.json
@@ -0,0 +1,5 @@
 {
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
 }
--- a/onnx/tokenizer.json
+++ b/onnx/tokenizer.json
--- a/onnx/tokenizer_config.json
+++ b/onnx/tokenizer_config.json
@@ -0,0 +1,60 @@
 {
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "<mask>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "50265": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
 }
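The `onnx/` tokenizer files map `<|endoftext|>` to id 50265, while `onnx/config.json` declares `vocab_size` 50265 and `eos_token_id` 50256. The sketch below only surfaces that difference using values copied from this diff; whether it matters in practice depends on how the ONNX export is consumed, which this commit does not show. `out_of_range_tokens` is a hypothetical helper name.

```python
# Values copied from the onnx/ hunks above.
vocab_size = 50265    # onnx/config.json
eos_token_id = 50256  # onnx/config.json and onnx/generation_config.json
added_tokens = {      # onnx/added_tokens.json
    "</s>": 2,
    "<mask>": 4,
    "<pad>": 1,
    "<s>": 0,
    "<unk>": 3,
    "<|endoftext|>": 50265,
}


def out_of_range_tokens(tokens: dict, size: int) -> dict:
    """Return tokens whose id is not a valid row index for a table with `size` rows."""
    return {tok: idx for tok, idx in tokens.items() if not 0 <= idx < size}


print("ids outside the embedding table:", out_of_range_tokens(added_tokens, vocab_size))
print("tokenizer EOS id equals config EOS id:", added_tokens["<|endoftext|>"] == eos_token_id)
```

A check like this could be run before relying on the tokenizer's `eos_token` id with the ONNX models.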
--- a/onnx/vocab.json
+++ b/onnx/vocab.json
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:3831e1a426b431acdb930bbba251d6595b4ef4d2a7c903db0a60f85a92ab070a
 size 510428686
--- a/tf_model.h5
+++ b/tf_model.h5
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:848c3169b5f2bcbb966462d0c390802a767c0ce52dc355b7cd5db9d507e64c76
 size 497959496
--- a/tokenizer.json
+++ b/tokenizer.json