初始化项目,由ModelHub XC社区提供模型

Model: dbmdz/german-gpt2
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-09 04:32:13 +08:00
commit e1f914d572
20 changed files with 150647 additions and 0 deletions

10
.gitattributes vendored Normal file
View File

@@ -0,0 +1,10 @@
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
model.safetensors filter=lfs diff=lfs merge=lfs -text

90
README.md Normal file
View File

@@ -0,0 +1,90 @@
---
language: de
widget:
- text: "Heute ist sehr schönes Wetter in"
license: mit
---
# German GPT-2 model
In this repository we release (yet another) GPT-2 model, that was trained on various texts for German.
The model is meant to be an entry point for fine-tuning on other texts, and it is definitely not as good or "dangerous" as the English GPT-3 model. We do not plan extensive PR or staged releases for this model 😉
**Note**: The model was initially released under an anonymous alias (`anonymous-german-nlp/german-gpt2`) so we now "de-anonymize" it.
More details about GPT-2 can be found in the great [Hugging Face](https://huggingface.co/transformers/model_doc/gpt2.html) documentation.
# Changelog
16.08.2021: Public release of re-trained version of our German GPT-2 model with better results.
15.11.2020: Initial release. Please use the tag `v1.0` for [this older version](https://huggingface.co/dbmdz/german-gpt2/tree/v1.0).
# Training corpora
We use pretty much the same corpora as used for training the DBMDZ BERT model, that can be found in [this repository](https://github.com/dbmdz/berts).
Thanks to the awesome Hugging Face team, it is possible to create byte-level BPE with their awesome [Tokenizers](https://github.com/huggingface/tokenizers) library.
With the previously mentioned awesome Tokenizers library we created a 50K byte-level BPE vocab based on the training corpora.
After creating the vocab, we could train the GPT-2 for German on a v3-8 TPU over the complete training corpus for 20 epochs. All hyperparameters
can be found in the official JAX/FLAX documentation [here](https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/README.md)
from Transformers.
# Using the model
The model itself can be used in this way:
```python
from transformers import AutoTokenizer, AutoModelWithLMHead
tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")
model = AutoModelWithLMHead.from_pretrained("dbmdz/german-gpt2")
```
However, text generation is a bit more interesting, so here's an example that shows how to use the great Transformers *Pipelines* for generating text:
```python
from transformers import pipeline
pipe = pipeline('text-generation', model="dbmdz/german-gpt2",
tokenizer="dbmdz/german-gpt2")
text = pipe("Der Sinn des Lebens ist es", max_length=100)[0]["generated_text"]
print(text)
```
This could output this beautiful text:
```
Der Sinn des Lebens ist es, im Geist zu verweilen, aber nicht in der Welt zu sein, sondern ganz im Geist zu leben.
Die Menschen beginnen, sich nicht nach der Natur und nach der Welt zu richten, sondern nach der Seele,'
```
# License
All models are licensed under [MIT](LICENSE).
# Huggingface model hub
All models are available on the [Huggingface model hub](https://huggingface.co/dbmdz).
# Contact (Bugs, Feedback, Contribution and more)
For questions about our BERT models just open an issue
[here](https://github.com/stefan-it/german-gpt/issues/new) 🤗
# Acknowledgments
Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
Thanks for providing access to the TFRC ❤️
Thanks to the generous support from the [Hugging Face](https://huggingface.co/) team,
it is possible to download both cased and uncased models from their S3 storage 🤗

38
config.json Normal file
View File

@@ -0,0 +1,38 @@
{
"_name_or_path": "./",
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.0,
"bos_token_id": 50256,
"embd_pdrop": 0.0,
"eos_token_id": 50256,
"gradient_checkpointing": false,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 768,
"n_head": 12,
"n_inner": null,
"n_layer": 12,
"n_positions": 1024,
"resid_pdrop": 0.0,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"torch_dtype": "float32",
"transformers_version": "4.10.0.dev0",
"use_cache": true,
"vocab_size": 50265
}

1
configuration.json Normal file
View File

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

3
flax_model.msgpack Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
size 497788696

3
model.safetensors Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:161248341459a25ea9e358f31bbc767339ecc59a859424ceb0061e2402d9f7f8
size 510384174

8
onnx/added_tokens.json Normal file
View File

@@ -0,0 +1,8 @@
{
"</s>": 2,
"<mask>": 4,
"<pad>": 1,
"<s>": 0,
"<unk>": 3,
"<|endoftext|>": 50265
}

39
onnx/config.json Normal file
View File

@@ -0,0 +1,39 @@
{
"_name_or_path": "dbmdz/german-gpt2",
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.0,
"bos_token_id": 50256,
"embd_pdrop": 0.0,
"eos_token_id": 50256,
"gradient_checkpointing": false,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 768,
"n_head": 12,
"n_inner": null,
"n_layer": 12,
"n_positions": 1024,
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.0,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"transformers_version": "4.34.0",
"use_cache": true,
"vocab_size": 50265
}

3
onnx/decoder_model.onnx Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bcc909a0656c80266871411ce1f0750d683e6bc9aa822ae47ffcb014cf704dc8
size 653714566

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:23181b6bf46b7a826bd35b4b6a9e6c051b349b621046518cca2dbe8124e36144
size 655237634

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7b62383abe397f9084d1b1a4e294101076a94f9ec7f153d6e9f69c18843c75da
size 653721372

View File

@@ -0,0 +1,6 @@
{
"_from_model_config": true,
"bos_token_id": 50256,
"eos_token_id": 50256,
"transformers_version": "4.34.0"
}

50005
onnx/merges.txt Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,5 @@
{
"bos_token": "<|endoftext|>",
"eos_token": "<|endoftext|>",
"unk_token": "<|endoftext|>"
}

100362
onnx/tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,60 @@
{
"add_prefix_space": false,
"added_tokens_decoder": {
"0": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"3": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"4": {
"content": "<mask>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"50265": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [],
"bos_token": "<|endoftext|>",
"clean_up_tokenization_spaces": true,
"eos_token": "<|endoftext|>",
"model_max_length": 1000000000000000019884624838656,
"tokenizer_class": "GPT2Tokenizer",
"unk_token": "<|endoftext|>"
}

1
onnx/vocab.json Normal file

File diff suppressed because one or more lines are too long

3
pytorch_model.bin Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3831e1a426b431acdb930bbba251d6595b4ef4d2a7c903db0a60f85a92ab070a
size 510428686

3
tf_model.h5 Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:848c3169b5f2bcbb966462d0c390802a767c0ce52dc355b7cd5db9d507e64c76
size 497959496

1
tokenizer.json Normal file

File diff suppressed because one or more lines are too long