Initialize project; model provided by the ModelHub XC community
Model: dbmdz/german-gpt2 Source: Original Platform
10
.gitattributes
vendored
Normal file
@@ -0,0 +1,10 @@
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
model.safetensors filter=lfs diff=lfs merge=lfs -text
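The patterns in this .gitattributes decide which files are stored as Git LFS pointers. The following is a minimal sketch, assuming that matching each pattern against the file's basename with `fnmatchcase` is a close enough approximation of Git's own rules for these simple patterns. The helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite",
    "*.tar.gz", "*.ot", "*.onnx", "*.msgpack", "model.safetensors",
]

def is_lfs_tracked(path: str) -> bool:
    """Approximation: compare the basename against every LFS pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)

# File names taken from this commit.
for path in ["pytorch_model.bin", "onnx/decoder_model.onnx", "tokenizer.json"]:
    print(path, is_lfs_tracked(path))
```

Under this approximation the weight files go through LFS, while small text files such as tokenizer.json stay in regular Git storage.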
90
README.md
Normal file
@@ -0,0 +1,90 @@
---
language: de

widget:
- text: "Heute ist sehr schönes Wetter in"

license: mit
---

# German GPT-2 model

In this repository we release (yet another) GPT-2 model, trained on a variety of German texts.

The model is meant to be an entry point for fine-tuning on other texts, and it is definitely not as good or "dangerous" as the English GPT-3 model. We do not plan extensive PR or staged releases for this model 😉

**Note**: The model was initially released under an anonymous alias (`anonymous-german-nlp/german-gpt2`), so we now "de-anonymize" it.

More details about GPT-2 can be found in the great [Hugging Face](https://huggingface.co/transformers/model_doc/gpt2.html) documentation.

# Changelog

16.08.2021: Public release of a re-trained version of our German GPT-2 model with better results.

15.11.2020: Initial release. Please use the tag `v1.0` for [this older version](https://huggingface.co/dbmdz/german-gpt2/tree/v1.0).

# Training corpora

We use largely the same corpora that were used to train the DBMDZ BERT model, which can be found in [this repository](https://github.com/dbmdz/berts).

Thanks to the awesome Hugging Face team, it is possible to create a byte-level BPE vocabulary with their awesome [Tokenizers](https://github.com/huggingface/tokenizers) library.

Using that library, we created a 50K byte-level BPE vocab from the training corpora.

After creating the vocab, we trained the German GPT-2 model on a v3-8 TPU over the complete training corpus for 20 epochs. All hyperparameters can be found in the official JAX/FLAX documentation [here](https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/README.md) from Transformers.

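To make "byte-level BPE" more concrete, here is a minimal sketch of a single merge step using only the standard library. It is not the Tokenizers library and not the procedure used to build the released vocab: real byte-level BPE also pre-tokenizes the text and repeats the merge until the target vocab size is reached. The helper names `most_frequent_pair` and `merge_pair` are hypothetical.

```python
from collections import Counter

def most_frequent_pair(seq):
    """Return the most common adjacent pair of symbols in seq."""
    pairs = Counter(zip(seq, seq[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(seq, pair, new_symbol):
    """Replace every non-overlapping occurrence of pair with new_symbol."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

# Start from raw UTF-8 bytes: the base alphabet is the 256 byte values,
# so any input text can be represented without an unknown token.
symbols = list("schönes schönes Wetter".encode("utf-8"))

pair = most_frequent_pair(symbols)
merged = merge_pair(symbols, pair, 256)  # 256 = first id after the byte alphabet

print(len(symbols), len(merged))
```

Each merge adds one entry to the vocabulary and shortens the encoded sequence; a 50K vocab corresponds to repeating this step many thousands of times over the whole corpus.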
# Using the model

The model itself can be loaded like this:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the causal language model from the Hub
tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")

model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2")
```

However, text generation is a bit more interesting, so here's an example that shows how to use the great Transformers *Pipelines* for generating text:

```python
from transformers import pipeline

# Build a text-generation pipeline with the German GPT-2 model and tokenizer
pipe = pipeline('text-generation', model="dbmdz/german-gpt2",
                tokenizer="dbmdz/german-gpt2")

# Generate up to 100 tokens (prompt included) and keep the first candidate
text = pipe("Der Sinn des Lebens ist es", max_length=100)[0]["generated_text"]

print(text)
```

This could output the following beautiful text:

```
Der Sinn des Lebens ist es, im Geist zu verweilen, aber nicht in der Welt zu sein, sondern ganz im Geist zu leben.
Die Menschen beginnen, sich nicht nach der Natur und nach der Welt zu richten, sondern nach der Seele,
```

# License

All models are licensed under [MIT](LICENSE).

# Hugging Face model hub

All models are available on the [Hugging Face model hub](https://huggingface.co/dbmdz).

# Contact (Bugs, Feedback, Contribution and more)

For questions about our German GPT-2 model, just open an issue
[here](https://github.com/stefan-it/german-gpt/issues/new) 🤗

# Acknowledgments

Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
Thanks for providing access to the TFRC ❤️

Thanks to the generous support from the [Hugging Face](https://huggingface.co/) team,
it is possible to download both cased and uncased models from their S3 storage 🤗
38
config.json
Normal file
@@ -0,0 +1,38 @@
{
  "_name_or_path": "./",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "resid_pdrop": 0.0,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "torch_dtype": "float32",
  "transformers_version": "4.10.0.dev0",
  "use_cache": true,
  "vocab_size": 50265
}
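The sizes in config.json are enough to estimate the parameter count. The following is a rough sketch, assuming the standard GPT-2 layout (fused query/key/value projection, a 4x MLP when `n_inner` is null, and an output layer tied to the token embeddings). The helper `linear` is hypothetical.

```python
# Values copied from config.json above.
n_embd, n_layer, n_positions, vocab_size = 768, 12, 1024, 50265
n_inner = 4 * n_embd  # assumption: "n_inner": null falls back to 4 * n_embd

def linear(n_in, n_out):
    """Weights plus bias of one dense projection."""
    return n_in * n_out + n_out

layer_norm = 2 * n_embd  # scale and shift vectors
block = (
    2 * layer_norm                 # two layer norms per block
    + linear(n_embd, 3 * n_embd)   # fused query/key/value projection
    + linear(n_embd, n_embd)       # attention output projection
    + linear(n_embd, n_inner)      # MLP up-projection
    + linear(n_inner, n_embd)      # MLP down-projection
)
embeddings = vocab_size * n_embd + n_positions * n_embd

# Final layer norm is added once; the tied LM head adds no parameters.
total = embeddings + n_layer * block + layer_norm

print(total)      # parameter count
print(total * 4)  # bytes at float32
```

This gives 124,445,952 parameters, or 497,783,808 bytes at float32, which is within a few kilobytes of the `size 497788696` recorded in the flax_model.msgpack pointer in this commit.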
1
configuration.json
Normal file
@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}
3
flax_model.msgpack
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
size 497788696
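The three lines above are a Git LFS pointer, not the weights themselves: each line is a key followed by a space and a value. A minimal sketch of reading such a pointer, assuming only the `version`, `oid` and `size` keys shown here (the function name `parse_lfs_pointer` is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }

# Pointer content copied from flax_model.msgpack above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
size 497788696
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The same function applies to the other pointer files in this commit, since they share the same three-line layout.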
3
model.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:161248341459a25ea9e358f31bbc767339ecc59a859424ceb0061e2402d9f7f8
size 510384174
8
onnx/added_tokens.json
Normal file
@@ -0,0 +1,8 @@
{
  "</s>": 2,
  "<mask>": 4,
  "<pad>": 1,
  "<s>": 0,
  "<unk>": 3,
  "<|endoftext|>": 50265
}
39
onnx/config.json
Normal file
@@ -0,0 +1,39 @@
{
  "_name_or_path": "dbmdz/german-gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.0,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 50265
}
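onnx/added_tokens.json assigns `<|endoftext|>` the id 50265, while onnx/config.json declares `vocab_size` 50265 (valid ids 0 to 50264) and uses 50256 for `bos_token_id` and `eos_token_id`. The sketch below only reports ids that fall outside the declared vocabulary; whether this matters depends on how the ONNX export is consumed, which this commit does not show.

```python
# Values copied from onnx/config.json and onnx/added_tokens.json above.
vocab_size = 50265
added_tokens = {
    "</s>": 2,
    "<mask>": 4,
    "<pad>": 1,
    "<s>": 0,
    "<unk>": 3,
    "<|endoftext|>": 50265,
}

# An id is usable as an embedding index only if 0 <= id < vocab_size.
out_of_range = {
    token: idx for token, idx in added_tokens.items()
    if not 0 <= idx < vocab_size
}

print(out_of_range)
```

A check like this can be run before inference to catch token ids that the embedding table cannot index.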
3
onnx/decoder_model.onnx
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bcc909a0656c80266871411ce1f0750d683e6bc9aa822ae47ffcb014cf704dc8
size 653714566
3
onnx/decoder_model_merged.onnx
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:23181b6bf46b7a826bd35b4b6a9e6c051b349b621046518cca2dbe8124e36144
size 655237634
3
onnx/decoder_with_past_model.onnx
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7b62383abe397f9084d1b1a4e294101076a94f9ec7f153d6e9f69c18843c75da
size 653721372
6
onnx/generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.34.0"
}
50005
onnx/merges.txt
Normal file
File diff suppressed because it is too large
5
onnx/special_tokens_map.json
Normal file
@@ -0,0 +1,5 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
100362
onnx/tokenizer.json
Normal file
File diff suppressed because it is too large
60
onnx/tokenizer_config.json
Normal file
@@ -0,0 +1,60 @@
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "<mask>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "50265": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
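The `model_max_length` value above is not a real sequence limit: it equals `int(1e30)`, which Transformers appears to use as a "no limit configured" placeholder. A minimal sketch, assuming the practical limit is then the `n_positions` value from onnx/config.json (the variable name `effective_max_length` is hypothetical):

```python
# Values copied from onnx/tokenizer_config.json and onnx/config.json above.
model_max_length = 1000000000000000019884624838656
n_positions = 1024

# The odd-looking integer is what int(1e30) produces after the float round trip.
is_placeholder = model_max_length == int(1e30)

# Cap inputs at the number of position embeddings the model actually has.
effective_max_length = min(model_max_length, n_positions)

print(is_placeholder, effective_max_length)
```

When tokenizing long inputs for this model, passing an explicit maximum length based on `n_positions` avoids relying on the placeholder.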
1
onnx/vocab.json
Normal file
File diff suppressed because one or more lines are too long
3
pytorch_model.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3831e1a426b431acdb930bbba251d6595b4ef4d2a7c903db0a60f85a92ab070a
size 510428686
3
tf_model.h5
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:848c3169b5f2bcbb966462d0c390802a767c0ce52dc355b7cd5db9d507e64c76
size 497959496
1
tokenizer.json
Normal file
File diff suppressed because one or more lines are too long