Initialize project; model provided by the ModelHub XC community
Model: dbmdz/german-gpt2 Source: Original Platform
10
.gitattributes
vendored
Normal file
@@ -0,0 +1,10 @@
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
model.safetensors filter=lfs diff=lfs merge=lfs -text
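To see which files of this commit the `.gitattributes` rules route through Git LFS, the patterns can be approximated with shell-style matching. This is only a rough sketch: it assumes that slash-free patterns are compared against a file's basename, it does not reproduce Git's full attribute matching, and `is_lfs_tracked` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Patterns copied from the .gitattributes hunk in this commit.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite",
    "*.tar.gz", "*.ot", "*.onnx", "*.msgpack", "model.safetensors",
]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: compare each slash-free pattern with the basename."""
    name = basename(path)
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


# A few of the paths that appear in this commit.
files = [
    "config.json",
    "pytorch_model.bin",
    "tf_model.h5",
    "flax_model.msgpack",
    "model.safetensors",
    "onnx/decoder_model.onnx",
    "onnx/merges.txt",
]
tracked = [path for path in files if is_lfs_tracked(path)]
print(tracked)
```

The files reported as tracked are the ones whose diffs in this commit consist of a three-line LFS pointer rather than the file content.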
90
README.md
Normal file
@@ -0,0 +1,90 @@
---
language: de

widget:
- text: "Heute ist sehr schönes Wetter in"

license: mit
---

# German GPT-2 model

In this repository we release (yet another) GPT-2 model that was trained on a variety of German texts.

The model is meant to be an entry point for fine-tuning on other texts, and it is definitely not as good or "dangerous" as the English GPT-3 model. We do not plan extensive PR or staged releases for this model 😉

**Note**: The model was initially released under an anonymous alias (`anonymous-german-nlp/german-gpt2`), so we are now "de-anonymizing" it.

More details about GPT-2 can be found in the great [Hugging Face](https://huggingface.co/transformers/model_doc/gpt2.html) documentation.

# Changelog

16.08.2021: Public release of a re-trained version of our German GPT-2 model with better results.

15.11.2020: Initial release. Please use the tag `v1.0` for [this older version](https://huggingface.co/dbmdz/german-gpt2/tree/v1.0).

# Training corpora

We use pretty much the same corpora as were used for training the DBMDZ BERT model, which can be found in [this repository](https://github.com/dbmdz/berts).

Thanks to the awesome Hugging Face team, it is possible to create byte-level BPE vocabularies with their [Tokenizers](https://github.com/huggingface/tokenizers) library.

With the Tokenizers library we created a 50K byte-level BPE vocab based on the training corpora.

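To give a feeling for what "byte-level BPE" means, here is a toy sketch in plain Python. It is not the Tokenizers library and not the procedure used for this model: it skips the pre-tokenization step real implementations apply, runs on two short sentences instead of a corpus, and `learn_byte_level_bpe` is a hypothetical name chosen for illustration.

```python
from collections import Counter


def learn_byte_level_bpe(texts, num_merges):
    """Learn up to `num_merges` merge rules over UTF-8 bytes (toy version)."""
    # Every text starts out as a sequence of single-byte tokens.
    sequences = [[bytes([b]) for b in text.encode("utf-8")] for text in texts]
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs across all sequences.
        pair_counts = Counter()
        for seq in sequences:
            pair_counts.update(zip(seq, seq[1:]))
        if not pair_counts:
            break
        best_pair, best_count = pair_counts.most_common(1)[0]
        if best_count < 2:
            break  # no pair repeats, so further merges would not help
        merges.append(best_pair)
        merged_token = best_pair[0] + best_pair[1]
        # Replace every occurrence of the best pair with the merged token.
        new_sequences = []
        for seq in sequences:
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == best_pair:
                    out.append(merged_token)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            new_sequences.append(out)
        sequences = new_sequences
    # Base vocabulary: all 256 byte values, followed by one entry per merge.
    vocab = [bytes([b]) for b in range(256)] + [a + b for a, b in merges]
    return merges, vocab, sequences


corpus = ["Heute ist sehr schönes Wetter in", "Der Sinn des Lebens ist es"]
merges, vocab, encoded = learn_byte_level_bpe(corpus, num_merges=10)
print(len(vocab), "vocabulary entries,", len(merges), "merges learned")
print(encoded[0])
```

Because the base vocabulary covers every possible byte, any input (including umlauts such as "ö", which spans two bytes in UTF-8) can be encoded without an unknown token. A 50K vocabulary corresponds to running far more merge steps over a much larger corpus.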
After creating the vocab, we trained the German GPT-2 model on a v3-8 TPU over the complete training corpus for 20 epochs. All hyperparameters can be found in the official JAX/FLAX documentation [here](https://github.com/huggingface/transformers/blob/master/examples/flax/language-modeling/README.md) from Transformers.

# Using the model

The model itself can be used in this way:

```python
# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead for GPT-2 style models.
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the language model from the model hub.
tokenizer = AutoTokenizer.from_pretrained("dbmdz/german-gpt2")

model = AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2")
```

However, text generation is a bit more interesting, so here's an example that shows how to use the great Transformers *Pipelines* for generating text:

```python
from transformers import pipeline

pipe = pipeline('text-generation', model="dbmdz/german-gpt2",
                tokenizer="dbmdz/german-gpt2")

text = pipe("Der Sinn des Lebens ist es", max_length=100)[0]["generated_text"]

print(text)
```

This could output this beautiful text:

```
Der Sinn des Lebens ist es, im Geist zu verweilen, aber nicht in der Welt zu sein, sondern ganz im Geist zu leben.
Die Menschen beginnen, sich nicht nach der Natur und nach der Welt zu richten, sondern nach der Seele,'
```

# License

All models are licensed under [MIT](LICENSE).

# Hugging Face model hub

All models are available on the [Hugging Face model hub](https://huggingface.co/dbmdz).

# Contact (Bugs, Feedback, Contribution and more)

For questions about our GPT-2 model just open an issue
[here](https://github.com/stefan-it/german-gpt/issues/new) 🤗

# Acknowledgments

Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
Thanks for providing access to the TFRC ❤️

Thanks to the generous support from the [Hugging Face](https://huggingface.co/) team,
it is possible to download both cased and uncased models from their S3 storage 🤗
38
config.json
Normal file
@@ -0,0 +1,38 @@
{
  "_name_or_path": "./",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "resid_pdrop": 0.0,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "torch_dtype": "float32",
  "transformers_version": "4.10.0.dev0",
  "use_cache": true,
  "vocab_size": 50265
}
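The values in this `config.json` are enough to estimate the model's size. The sketch below counts parameters under the standard GPT-2 layout, with two stated assumptions: the output head shares its weights with the token embedding, and `"n_inner": null` means an MLP width of `4 * n_embd`. The function name `count_gpt2_parameters` is hypothetical.

```python
# Values copied from the config.json hunk in this commit.
config = {"vocab_size": 50265, "n_positions": 1024, "n_embd": 768, "n_layer": 12}


def count_gpt2_parameters(cfg):
    """Count weights and biases of a GPT-2 style decoder (tied output head assumed)."""
    d = cfg["n_embd"]
    inner = 4 * d  # assumption: a null n_inner falls back to 4 * n_embd
    # Token embedding plus learned position embedding.
    embeddings = cfg["vocab_size"] * d + cfg["n_positions"] * d
    # Fused query/key/value projection and the attention output projection.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two-layer feed-forward block.
    mlp = (d * inner + inner) + (inner * d + d)
    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * (2 * d)
    per_layer = attention + mlp + layer_norms
    final_layer_norm = 2 * d
    return embeddings + cfg["n_layer"] * per_layer + final_layer_norm


total = count_gpt2_parameters(config)
print(total)      # 124445952 parameters
print(total * 4)  # 497783808 bytes when stored as float32
```

At four bytes per `float32` weight this comes to 497,783,808 bytes, within a few kilobytes of the 497,788,696 bytes recorded for `flax_model.msgpack` in this commit.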
1
configuration.json
Normal file
@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}
3
flax_model.msgpack
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
size 497788696
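The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, the SHA-256 object ID, and the byte size. As a small sketch, such a pointer can be read into a dictionary like this (`parse_lfs_pointer` is a hypothetical helper, and it only handles the three keys shown in this commit):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the flax_model.msgpack hunk.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e436c321647a102f787ae270289a9afe701c7149ccc95fc6569e15be76f9e30a
size 497788696
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algorithm"], pointer["size"])
```

The same parser applies to the other pointer files in this commit, since they all follow the same three-line layout.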
3
model.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:161248341459a25ea9e358f31bbc767339ecc59a859424ceb0061e2402d9f7f8
size 510384174
8
onnx/added_tokens.json
Normal file
@@ -0,0 +1,8 @@
{
  "</s>": 2,
  "<mask>": 4,
  "<pad>": 1,
  "<s>": 0,
  "<unk>": 3,
  "<|endoftext|>": 50265
}
39
onnx/config.json
Normal file
@@ -0,0 +1,39 @@
{
  "_name_or_path": "dbmdz/german-gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.0,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 50265
}
3
onnx/decoder_model.onnx
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bcc909a0656c80266871411ce1f0750d683e6bc9aa822ae47ffcb014cf704dc8
size 653714566
3
onnx/decoder_model_merged.onnx
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:23181b6bf46b7a826bd35b4b6a9e6c051b349b621046518cca2dbe8124e36144
size 655237634
3
onnx/decoder_with_past_model.onnx
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7b62383abe397f9084d1b1a4e294101076a94f9ec7f153d6e9f69c18843c75da
size 653721372
6
onnx/generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.34.0"
}
50005
onnx/merges.txt
Normal file
File diff suppressed because it is too large
5
onnx/special_tokens_map.json
Normal file
@@ -0,0 +1,5 @@
{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<|endoftext|>"
}
100362
onnx/tokenizer.json
Normal file
File diff suppressed because it is too large
60
onnx/tokenizer_config.json
Normal file
@@ -0,0 +1,60 @@
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "<mask>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "50265": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<|endoftext|>",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|endoftext|>",
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
1
onnx/vocab.json
Normal file
File diff suppressed because one or more lines are too long
3
pytorch_model.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3831e1a426b431acdb930bbba251d6595b4ef4d2a7c903db0a60f85a92ab070a
size 510428686
3
tf_model.h5
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:848c3169b5f2bcbb966462d0c390802a767c0ce52dc355b7cd5db9d507e64c76
size 497959496
1
tokenizer.json
Normal file
File diff suppressed because one or more lines are too long