初始化项目,由ModelHub XC社区提供模型

Model: GroNLP/gpt2-medium-italian-embeddings
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-04 18:09:16 +08:00
commit fa846617b1
11 changed files with 29881 additions and 0 deletions

10
.gitattributes vendored Normal file
View File

@@ -0,0 +1,10 @@
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
model.safetensors filter=lfs diff=lfs merge=lfs -text

63
README.md Normal file
View File

@@ -0,0 +1,63 @@
---
language: it
tags:
- adaption
- recycled
- gpt2-medium
pipeline_tag: text-generation
---
# GPT-2 recycled for Italian (medium, adapted lexical embeddings)
[Wietse de Vries](https://www.semanticscholar.org/author/Wietse-de-Vries/144611157) •
[Malvina Nissim](https://www.semanticscholar.org/author/M.-Nissim/2742475)
## Model description
This model is based on the medium OpenAI GPT-2 ([`gpt2-medium`](https://huggingface.co/gpt2-medium)) model.
The Transformer layer weights in this model are identical to the original English, model but the lexical layer has been retrained for an Italian vocabulary.
For details, check out our paper on [arXiv](https://arxiv.org/abs/2012.05628) and the code on [Github](https://github.com/wietsedv/gpt2-recycle).
## Related models
### Dutch
- [`gpt2-small-dutch-embeddings`](https://huggingface.co/GroNLP/gpt2-small-dutch-embeddings): Small model size with only retrained lexical embeddings.
- [`gpt2-small-dutch`](https://huggingface.co/GroNLP/gpt2-small-dutch): Small model size with retrained lexical embeddings and additional fine-tuning of the full model. (**Recommended**)
- [`gpt2-medium-dutch-embeddings`](https://huggingface.co/GroNLP/gpt2-medium-dutch-embeddings): Medium model size with only retrained lexical embeddings.
### Italian
- [`gpt2-small-italian-embeddings`](https://huggingface.co/GroNLP/gpt2-small-italian-embeddings): Small model size with only retrained lexical embeddings.
- [`gpt2-small-italian`](https://huggingface.co/GroNLP/gpt2-small-italian): Small model size with retrained lexical embeddings and additional fine-tuning of the full model. (**Recommended**)
- [`gpt2-medium-italian-embeddings`](https://huggingface.co/GroNLP/gpt2-medium-italian-embeddings): Medium model size with only retrained lexical embeddings.
## How to use
```python
from transformers import pipeline
pipe = pipeline("text-generation", model="GroNLP/gpt2-medium-italian-embeddings")
```
```python
from transformers import AutoTokenizer, AutoModel, TFAutoModel
tokenizer = AutoTokenizer.from_pretrained("GroNLP/gpt2-medium-italian-embeddings")
model = AutoModel.from_pretrained("GroNLP/gpt2-medium-italian-embeddings") # PyTorch
model = TFAutoModel.from_pretrained("GroNLP/gpt2-medium-italian-embeddings") # Tensorflow
```
## BibTeX entry
```bibtex
@misc{devries2020good,
title={As good as new. How to successfully recycle English GPT-2 to make models for other languages},
author={Wietse de Vries and Malvina Nissim},
year={2020},
eprint={2012.05628},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```

44
config.json Normal file
View File

@@ -0,0 +1,44 @@
{
"_name_or_path": "data/hf/gpt2-medium-italian-embeddings",
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 0,
"embd_pdrop": 0.1,
"eos_token_id": 0,
"gradient_checkpointing": false,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 1024,
"n_head": 16,
"n_inner": null,
"n_layer": 24,
"n_positions": 1024,
"n_special": 0,
"pad_token_id": 0,
"predict_special_tokens": true,
"resid_pdrop": 0.1,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 100,
"no_repeat_ngram_size": 4,
"num_beams": 10,
"repetition_penalty": 10.0,
"temperature": 2.0,
"top_k": 20,
"top_p": 0.9
}
},
"use_cache": true,
"vocab_size": 30001
}

3
flax_model.msgpack Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:988847eabc259a96754577f81714c816cf334823a035b90795e173c8c5f5dde7
size 1336333726

29744
merges.txt Normal file

File diff suppressed because it is too large Load Diff

3
model.safetensors Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a0ef6acfe1e01819e6f273dfca7e3c7095843efe9564eaf4757c2fe04eef1bc
size 1361525176

3
pytorch_model.bin Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b34b641ae618294da7c10341635f7494fe5b3e96792fe07cdec0e8d8feee9eed
size 1361612080

1
special_tokens_map.json Normal file
View File

@@ -0,0 +1 @@
{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}

3
tf_model.h5 Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c51b2934c256a59856bcf5ca8c25d88739d30ba095173e7b2782883365269f8d
size 1336658064

6
tokenizer_config.json Normal file
View File

@@ -0,0 +1,6 @@
{
"unk_token": "<|endoftext|>",
"bos_token": "<|endoftext|>",
"eos_token": "<|endoftext|>",
"add_prefix_space": false
}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long