Initialize project; model provided by the ModelHub XC community

Model: GroNLP/gpt2-medium-italian-embeddings Source: Original Platform
2026-06-04 18:09:16 +08:00
commit fa846617b1
11 changed files with 29881 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,10 @@
 *.bin.* filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tar.gz filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 model.safetensors filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,63 @@
 ---
 language: it
 tags:
 - adaption
 - recycled
 - gpt2-medium
 pipeline_tag: text-generation
 ---
 # GPT-2 recycled for Italian (medium, adapted lexical embeddings)
 [Wietse de Vries](https://www.semanticscholar.org/author/Wietse-de-Vries/144611157) •
 [Malvina Nissim](https://www.semanticscholar.org/author/M.-Nissim/2742475)
 ## Model description
 This model is based on the medium OpenAI GPT-2 ([`gpt2-medium`](https://huggingface.co/gpt2-medium)) model.
 The Transformer layer weights in this model are identical to those of the original English model, but the lexical layer has been retrained for an Italian vocabulary.
 For details, check out our paper on [arXiv](https://arxiv.org/abs/2012.05628) and the code on [GitHub](https://github.com/wietsedv/gpt2-recycle).
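To give a rough sense of how much of the model the retrained lexical layer represents, the sketch below estimates parameter counts from the values in this repository's `config.json`. It assumes the standard GPT-2 layout (tied input/output token embeddings, an MLP width of `4 * n_embd` when `n_inner` is null) and assumes that "lexical layer" refers to the token embedding matrix only. The function name `gpt2_param_counts` and the dictionary keys are hypothetical, chosen for illustration.

```python
# Values copied from config.json; everything else here is an illustrative assumption.
config = {"n_embd": 1024, "n_layer": 24, "n_positions": 1024, "vocab_size": 30001}


def gpt2_param_counts(cfg):
    """Estimate parameter counts for a standard GPT-2 architecture."""
    d = cfg["n_embd"]
    token_embeddings = cfg["vocab_size"] * d   # lexical layer, assumed tied with the output layer
    position_embeddings = cfg["n_positions"] * d
    per_block = (
        2 * d                  # first layer norm (weight + bias)
        + d * 3 * d + 3 * d    # fused query/key/value projection
        + d * d + d            # attention output projection
        + 2 * d                # second layer norm
        + d * 4 * d + 4 * d    # MLP up-projection (assumes n_inner = 4 * n_embd)
        + 4 * d * d + d        # MLP down-projection
    )
    blocks = cfg["n_layer"] * per_block
    final_layer_norm = 2 * d
    total = token_embeddings + position_embeddings + blocks + final_layer_norm
    return {
        "token_embeddings": token_embeddings,
        "position_embeddings": position_embeddings,
        "transformer_blocks": blocks,
        "total": total,
    }


counts = gpt2_param_counts(config)
share = counts["token_embeddings"] / counts["total"]
print(counts)
print(f"token embeddings: {share:.1%} of all parameters")
```

Under these assumptions the token embeddings account for roughly 9% of the parameters, with the remainder shared with the English `gpt2-medium` weights.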
 ## Related models
 ### Dutch
 - [`gpt2-small-dutch-embeddings`](https://huggingface.co/GroNLP/gpt2-small-dutch-embeddings): Small model size with only retrained lexical embeddings.
 - [`gpt2-small-dutch`](https://huggingface.co/GroNLP/gpt2-small-dutch):  Small model size with retrained lexical embeddings and additional fine-tuning of the full model. (**Recommended**)
 - [`gpt2-medium-dutch-embeddings`](https://huggingface.co/GroNLP/gpt2-medium-dutch-embeddings): Medium model size with only retrained lexical embeddings.
 ### Italian
 - [`gpt2-small-italian-embeddings`](https://huggingface.co/GroNLP/gpt2-small-italian-embeddings): Small model size with only retrained lexical embeddings.
 - [`gpt2-small-italian`](https://huggingface.co/GroNLP/gpt2-small-italian):  Small model size with retrained lexical embeddings and additional fine-tuning of the full model. (**Recommended**)
 - [`gpt2-medium-italian-embeddings`](https://huggingface.co/GroNLP/gpt2-medium-italian-embeddings): Medium model size with only retrained lexical embeddings.
 ## How to use
 ```python
 from transformers import pipeline
 pipe = pipeline("text-generation", model="GroNLP/gpt2-medium-italian-embeddings")
 ```
 ```python
 from transformers import AutoTokenizer, AutoModel, TFAutoModel
 tokenizer = AutoTokenizer.from_pretrained("GroNLP/gpt2-medium-italian-embeddings")
 # Load the weights with whichever framework you use; only one of the two is needed.
 model = AutoModel.from_pretrained("GroNLP/gpt2-medium-italian-embeddings")  # PyTorch
 tf_model = TFAutoModel.from_pretrained("GroNLP/gpt2-medium-italian-embeddings")  # TensorFlow
 ```
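For explicit control over generation, a minimal sketch along the following lines may help. It loads the model with a language-modeling head (`config.json` lists `GPT2LMHeadModel` as the architecture) and reuses the defaults stored under `task_specific_params`. The helper names `generation_kwargs` and `generate_italian` are hypothetical, and the sketch assumes PyTorch and `transformers` are installed when `generate_italian` is called.

```python
def generation_kwargs(config, task="text-generation"):
    """Return a copy of the generation defaults stored for `task`, or {} if none."""
    params = config.get("task_specific_params") or {}
    return dict(params.get(task, {}))


def generate_italian(prompt, model_id="GroNLP/gpt2-medium-italian-embeddings", **overrides):
    """Generate a continuation for `prompt`; downloads the model on first use."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Start from the defaults in config.json, then apply caller overrides.
    kwargs = generation_kwargs(model.config.to_dict())
    kwargs.update(overrides)
    kwargs.setdefault("pad_token_id", model.config.pad_token_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_italian("Il gatto", max_length=50)` would override the stored `max_length` while keeping the other defaults. The stored defaults combine sampling with beam search and a high repetition penalty, so passing your own values may be worthwhile.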
 ## BibTeX entry
 ```bibtex
@misc{devries2020good,
      title={As good as new. How to successfully recycle English GPT-2 to make models for other languages}, 
      author={Wietse de Vries and Malvina Nissim},
      year={2020},
      eprint={2012.05628},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
 }
 ```
--- a/config.json
+++ b/config.json
@@ -0,0 +1,44 @@
 {
  "_name_or_path": "data/hf/gpt2-medium-italian-embeddings",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 0,
  "embd_pdrop": 0.1,
  "eos_token_id": 0,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 24,
  "n_positions": 1024,
  "n_special": 0,
  "pad_token_id": 0,
  "predict_special_tokens": true,
  "resid_pdrop": 0.1,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 100,
      "no_repeat_ngram_size": 4,
      "num_beams": 10,
      "repetition_penalty": 10.0,
      "temperature": 2.0,
      "top_k": 20,
      "top_p": 0.9
    }
  },
  "use_cache": true,
  "vocab_size": 30001
 }
--- a/flax_model.msgpack
+++ b/flax_model.msgpack
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:988847eabc259a96754577f81714c816cf334823a035b90795e173c8c5f5dde7
 size 1336333726
--- a/merges.txt
+++ b/merges.txt
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:8a0ef6acfe1e01819e6f273dfca7e3c7095843efe9564eaf4757c2fe04eef1bc
 size 1361525176
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:b34b641ae618294da7c10341635f7494fe5b3e96792fe07cdec0e8d8feee9eed
 size 1361612080
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1 @@
 {"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}
--- a/tf_model.h5
+++ b/tf_model.h5
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:c51b2934c256a59856bcf5ca8c25d88739d30ba095173e7b2782883365269f8d
 size 1336658064
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,6 @@
 {
    "unk_token": "<|endoftext|>",
    "bos_token": "<|endoftext|>",
    "eos_token": "<|endoftext|>",
    "add_prefix_space": false
 }
--- a/vocab.json
+++ b/vocab.json
@@ -0,0 +1 @@
 {"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}