Initialize project; model provided by the ModelHub XC community

Model: GroNLP/gpt2-small-italian-embeddings Source: Original Platform
2026-06-07 04:44:17 +08:00
commit cff7bfa62a
11 changed files with 29879 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,10 @@
 *.bin.* filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tar.gz filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 model.safetensors filter=lfs diff=lfs merge=lfs -text
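
As a rough check of which files in this commit the patterns above route through Git LFS, the sketch below matches file names against them with Python's `fnmatch`. This is an approximation: real gitattributes matching follows gitignore-style rules, which `fnmatch` only mirrors for simple basename patterns like these. The file list is taken from the diff headers in this commit, and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite",
    "*.tar.gz", "*.ot", "*.onnx", "*.msgpack", "model.safetensors",
]

def is_lfs_tracked(filename: str) -> bool:
    """Return True if the basename matches any LFS pattern (approximation)."""
    return any(fnmatchcase(filename, pattern) for pattern in LFS_PATTERNS)

# The 11 files touched by this commit, as listed in the diff headers.
COMMIT_FILES = [
    ".gitattributes", "README.md", "config.json", "flax_model.msgpack",
    "merges.txt", "model.safetensors", "pytorch_model.bin",
    "special_tokens_map.json", "tf_model.h5", "tokenizer_config.json",
    "vocab.json",
]

lfs_files = [name for name in COMMIT_FILES if is_lfs_tracked(name)]
print(lfs_files)
```

The four files selected this way are the same four that appear further down as three-line LFS pointers instead of full content.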
--- a/README.md
+++ b/README.md
@@ -0,0 +1,63 @@
 ---
 language: it
 tags:
 - adaption
 - recycled
 - gpt2-small
 pipeline_tag: text-generation
 ---
 # GPT-2 recycled for Italian (small, adapted lexical embeddings)
 [Wietse de Vries](https://www.semanticscholar.org/author/Wietse-de-Vries/144611157) •
 [Malvina Nissim](https://www.semanticscholar.org/author/M.-Nissim/2742475)
 ## Model description
 This model is based on the small OpenAI GPT-2 ([`gpt2`](https://huggingface.co/gpt2)) model.
 The Transformer layer weights in this model are identical to those of the original English model, but the lexical layer has been retrained for an Italian vocabulary.
 For details, check out our paper on [arXiv](https://arxiv.org/abs/2012.05628) and the code on [GitHub](https://github.com/wietsedv/gpt2-recycle).
 ## Related models
 ### Dutch
 - [`gpt2-small-dutch-embeddings`](https://huggingface.co/GroNLP/gpt2-small-dutch-embeddings): Small model size with only retrained lexical embeddings.
 - [`gpt2-small-dutch`](https://huggingface.co/GroNLP/gpt2-small-dutch):  Small model size with retrained lexical embeddings and additional fine-tuning of the full model. (**Recommended**)
 - [`gpt2-medium-dutch-embeddings`](https://huggingface.co/GroNLP/gpt2-medium-dutch-embeddings): Medium model size with only retrained lexical embeddings.
 ### Italian
 - [`gpt2-small-italian-embeddings`](https://huggingface.co/GroNLP/gpt2-small-italian-embeddings): Small model size with only retrained lexical embeddings.
 - [`gpt2-small-italian`](https://huggingface.co/GroNLP/gpt2-small-italian):  Small model size with retrained lexical embeddings and additional fine-tuning of the full model. (**Recommended**)
 - [`gpt2-medium-italian-embeddings`](https://huggingface.co/GroNLP/gpt2-medium-italian-embeddings): Medium model size with only retrained lexical embeddings.
 ## How to use
 ```python
 from transformers import pipeline
 pipe = pipeline("text-generation", model="GroNLP/gpt2-small-italian-embeddings")
 ```
 ```python
 from transformers import AutoTokenizer, AutoModel, TFAutoModel
 tokenizer = AutoTokenizer.from_pretrained("GroNLP/gpt2-small-italian-embeddings")
 model = AutoModel.from_pretrained("GroNLP/gpt2-small-italian-embeddings")  # PyTorch
 model = TFAutoModel.from_pretrained("GroNLP/gpt2-small-italian-embeddings")  # TensorFlow (alternative to the PyTorch line above)
 ```
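
The second snippet loads the bare Transformer with `AutoModel`, which returns hidden states rather than text. For generation outside the `pipeline` helper, a minimal sketch using `AutoModelForCausalLM` might look like the following. The function name `generate_italian` and the sampling values are illustrative assumptions, not settings recommended by the authors.

```python
def generate_italian(prompt: str, max_new_tokens: int = 30) -> str:
    """Sample a continuation for `prompt` (illustrative helper, not from the README)."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "GroNLP/gpt2-small-italian-embeddings"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)  # PyTorch model with LM head

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        do_sample=True,                       # assumed sampling settings
        top_k=20,
        top_p=0.9,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )
    # Decode the first (and only) returned sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_italian("Oggi è una bella giornata e")` downloads the weights on first use and returns the prompt followed by a sampled continuation, so the output differs between runs.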
 ## BibTeX entry
 ```bibtex
@misc{devries2020good,
      title={As good as new. How to successfully recycle English GPT-2 to make models for other languages}, 
      author={Wietse de Vries and Malvina Nissim},
      year={2020},
      eprint={2012.05628},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
 }
 ```
--- a/config.json
+++ b/config.json
@@ -0,0 +1,42 @@
 {
  "_name_or_path": "data/hf/gpt2-small-italian-embeddings",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 0,
  "embd_pdrop": 0.1,
  "eos_token_id": 0,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "pad_token_id": 0,
  "resid_pdrop": 0.1,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 100,
      "no_repeat_ngram_size": 4,
      "num_beams": 10,
      "repetition_penalty": 10.0,
      "temperature": 2.0,
      "top_k": 20,
      "top_p": 0.9
    }
  },
  "use_cache": true,
  "vocab_size": 30001
 }
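
The `task_specific_params` block stores default generation settings for the `text-generation` task. A small sketch of reading them out as keyword arguments is shown below; it works on an inline excerpt of the config above rather than the real file, and the helper name `generation_kwargs` is hypothetical.

```python
import json

# Excerpt of config.json from the hunk above (only the relevant keys).
CONFIG_EXCERPT = """
{
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 100,
      "no_repeat_ngram_size": 4,
      "num_beams": 10,
      "repetition_penalty": 10.0,
      "temperature": 2.0,
      "top_k": 20,
      "top_p": 0.9
    }
  }
}
"""

def generation_kwargs(config: dict, task: str = "text-generation") -> dict:
    """Return a copy of the stored defaults for `task`, or an empty dict if none exist."""
    params = config.get("task_specific_params") or {}
    return dict(params.get(task, {}))

kwargs = generation_kwargs(json.loads(CONFIG_EXCERPT))
print(kwargs)
```

A dictionary like this could be passed as `**kwargs` to a generation call. Whether a given `transformers` version applies these defaults automatically is not something this commit shows.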
--- a/flax_model.msgpack
+++ b/flax_model.msgpack
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:c69194a953af5d91a1cb9b4817150e3e2835a167bf5575369ed61fdec520b90e
 size 435537688
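
The size recorded in this pointer can be sanity-checked against the architecture in `config.json`. The sketch below counts parameters for the standard GPT-2 layout, assuming float32 storage, an MLP width of `4 * n_embd` (the usual meaning of `"n_inner": null`), and an output layer tied to the token embeddings. These assumptions come from the common GPT-2 design, not from this commit.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Count parameters of a standard GPT-2 with tied input/output embeddings."""
    n_inner = 4 * n_embd                                   # assumed default MLP width
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Fused QKV projection plus output projection, each with a bias.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two-layer feed-forward network, each layer with a bias.
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    layer_norms = 2 * (2 * n_embd)                         # two LayerNorms per block
    final_norm = 2 * n_embd
    return embeddings + n_layer * (attention + mlp + layer_norms) + final_norm

# Values from config.json in this commit.
params = gpt2_param_count(vocab_size=30001, n_positions=1024, n_embd=768, n_layer=12)
approx_bytes = params * 4                                  # float32 assumption
print(params, approx_bytes)
```

Under these assumptions the estimate is 108,883,200 parameters, or 435,532,800 bytes, which is within a few kilobytes of the 435537688 bytes recorded for `flax_model.msgpack`.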
--- a/merges.txt
+++ b/merges.txt
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:205ed396f12b24f95d97eb71d802666ee84bdcb47073fc02e59af4e0a1a93deb
 size 448133166
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:6f4f0566cb237a50d9a704ae1ddbe75390831d2afdcbb2745bcc41f3f933d5aa
 size 448177399
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1 @@
 {"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}
--- a/tf_model.h5
+++ b/tf_model.h5
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:f44249fa6809cd92522d0bea75373939f7c06bce5d4d606caa3a2e6374df7382
 size 435706328
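
Each LFS-tracked file appears in the diff as a three-line pointer (`version`, `oid`, `size`). A minimal sketch of parsing such a pointer and checking a downloaded file against it follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check assumes the `sha256:` oid form shown above.

```python
import hashlib
import os

def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256" or os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large weight files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]

# Pointer for tf_model.h5, copied from the hunk above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f44249fa6809cd92522d0bea75373939f7c06bce5d4d606caa3a2e6374df7382
size 435706328
"""
pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With the weights downloaded locally, `matches_pointer("tf_model.h5", pointer)` would report whether the file is the one this commit refers to.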
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,6 @@
 {
    "unk_token": "<|endoftext|>",
    "bos_token": "<|endoftext|>",
    "eos_token": "<|endoftext|>",
    "add_prefix_space": false
 }
--- a/vocab.json
+++ b/vocab.json
@@ -0,0 +1 @@
 {"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}