初始化项目,由ModelHub XC社区提供模型

Model: fpadovani/eng_100mb_baseline_seed3407
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-30 03:10:20 +08:00
commit d1cda2f931
64 changed files with 1440439 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

58
README.md Normal file
View File

@@ -0,0 +1,58 @@
---
base_model: goldfish-models/eng_latn_100mb
library_name: transformers
model_name: eng_100mb_baseline
tags:
- generated_from_trainer
- trl
- sft
licence: license
---
# Model Card for eng_100mb_baseline
This model is a fine-tuned version of [goldfish-models/eng_latn_100mb](https://huggingface.co/goldfish-models/eng_latn_100mb).
It has been trained using [TRL](https://github.com/huggingface/trl).
## Quick start
```python
from transformers import pipeline
question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="fpadovani/eng_100mb_baseline", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
## Training procedure
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/f-padovani-university-of-groningen/formal_lang/runs/mbxwi6ow)
This model was trained with SFT.
### Framework versions
- TRL: 0.13.0
- Transformers: 4.47.0
- Pytorch: 2.11.0
- Datasets: 4.8.5
- Tokenizers: 0.21.0
## Citations
Cite TRL as:
```bibtex
@misc{vonwerra2022trl,
title = {{TRL: Transformer Reinforcement Learning}},
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
year = 2020,
journal = {GitHub repository},
publisher = {GitHub},
howpublished = {\url{https://github.com/huggingface/trl}}
}
```

View File

@@ -0,0 +1,35 @@
{
"_name_or_path": "goldfish-models/eng_latn_100mb",
"activation_function": "gelu",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50000,
"embd_pdrop": 0.1,
"eos_token_id": 50001,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 2048,
"n_embd": 768,
"n_head": 12,
"n_inner": 3072,
"n_layer": 12,
"n_positions": 2048,
"pad_token_id": 50002,
"prefix": "[CLS]",
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.47.0",
"use_cache": true,
"vocab_size": 51200
}

View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50000,
"eos_token_id": 50001,
"pad_token_id": 50002,
"transformers_version": "4.47.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a7a3543d288b1a5b4e53d1aed51ee60fbc62dc953d89784bd0b7e1f2500fab7b
size 251915968

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4623cafd769cfd599c58ffa3af0c5f8f4f4af3a2cfaf35af5ad68c15998ac26c
size 503926155

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5670878ca0841cf4c914948376ece4089661c79880b007c41a49182fb1b4daae
size 14645

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:882d5add0327723eaf4c33da6f344dcad0c61a201d5b27fd99d702cf3d96df5e
size 1465

File diff suppressed because it is too large Load Diff

210940
checkpoint-10000/tokenizer.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d158a5b32cad30637a7c568c36ecb1696be17e1bd944853e45e7b88828bc9f6
size 6097

View File

@@ -0,0 +1,35 @@
{
"_name_or_path": "goldfish-models/eng_latn_100mb",
"activation_function": "gelu",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50000,
"embd_pdrop": 0.1,
"eos_token_id": 50001,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 2048,
"n_embd": 768,
"n_head": 12,
"n_inner": 3072,
"n_layer": 12,
"n_positions": 2048,
"pad_token_id": 50002,
"prefix": "[CLS]",
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.47.0",
"use_cache": true,
"vocab_size": 51200
}

View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50000,
"eos_token_id": 50001,
"pad_token_id": 50002,
"transformers_version": "4.47.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e86c63242f0b58bcfc6716b4de36c989489c55bf0fffee946ce953ad0ea7e172
size 251915968

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9bb89e49397860dadb5b8c4dac94d64a4e546dc5124079451d63a5cf40a006f5
size 503926155

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:228bfeec55ed589bbe794b2c14dc283dca51b817eba95c19971c4df857227ef2
size 14645

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:30575d6e7e956feb166516f8fb8635fad06bbeb16576a63a69bc90526e1553af
size 1465

File diff suppressed because it is too large Load Diff

210940
checkpoint-15000/tokenizer.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d158a5b32cad30637a7c568c36ecb1696be17e1bd944853e45e7b88828bc9f6
size 6097

View File

@@ -0,0 +1,35 @@
{
"_name_or_path": "goldfish-models/eng_latn_100mb",
"activation_function": "gelu",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50000,
"embd_pdrop": 0.1,
"eos_token_id": 50001,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 2048,
"n_embd": 768,
"n_head": 12,
"n_inner": 3072,
"n_layer": 12,
"n_positions": 2048,
"pad_token_id": 50002,
"prefix": "[CLS]",
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.47.0",
"use_cache": true,
"vocab_size": 51200
}

View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50000,
"eos_token_id": 50001,
"pad_token_id": 50002,
"transformers_version": "4.47.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e85c41b4232f991822a871c7cab9ea8f3b3d21d4b27f52ebca7db7a4f99cd0dd
size 251915968

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3d831383f3982465fc7800c05baeb691f6532855772b78cb5dfcaf2b7c0f1bca
size 503926155

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4daaedc5aca0815419c3cdca2050c361c2d53e62363a733c235fb39522267c4c
size 14645

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6654d4acd90458a188fd9adc6ca8bb91014a57eb893fdbe0b259fd07bf04c820
size 1465

File diff suppressed because it is too large Load Diff

210940
checkpoint-20000/tokenizer.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d158a5b32cad30637a7c568c36ecb1696be17e1bd944853e45e7b88828bc9f6
size 6097

View File

@@ -0,0 +1,35 @@
{
"_name_or_path": "goldfish-models/eng_latn_100mb",
"activation_function": "gelu",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50000,
"embd_pdrop": 0.1,
"eos_token_id": 50001,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 2048,
"n_embd": 768,
"n_head": 12,
"n_inner": 3072,
"n_layer": 12,
"n_positions": 2048,
"pad_token_id": 50002,
"prefix": "[CLS]",
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.47.0",
"use_cache": true,
"vocab_size": 51200
}

View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50000,
"eos_token_id": 50001,
"pad_token_id": 50002,
"transformers_version": "4.47.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f84f174b6ac21fa9a294afc1ff4e87c092813653d818137cb53cc2dc1dbf1e50
size 251915968

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0236f7696939587f24d9d478e5bda075cccb35f029be341ab2ff5fa23f607675
size 503926155

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1fa51f248bf4f1065a93ab339ec9044279fb6479a92e225397503516c1973e5d
size 14645

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6806ae731230be2f6665d97664bb050bb374f7ea046300d8e67c3f7486e91e13
size 1465

File diff suppressed because it is too large Load Diff

210940
checkpoint-22260/tokenizer.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d158a5b32cad30637a7c568c36ecb1696be17e1bd944853e45e7b88828bc9f6
size 6097

View File

@@ -0,0 +1,35 @@
{
"_name_or_path": "goldfish-models/eng_latn_100mb",
"activation_function": "gelu",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50000,
"embd_pdrop": 0.1,
"eos_token_id": 50001,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 2048,
"n_embd": 768,
"n_head": 12,
"n_inner": 3072,
"n_layer": 12,
"n_positions": 2048,
"pad_token_id": 50002,
"prefix": "[CLS]",
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.47.0",
"use_cache": true,
"vocab_size": 51200
}

View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50000,
"eos_token_id": 50001,
"pad_token_id": 50002,
"transformers_version": "4.47.0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:de451a915672abb5fe3435a56a77d786aec0a4210aebed1132385c3ea0c940c4
size 251915968

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d11352aa36865288c77bbbb2faaf17a6ee39838b0b690308b5e5735321dbdfdf
size 503926155

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc8d5a080e0638fb927f227532929e1893868ecbc28e22bd123ff9a2c75075bc
size 14645

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9df277ac5eda814225c2ffc720359bed57e8b83a7ddccf1109bf2434b19a7c5d
size 1465

File diff suppressed because it is too large Load Diff

210940
checkpoint-5000/tokenizer.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d158a5b32cad30637a7c568c36ecb1696be17e1bd944853e45e7b88828bc9f6
size 6097

35
config.json Normal file
View File

@@ -0,0 +1,35 @@
{
"_name_or_path": "goldfish-models/eng_latn_100mb",
"activation_function": "gelu",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50000,
"embd_pdrop": 0.1,
"eos_token_id": 50001,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 2048,
"n_embd": 768,
"n_head": 12,
"n_inner": 3072,
"n_layer": 12,
"n_positions": 2048,
"pad_token_id": 50002,
"prefix": "[CLS]",
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.47.0",
"use_cache": true,
"vocab_size": 51200
}

7
generation_config.json Normal file
View File

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50000,
"eos_token_id": 50001,
"pad_token_id": 50002,
"transformers_version": "4.47.0"
}

3
model.safetensors Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f84f174b6ac21fa9a294afc1ff4e87c092813653d818137cb53cc2dc1dbf1e50
size 251915968

1249
special_tokens_map.json Normal file

File diff suppressed because it is too large Load Diff

210940
tokenizer.json Normal file

File diff suppressed because one or more lines are too long

10829
tokenizer_config.json Normal file

File diff suppressed because it is too large Load Diff

3
training_args.bin Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d158a5b32cad30637a7c568c36ecb1696be17e1bd944853e45e7b88828bc9f6
size 6097