初始化项目,由ModelHub XC社区提供模型

Model: papahawk/gpt2-1.5b
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-08 18:19:19 +08:00
commit 75863666b8
25 changed files with 200547 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

3
64-8bits.tflite Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c966da3b74697803352ca7c6f2f220e7090a557b619de9da0c6b34d89f7825c1
size 125162496

3
64-fp16.tflite Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1ceafd82e733dd4b21570b2a86cf27556a983041806c033a55d086e0ed782cd3
size 248269688

3
64.tflite Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cfcd510b239d90b71ee87d4e57a5a8c2d55b2a941e5d9fe5852298268ddbe61b
size 495791932

74
README.md Normal file
View File

@@ -0,0 +1,74 @@
---
language:
- en
tags:
- text-generation
- pyTtorch
- tensorflow
- transformers
datasets:
- gpt-2-output-dataset
license: mit
---
<h1 style='text-align: center '>GPT2-1.5b LLM</h1>
<h2 style='text-align: center '><em>Fork of OpenAI/GPT2-1.5b</em> </h2>
<h3 style='text-align: center '>Model Card</h3>
<img src="https://alt-web.xyz/images/rainbow.png" alt="Rainbow Solutions" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
# gpt2-1.5b
Code and models from the paper ["Language Models are Unsupervised Multitask Learners"](https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf).
You can read about GPT-2 and its staged release in our [original blog post](https://blog.openai.com/better-language-models/), [6 month follow-up post](https://openai.com/blog/gpt-2-6-month-follow-up/), and [final post](https://www.openai.com/blog/gpt-2-1-5b-release/).
We have also [released a dataset](https://github.com/openai/gpt-2-output-dataset) for researchers to study their behaviors.
<sup>*</sup> *Note that our original parameter counts were wrong due to an error (in our previous blog posts and paper). Thus you may have seen small referred to as 117M and medium referred to as 345M.*
## Usage
This repository is meant to be a starting point for researchers and engineers to experiment with GPT-2.
For basic information, see our [model card](./model_card.md).
### Some caveats
- GPT-2 models' robustness and worst case behaviors are not well-understood. As with any machine-learned model, carefully evaluate GPT-2 for your use case, especially if used without fine-tuning or in safety-critical applications where reliability is important.
- The dataset our GPT-2 models were trained on contains many texts with [biases](https://twitter.com/TomerUllman/status/1101485289720242177) and factual inaccuracies, and thus GPT-2 models are likely to be biased and inaccurate as well.
- To avoid having samples mistaken as human-written, we recommend clearly labeling samples as synthetic before wide dissemination. Our models are often incoherent or inaccurate in subtle ways, which takes more than a quick read for a human to notice.
### Work with us
Please [let us know](mailto:languagequestions@openai.com) if youre doing interesting research with or working on applications of GPT-2! Were especially interested in hearing from and potentially working with those who are studying
- Potential malicious use cases and defenses against them (e.g. the detectability of synthetic text)
- The extent of problematic content (e.g. bias) being baked into the models and effective mitigations
## Development
See [DEVELOPERS.md](./DEVELOPERS.md)
## Contributors
See [CONTRIBUTORS.md](./CONTRIBUTORS.md)
## Citation
Please use the following bibtex entry:
```
@article{radford2019language,
title={Language Models are Unsupervised Multitask Learners},
author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
year={2019}
}
```
## Future work
We may release code for evaluating the models on various benchmarks.
We are still considering release of the larger models.
## License
[Modified MIT](./LICENSE)

31
config.json Normal file
View File

@@ -0,0 +1,31 @@
{
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50256,
"embd_pdrop": 0.1,
"eos_token_id": 50256,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 768,
"n_head": 12,
"n_layer": 12,
"n_positions": 1024,
"resid_pdrop": 0.1,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"vocab_size": 50257
}

3
flax_model.msgpack Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:192e8257ae9e8f796f764630f4a488a6a16d1461762d62b49ef7405df951a283
size 497764120

6
generation_config.json Normal file
View File

@@ -0,0 +1,6 @@
{
"bos_token_id": 50256,
"eos_token_id": 50256,
"transformers_version": "4.26.0.dev0",
"_from_model_config": true
}

50001
merges.txt Normal file

File diff suppressed because it is too large Load Diff

3
model.safetensors Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:248dfc3911869ec493c76e65bf2fcf7f615828b0254c12b473182f0f81d3a707
size 548105171

38
onnx/config.json Normal file
View File

@@ -0,0 +1,38 @@
{
"_name_or_path": "gpt2",
"activation_function": "gelu_new",
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 50256,
"embd_pdrop": 0.1,
"eos_token_id": 50256,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 768,
"n_head": 12,
"n_inner": null,
"n_layer": 12,
"n_positions": 1024,
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"transformers_version": "4.30.2",
"use_cache": true,
"vocab_size": 50257
}

3
onnx/decoder_model.onnx Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e3fc9615868ff8f5e0429b892a0f6ca692784ba6c4ca31c4e9ee8218e7cce34f
size 653665842

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e6fc046fe5a7cfeeb8bb3c7d4c1b6a8bd90ced7339a75d5567957e3bc9d48abe
size 655189339

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:570d958241de81f12d82a8358dfc0b408a7bf44ff2bd10ac4a97dab24a8118db
size 653672649

View File

@@ -0,0 +1,6 @@
{
"_from_model_config": true,
"bos_token_id": 50256,
"eos_token_id": 50256,
"transformers_version": "4.30.2"
}

50001
onnx/merges.txt Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,5 @@
{
"bos_token": "<|endoftext|>",
"eos_token": "<|endoftext|>",
"unk_token": "<|endoftext|>"
}

100305
onnx/tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,9 @@
{
"add_prefix_space": false,
"bos_token": "<|endoftext|>",
"clean_up_tokenization_spaces": true,
"eos_token": "<|endoftext|>",
"model_max_length": 1024,
"tokenizer_class": "GPT2Tokenizer",
"unk_token": "<|endoftext|>"
}

1
onnx/vocab.json Normal file

File diff suppressed because one or more lines are too long

3
pytorch_model.bin Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7c5d3f4b8b76583b422fcb9189ad6c89d5d97a094541ce8932dce3ecabde1421
size 548118077

3
rust_model.ot Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:adf0adedbf4016b249550f866c66a3b3a3d09c8b3b3a1f6e5e9a265d94e0270e
size 702517648

3
tf_model.h5 Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d08c1307f7dfae6f878e0a2ca5715d587d2640530db8ef96fc0c1fc474dd9fee
size 497933648

1
tokenizer.json Normal file

File diff suppressed because one or more lines are too long

1
vocab.json Normal file

File diff suppressed because one or more lines are too long