Initialize project; model provided by the ModelHub XC community
Model: KoboldAI/GPT-J-6B-Janeway Source: Original Platform
49
.gitattributes
vendored
Normal file
@@ -0,0 +1,49 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

tokenizer.json filter=lfs diff=lfs merge=lfs -text
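As a rough illustration of how the `.gitattributes` patterns route files to Git LFS, the sketch below checks a few file names from this commit against a handful of the slash-free patterns. It uses Python's `fnmatch` on the base name, which only approximates Git's own attribute matching (patterns containing `/` or `**` are deliberately left out). The helper name `is_lfs_tracked` and the chosen subset of patterns are illustrative assumptions.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A subset of the slash-free patterns from .gitattributes.
# Assumption: fnmatch on the base name is close enough for these simple globs;
# Git's real matcher has additional rules for patterns with "/" and "**".
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.pt", "*.h5", "*.ckpt", "tokenizer.json"]


def is_lfs_tracked(path, patterns=LFS_PATTERNS):
    """Return True if the file's base name matches any of the given patterns."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for path in ["pytorch_model.bin", "tokenizer.json", "config.json", "vocab.json"]:
        status = "LFS pointer" if is_lfs_tracked(path) else "regular file"
        print(f"{path}: {status}")
```

This lines up with the rest of the commit: `pytorch_model.bin` and `tokenizer.json` appear as three-line LFS pointers, while `config.json` is stored as plain text.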
42
README.md
Normal file
@@ -0,0 +1,42 @@
---
language: en
license: mit
---
# GPT-J 6B - Janeway
## Model Description
GPT-J 6B-Janeway is a fine-tune of EleutherAI's GPT-J 6B model.
## Training data
The training data contains around 2,210 ebooks, mostly in the sci-fi and fantasy genres. The dataset is based on the same dataset used by GPT-Neo-2.7B-Picard, with 20% more data in various genres.
Some parts of the dataset are prefixed with the following text: `[Genre: <genre1>,<genre2>]`
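A minimal sketch of how a prompt might be given the same genre tag as the training data. The helper name `with_genre_prefix` and the newline between the tag and the text are assumptions; the model card only specifies the tag format itself.

```python
def with_genre_prefix(text, genres):
    """Prepend a `[Genre: a,b]` tag in the format described in the model card.

    Assumption: the tag is separated from the story text by a single newline.
    """
    if not genres:
        return text
    # Genres are joined with a bare comma, matching `<genre1>,<genre2>`.
    tag = f"[Genre: {','.join(genres)}]"
    return f"{tag}\n{text}"


if __name__ == "__main__":
    prompt = with_genre_prefix(
        "Welcome Captain Janeway, I apologize for the delay.",
        ["sci-fi", "fantasy"],
    )
    print(prompt)
```

The resulting string can be passed to a text-generation pipeline in place of a bare prompt.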
### How to use
You can use this model directly with a pipeline for text generation. This example generates a different sequence each time it's run:
```py
>>> from transformers import pipeline
>>> generator = pipeline('text-generation', model='KoboldAI/GPT-J-6B-Janeway')
>>> generator("Welcome Captain Janeway, I apologize for the delay.", do_sample=True, min_length=50)
[{'generated_text': 'Welcome Captain Janeway, I apologize for the delay."\nIt\'s all right," Janeway said. "I\'m certain that you\'re doing your best to keep me informed of what\'s going on."'}]
```
### Limitations and Biases

The core functionality of GPT-J is taking a string of text and predicting the next token. While language models are widely used for tasks other than this, there are a lot of unknowns with this work. When prompting GPT-J, it is important to remember that the statistically most likely next token is often not the token that produces the most "accurate" text. Never depend upon GPT-J to produce factually accurate output.

GPT-J was trained on the Pile, a dataset known to contain profane, lewd, and otherwise abrasive language. Depending on the use case, GPT-J may produce socially unacceptable text. See [Sections 5 and 6 of the Pile paper](https://arxiv.org/abs/2101.00027) for a more detailed analysis of the biases in the Pile.

As with all language models, it is hard to predict in advance how GPT-J will respond to particular prompts, and offensive content may occur without warning. We recommend having a human curate or filter the outputs before releasing them, both to censor undesirable content and to improve the quality of the results.
### BibTeX entry and citation info
This model is based on the following model:
```bibtex
@misc{gpt-j,
  author = {Wang, Ben and Komatsuzaki, Aran},
  title = {{GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model}},
  howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
  year = 2021,
  month = May
}
```
## Acknowledgements

This project would not have been possible without compute generously provided by Google through the
[TPU Research Cloud](https://sites.research.google/trc/), as well as the Cloud TPU team for providing early access to the [Cloud TPU VM](https://cloud.google.com/blog/products/compute/introducing-cloud-tpu-vms) Alpha.
1
added_tokens.json
Normal file
@@ -0,0 +1 @@
{"<|extratoken_14|>": 50270, "<|extratoken_121|>": 50377, "<|extratoken_3|>": 50259, "<|extratoken_25|>": 50281, "<|extratoken_101|>": 50357, "<|extratoken_138|>": 50394, "<|extratoken_10|>": 50266, "<|extratoken_21|>": 50277, "<|extratoken_32|>": 50288, "<|extratoken_46|>": 50302, "<|extratoken_22|>": 50278, "<|extratoken_40|>": 50296, "<|extratoken_96|>": 50352, "<|extratoken_92|>": 50348, "<|extratoken_95|>": 50351, "<|extratoken_141|>": 50397, "<|extratoken_78|>": 50334, "<|extratoken_86|>": 50342, "<|extratoken_56|>": 50312, "<|extratoken_124|>": 50380, "<|extratoken_127|>": 50383, "<|extratoken_122|>": 50378, "<|extratoken_123|>": 50379, "<|extratoken_111|>": 50367, "<|extratoken_93|>": 50349, "<|extratoken_130|>": 50386, "<|extratoken_113|>": 50369, "<|extratoken_50|>": 50306, "<|extratoken_97|>": 50353, "<|extratoken_1|>": 50257, "<|extratoken_55|>": 50311, "<|extratoken_34|>": 50290, "<|extratoken_143|>": 50399, "<|extratoken_62|>": 50318, "<|extratoken_74|>": 50330, "<|extratoken_136|>": 50392, "<|extratoken_117|>": 50373, "<|extratoken_38|>": 50294, "<|extratoken_120|>": 50376, "<|extratoken_39|>": 50295, "<|extratoken_65|>": 50321, "<|extratoken_29|>": 50285, "<|extratoken_104|>": 50360, "<|extratoken_13|>": 50269, "<|extratoken_5|>": 50261, "<|extratoken_107|>": 50363, "<|extratoken_19|>": 50275, "<|extratoken_84|>": 50340, "<|extratoken_77|>": 50333, "<|extratoken_135|>": 50391, "<|extratoken_24|>": 50280, "<|extratoken_134|>": 50390, "<|extratoken_15|>": 50271, "<|extratoken_67|>": 50323, "<|extratoken_89|>": 50345, "<|extratoken_2|>": 50258, "<|extratoken_73|>": 50329, "<|extratoken_129|>": 50385, "<|extratoken_126|>": 50382, "<|extratoken_30|>": 50286, "<|extratoken_41|>": 50297, "<|extratoken_28|>": 50284, "<|extratoken_114|>": 50370, "<|extratoken_128|>": 50384, "<|extratoken_118|>": 50374, "<|extratoken_131|>": 50387, "<|extratoken_68|>": 50324, "<|extratoken_125|>": 50381, "<|extratoken_103|>": 50359, "<|extratoken_8|>": 50264, 
"<|extratoken_64|>": 50320, "<|extratoken_52|>": 50308, "<|extratoken_45|>": 50301, "<|extratoken_43|>": 50299, "<|extratoken_18|>": 50274, "<|extratoken_139|>": 50395, "<|extratoken_85|>": 50341, "<|extratoken_88|>": 50344, "<|extratoken_63|>": 50319, "<|extratoken_4|>": 50260, "<|extratoken_48|>": 50304, "<|extratoken_112|>": 50368, "<|extratoken_17|>": 50273, "<|extratoken_49|>": 50305, "<|extratoken_108|>": 50364, "<|extratoken_110|>": 50366, "<|extratoken_42|>": 50298, "<|extratoken_70|>": 50326, "<|extratoken_6|>": 50262, "<|extratoken_35|>": 50291, "<|extratoken_23|>": 50279, "<|extratoken_66|>": 50322, "<|extratoken_60|>": 50316, "<|extratoken_71|>": 50327, "<|extratoken_51|>": 50307, "<|extratoken_133|>": 50389, "<|extratoken_20|>": 50276, "<|extratoken_76|>": 50332, "<|extratoken_81|>": 50337, "<|extratoken_142|>": 50398, "<|extratoken_116|>": 50372, "<|extratoken_57|>": 50313, "<|extratoken_75|>": 50331, "<|extratoken_37|>": 50293, "<|extratoken_33|>": 50289, "<|extratoken_16|>": 50272, "<|extratoken_61|>": 50317, "<|extratoken_7|>": 50263, "<|extratoken_12|>": 50268, "<|extratoken_36|>": 50292, "<|extratoken_80|>": 50336, "<|extratoken_98|>": 50354, "<|extratoken_105|>": 50361, "<|extratoken_91|>": 50347, "<|extratoken_53|>": 50309, "<|extratoken_137|>": 50393, "<|extratoken_9|>": 50265, "<|extratoken_79|>": 50335, "<|extratoken_83|>": 50339, "<|extratoken_109|>": 50365, "<|extratoken_99|>": 50355, "<|extratoken_140|>": 50396, "<|extratoken_72|>": 50328, "<|extratoken_11|>": 50267, "<|extratoken_94|>": 50350, "<|extratoken_26|>": 50282, "<|extratoken_59|>": 50315, "<|extratoken_106|>": 50362, "<|extratoken_115|>": 50371, "<|extratoken_58|>": 50314, "<|extratoken_90|>": 50346, "<|extratoken_31|>": 50287, "<|extratoken_102|>": 50358, "<|extratoken_47|>": 50303, "<|extratoken_100|>": 50356, "<|extratoken_82|>": 50338, "<|extratoken_44|>": 50300, "<|extratoken_69|>": 50325, "<|extratoken_54|>": 50310, "<|extratoken_132|>": 50388, "<|extratoken_27|>": 50283, 
"<|extratoken_87|>": 50343, "<|extratoken_119|>": 50375}
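The ids in `added_tokens.json` appear to follow a simple rule: `<|extratoken_i|>` maps to `50256 + i`, continuing directly after the GPT-2 vocabulary (ids 0 to 50256, where 50256 is `<|endoftext|>`). The sketch below rebuilds that mapping under the assumption that tokens 1 through 143 are all present, spot-checks it against a few entries copied from the file, and shows that the total matches the `vocab_size` of 50400 in `config.json`.

```python
BASE_VOCAB_SIZE = 50257  # GPT-2 BPE vocabulary: ids 0..50256
N_EXTRA_TOKENS = 143     # assumption: <|extratoken_1|> .. <|extratoken_143|> all exist

# Rebuild the mapping from the observed rule id = 50256 + i.
added_tokens = {
    f"<|extratoken_{i}|>": BASE_VOCAB_SIZE - 1 + i
    for i in range(1, N_EXTRA_TOKENS + 1)
}

# Entries copied verbatim from added_tokens.json, used as a spot check.
samples_from_file = {
    "<|extratoken_1|>": 50257,
    "<|extratoken_14|>": 50270,
    "<|extratoken_87|>": 50343,
    "<|extratoken_143|>": 50399,
}
rule_holds = all(added_tokens[name] == idx for name, idx in samples_from_file.items())

total_vocab = BASE_VOCAB_SIZE + N_EXTRA_TOKENS

if __name__ == "__main__":
    print("rule holds for sampled entries:", rule_holds)
    print("total vocabulary size:", total_vocab)
```

The extra tokens pad the embedding matrix from 50257 rows up to 50400.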
39
config.json
Normal file
@@ -0,0 +1,39 @@
{
  "activation_function": "gelu_new",
  "architectures": [
    "GPTJForCausalLM"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gptj",
  "n_embd": 4096,
  "n_head": 16,
  "n_layer": 28,
  "n_positions": 2048,
  "rotary_dim": 64,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "transformers_version": "4.10.0.dev0",
  "tokenizer_class": "GPT2Tokenizer",
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "temperature": 1.0,
      "max_length": 50
    }
  },
  "torch_dtype": "float16",
  "use_cache": true,
  "vocab_size": 50400,
  "welcome": "You are currently running novel-writing model `Janeway, version 1.`\n\n This model is made by [Mr. Seeker](https://www.patreon.com/mrseeker)\n\n### How to use this model\n\nJaneway is designed to generate stories and novels. Use the authors note to give it a certain genre to follow, use memory to give an overview of the story and use World Information to give it specific details about the characters. To start off, give the AI an idea of what you are writing about by setting the scene. Give the AI around 10 sentences that make your story really interesting to read. Introduce your character, describe the world, blow something up, or let the AI use its creative mind.",
  "antemplate": "[Genre: <|>]"
}
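A back-of-the-envelope sketch of what the `config.json` values imply about the model's shape and size. It assumes an inner MLP width of `4 * n_embd` (the config does not set it), separate input-embedding and output-projection matrices, and it ignores biases and layer-norm weights, so the total is an estimate rather than an exact parameter count.

```python
# Values copied from config.json.
n_embd, n_head, n_layer = 4096, 16, 28
rotary_dim, vocab_size = 64, 50400

# Each attention head works on an equal slice of the hidden state.
head_dim = n_embd // n_head

# Attention: query, key, value and output projections, each n_embd x n_embd.
attn_params = 4 * n_embd * n_embd

# Assumption: the MLP expands to 4 * n_embd and projects back.
mlp_params = 2 * n_embd * (4 * n_embd)

per_layer = attn_params + mlp_params

# Assumption: input embedding and output projection are separate matrices.
embedding_params = 2 * vocab_size * n_embd

approx_params = n_layer * per_layer + embedding_params

if __name__ == "__main__":
    print(f"head_dim = {head_dim}, rotary dims per head = {rotary_dim}")
    print(f"approx parameters = {approx_params:,}")
    print(f"approx float16 size = {approx_params * 2 / 1e9:.2f} GB")
```

Under these assumptions the estimate is about 6.05 billion parameters, or roughly 12.1 GB at two bytes per parameter, which is consistent with `"torch_dtype": "float16"` and the 12,106,053,103-byte `pytorch_model.bin` recorded in this commit.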
1
configuration.json
Normal file
@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}
50001
merges.txt
Normal file
File diff suppressed because it is too large
3
pytorch_model.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f20896b54fd8aec9cce168f2990557b6c0b3b0d79ccaecbb9ad5138c718b96b7
size 12106053103
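The three lines stored for `pytorch_model.bin` are a Git LFS pointer, not the weights themselves: a spec version, the SHA-256 of the real file, and its size in bytes. The sketch below parses such a pointer into a dictionary; the function name `parse_lfs_pointer` is illustrative, and the parser only handles the three keys seen in this commit.

```python
# Pointer text copied from the pytorch_model.bin entry in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f20896b54fd8aec9cce168f2990557b6c0b3b0d79ccaecbb9ad5138c718b96b7
size 12106053103
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dictionary."""
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    # The oid has the form "<hash algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)

if __name__ == "__main__":
    print(pointer["hash_algo"], pointer["digest"][:12] + "...")
    print(f"size: {pointer['size'] / 1e9:.2f} GB")
```

After downloading the real file, its SHA-256 and byte count can be compared against `digest` and `size` to confirm the download is complete.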
1
special_tokens_map.json
Normal file
@@ -0,0 +1 @@
{"bos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "eos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "unk_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}}
3
tokenizer.json
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f758fc1ee05a0775387ad8fa3983fab0c40e4a05cf6b7aea00a29e696cd6205c
size 1373465
1
tokenizer_config.json
Normal file
@@ -0,0 +1 @@
{"unk_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "bos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "eos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "add_prefix_space": false, "errors": "replace", "model_max_length": 2048, "special_tokens_map_file": null, "name_or_path": "gpt-j-6B", "from_slow": true, "tokenizer_class": "GPT2Tokenizer"}
1
vocab.json
Normal file
File diff suppressed because one or more lines are too long