Initialize project; model provided by the ModelHub XC community

Model: KoboldAI/GPT-J-6B-Janeway
Source: Original Platform
ModelHub XC
2026-05-29 21:19:12 +08:00
commit 7f97564d0a
11 changed files with 50142 additions and 0 deletions

49
.gitattributes vendored Normal file

@@ -0,0 +1,49 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

42
README.md Normal file

@@ -0,0 +1,42 @@
---
language: en
license: mit
---
# GPT-J 6B - Janeway
## Model Description
GPT-J 6B-Janeway is a fine-tune of EleutherAI's GPT-J 6B model.
## Training data
The training data contains around 2210 ebooks, mostly in the sci-fi and fantasy genres. The dataset is based on the same dataset used by GPT-Neo-2.7B-Picard, with 20% more data in various genres.
Some parts of the dataset have been prefixed with the following text: `[Genre: <genre1>,<genre2>]`
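The genre tag described above can also be attached to a prompt at inference time. The following is a minimal sketch of one way to do that; the helper name `with_genre_prefix` and the newline placed between the tag and the prompt are assumptions, since the model card only specifies the tag format itself.

```python
from typing import Sequence


def with_genre_prefix(prompt: str, genres: Sequence[str]) -> str:
    """Prepend a `[Genre: <genre1>,<genre2>]` tag to a prompt.

    Hypothetical helper: the separator between tag and prompt (a newline)
    is an assumption, not something the model card documents.
    """
    # Drop empty or whitespace-only genre names.
    cleaned = [g.strip() for g in genres if g.strip()]
    if not cleaned:
        return prompt
    return f"[Genre: {','.join(cleaned)}]\n{prompt}"


tagged = with_genre_prefix("The ship dropped out of warp.", ["sci-fi", "adventure"])
print(tagged)
```

The resulting string can be passed to a text-generation pipeline in place of a plain prompt.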
### How to use
You can use this model directly with a pipeline for text generation. This example generates a different sequence each time it's run:
```py
>>> from transformers import pipeline
>>> generator = pipeline('text-generation', model='KoboldAI/GPT-J-6B-Janeway')
>>> generator("Welcome Captain Janeway, I apologize for the delay.", do_sample=True, min_length=50)
[{'generated_text': 'Welcome Captain Janeway, I apologize for the delay."\nIt\'s all right," Janeway said. "I\'m certain that you\'re doing your best to keep me informed of what\'s going on."'}]
```
### Limitations and Biases
The core functionality of GPT-J is taking a string of text and predicting the next token. While language models are widely used for tasks other than this, there are a lot of unknowns with this work. When prompting GPT-J it is important to remember that the statistically most likely next token is often not the token that produces the most "accurate" text. Never depend upon GPT-J to produce factually accurate output.
GPT-J was trained on the Pile, a dataset known to contain profane, lewd, and otherwise abrasive language. Depending on the use case, GPT-J may produce socially unacceptable text. See [Sections 5 and 6 of the Pile paper](https://arxiv.org/abs/2101.00027) for a more detailed analysis of the biases in the Pile.
As with all language models, it is hard to predict in advance how GPT-J will respond to particular prompts, and offensive content may occur without warning. We recommend having a human curate or filter the outputs before releasing them, both to censor undesirable content and to improve the quality of the results.
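As a rough illustration of the filtering step recommended above, the sketch below separates generated texts into those that pass a keyword screen and those that should go to a human reviewer. The function name `split_for_review` and the blocked-term list are hypothetical, and a keyword screen is only a first pass, not a replacement for human curation.

```python
import re
from typing import Iterable, List, Tuple


def split_for_review(
    texts: Iterable[str], blocked_terms: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (passed, flagged) lists based on a whole-word keyword match."""
    terms = [re.escape(t) for t in blocked_terms if t]
    if not terms:
        # Nothing to screen for: every text passes.
        return list(texts), []
    # Case-insensitive match on whole words only.
    pattern = re.compile(r"\b(?:" + "|".join(terms) + r")\b", re.IGNORECASE)
    passed: List[str] = []
    flagged: List[str] = []
    for text in texts:
        (flagged if pattern.search(text) else passed).append(text)
    return passed, flagged


# Example with a placeholder blocked term.
passed, flagged = split_for_review(
    ["A calm voyage.", "The Damn engine failed."], ["damn"]
)
print(passed, flagged)
```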
### BibTeX entry and citation info
The model uses the following model as base:
```bibtex
@misc{gpt-j,
author = {Wang, Ben and Komatsuzaki, Aran},
title = {{GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model}},
howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
year = 2021,
month = May
}
```
## Acknowledgements
This project would not have been possible without compute generously provided by Google through the [TPU Research Cloud](https://sites.research.google/trc/). We also thank the Cloud TPU team for providing early access to the [Cloud TPU VM](https://cloud.google.com/blog/products/compute/introducing-cloud-tpu-vms) Alpha.

1
added_tokens.json Normal file

@@ -0,0 +1 @@
{"<|extratoken_14|>": 50270, "<|extratoken_121|>": 50377, "<|extratoken_3|>": 50259, "<|extratoken_25|>": 50281, "<|extratoken_101|>": 50357, "<|extratoken_138|>": 50394, "<|extratoken_10|>": 50266, "<|extratoken_21|>": 50277, "<|extratoken_32|>": 50288, "<|extratoken_46|>": 50302, "<|extratoken_22|>": 50278, "<|extratoken_40|>": 50296, "<|extratoken_96|>": 50352, "<|extratoken_92|>": 50348, "<|extratoken_95|>": 50351, "<|extratoken_141|>": 50397, "<|extratoken_78|>": 50334, "<|extratoken_86|>": 50342, "<|extratoken_56|>": 50312, "<|extratoken_124|>": 50380, "<|extratoken_127|>": 50383, "<|extratoken_122|>": 50378, "<|extratoken_123|>": 50379, "<|extratoken_111|>": 50367, "<|extratoken_93|>": 50349, "<|extratoken_130|>": 50386, "<|extratoken_113|>": 50369, "<|extratoken_50|>": 50306, "<|extratoken_97|>": 50353, "<|extratoken_1|>": 50257, "<|extratoken_55|>": 50311, "<|extratoken_34|>": 50290, "<|extratoken_143|>": 50399, "<|extratoken_62|>": 50318, "<|extratoken_74|>": 50330, "<|extratoken_136|>": 50392, "<|extratoken_117|>": 50373, "<|extratoken_38|>": 50294, "<|extratoken_120|>": 50376, "<|extratoken_39|>": 50295, "<|extratoken_65|>": 50321, "<|extratoken_29|>": 50285, "<|extratoken_104|>": 50360, "<|extratoken_13|>": 50269, "<|extratoken_5|>": 50261, "<|extratoken_107|>": 50363, "<|extratoken_19|>": 50275, "<|extratoken_84|>": 50340, "<|extratoken_77|>": 50333, "<|extratoken_135|>": 50391, "<|extratoken_24|>": 50280, "<|extratoken_134|>": 50390, "<|extratoken_15|>": 50271, "<|extratoken_67|>": 50323, "<|extratoken_89|>": 50345, "<|extratoken_2|>": 50258, "<|extratoken_73|>": 50329, "<|extratoken_129|>": 50385, "<|extratoken_126|>": 50382, "<|extratoken_30|>": 50286, "<|extratoken_41|>": 50297, "<|extratoken_28|>": 50284, "<|extratoken_114|>": 50370, "<|extratoken_128|>": 50384, "<|extratoken_118|>": 50374, "<|extratoken_131|>": 50387, "<|extratoken_68|>": 50324, "<|extratoken_125|>": 50381, "<|extratoken_103|>": 50359, "<|extratoken_8|>": 50264, 
"<|extratoken_64|>": 50320, "<|extratoken_52|>": 50308, "<|extratoken_45|>": 50301, "<|extratoken_43|>": 50299, "<|extratoken_18|>": 50274, "<|extratoken_139|>": 50395, "<|extratoken_85|>": 50341, "<|extratoken_88|>": 50344, "<|extratoken_63|>": 50319, "<|extratoken_4|>": 50260, "<|extratoken_48|>": 50304, "<|extratoken_112|>": 50368, "<|extratoken_17|>": 50273, "<|extratoken_49|>": 50305, "<|extratoken_108|>": 50364, "<|extratoken_110|>": 50366, "<|extratoken_42|>": 50298, "<|extratoken_70|>": 50326, "<|extratoken_6|>": 50262, "<|extratoken_35|>": 50291, "<|extratoken_23|>": 50279, "<|extratoken_66|>": 50322, "<|extratoken_60|>": 50316, "<|extratoken_71|>": 50327, "<|extratoken_51|>": 50307, "<|extratoken_133|>": 50389, "<|extratoken_20|>": 50276, "<|extratoken_76|>": 50332, "<|extratoken_81|>": 50337, "<|extratoken_142|>": 50398, "<|extratoken_116|>": 50372, "<|extratoken_57|>": 50313, "<|extratoken_75|>": 50331, "<|extratoken_37|>": 50293, "<|extratoken_33|>": 50289, "<|extratoken_16|>": 50272, "<|extratoken_61|>": 50317, "<|extratoken_7|>": 50263, "<|extratoken_12|>": 50268, "<|extratoken_36|>": 50292, "<|extratoken_80|>": 50336, "<|extratoken_98|>": 50354, "<|extratoken_105|>": 50361, "<|extratoken_91|>": 50347, "<|extratoken_53|>": 50309, "<|extratoken_137|>": 50393, "<|extratoken_9|>": 50265, "<|extratoken_79|>": 50335, "<|extratoken_83|>": 50339, "<|extratoken_109|>": 50365, "<|extratoken_99|>": 50355, "<|extratoken_140|>": 50396, "<|extratoken_72|>": 50328, "<|extratoken_11|>": 50267, "<|extratoken_94|>": 50350, "<|extratoken_26|>": 50282, "<|extratoken_59|>": 50315, "<|extratoken_106|>": 50362, "<|extratoken_115|>": 50371, "<|extratoken_58|>": 50314, "<|extratoken_90|>": 50346, "<|extratoken_31|>": 50287, "<|extratoken_102|>": 50358, "<|extratoken_47|>": 50303, "<|extratoken_100|>": 50356, "<|extratoken_82|>": 50338, "<|extratoken_44|>": 50300, "<|extratoken_69|>": 50325, "<|extratoken_54|>": 50310, "<|extratoken_132|>": 50388, "<|extratoken_27|>": 50283, 
"<|extratoken_87|>": 50343, "<|extratoken_119|>": 50375}

39
config.json Normal file

@@ -0,0 +1,39 @@
{
"activation_function": "gelu_new",
"architectures": [
"GPTJForCausalLM"
],
"attn_pdrop": 0.0,
"bos_token_id": 50256,
"embd_pdrop": 0.0,
"eos_token_id": 50256,
"gradient_checkpointing": false,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gptj",
"n_embd": 4096,
"n_head": 16,
"n_layer": 28,
"n_positions": 2048,
"rotary_dim": 64,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"transformers_version": "4.10.0.dev0",
"tokenizer_class": "GPT2Tokenizer",
"task_specific_params": {
"text-generation": {
"do_sample": true,
"temperature": 1.0,
"max_length": 50
}
},
"torch_dtype": "float16",
"use_cache": true,
"vocab_size": 50400,
"welcome": "You are currently running novel-writing model `Janeway, version 1.`\n\n This model is made by [Mr. Seeker](https://www.patreon.com/mrseeker)\n\n### How to use this model\n\nJaneway is designed to generate stories and novels. Use the authors note to give it a certain genre to follow, use memory to give an overview of the story and use World Information to give it specific details about the characters. To start off, give the AI an idea of what you are writing about by setting the scene. Give the AI around 10 sentences that make your story really interesting to read. Introduce your character, describe the world, blow something up, or let the AI use its creative mind.",
"antemplate": "[Genre: <|>]"
}
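
A few of the values in this config can be cross-checked against each other and against `added_tokens.json`. The sketch below does that arithmetic; it assumes `rotary_dim` counts dimensions per attention head (as in the Hugging Face GPT-J implementation) and that the extra tokens occupy the contiguous ID range 50257 to 50399 seen in `added_tokens.json`.

```python
# Values copied from config.json and added_tokens.json in this commit.
n_embd, n_head, rotary_dim = 4096, 16, 64
vocab_size = 50400
first_extra_id, last_extra_id = 50257, 50399

# Width of each attention head.
head_dim = n_embd // n_head

# Share of each head that receives rotary position embeddings
# (assumption: rotary_dim is a per-head count).
rotary_fraction = rotary_dim / head_dim

# Number of <|extratoken_N|> entries and the remaining base vocabulary.
num_extra_tokens = last_extra_id - first_extra_id + 1
base_vocab = vocab_size - num_extra_tokens

print(head_dim)          # 256
print(rotary_fraction)   # 0.25
print(num_extra_tokens)  # 143
print(base_vocab)        # 50257
```

The base vocabulary of 50257 matches the GPT-2 tokenizer, whose last ID (50256) is the `<|endoftext|>` token used here for `bos_token_id` and `eos_token_id`.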

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

50001
merges.txt Normal file

File diff suppressed because it is too large

3
pytorch_model.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f20896b54fd8aec9cce168f2990557b6c0b3b0d79ccaecbb9ad5138c718b96b7
size 12106053103
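
The three lines above are a Git LFS pointer, not the weights themselves. The sketch below parses such a pointer and compares the recorded size with a rough weights-only estimate built from `config.json` (28 layers, hidden size 4096, vocabulary 50400, `float16`). The helper names are hypothetical, and the estimate ignores biases and layer-norm parameters and assumes untied input and output embeddings, so it is an approximation only.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def estimate_gptj_params(n_layer: int, n_embd: int, vocab_size: int) -> int:
    """Approximate parameter count (weights only, no biases or layer norms)."""
    per_layer = 12 * n_embd ** 2          # 4*d^2 attention + 8*d^2 MLP
    embeddings = 2 * vocab_size * n_embd  # input embedding + output head
    return n_layer * per_layer + embeddings


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:f20896b54fd8aec9cce168f2990557b6c0b3b0d79ccaecbb9ad5138c718b96b7
size 12106053103
"""

info = parse_lfs_pointer(POINTER)
params = estimate_gptj_params(n_layer=28, n_embd=4096, vocab_size=50400)
approx_bytes = 2 * params  # two bytes per float16 weight

print(f"pointer size:   {info['size'] / 1e9:.2f} GB")
print(f"estimated size: {approx_bytes / 1e9:.2f} GB")
```

Under these assumptions the estimate lands within about one percent of the size recorded in the pointer.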

1
special_tokens_map.json Normal file

@@ -0,0 +1 @@
{"bos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "eos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "unk_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f758fc1ee05a0775387ad8fa3983fab0c40e4a05cf6b7aea00a29e696cd6205c
size 1373465

1
tokenizer_config.json Normal file

@@ -0,0 +1 @@
{"unk_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "bos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "eos_token": {"content": "<|endoftext|>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "add_prefix_space": false, "errors": "replace", "model_max_length": 2048, "special_tokens_map_file": null, "name_or_path": "gpt-j-6B", "from_slow": true, "tokenizer_class": "GPT2Tokenizer"}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long