Initialize project; model provided by the ModelHub XC community

Model: KSP-NMAI/boris-125M-superlight-cubscout
Source: Original Platform
ModelHub XC
2026-06-11 03:29:17 +08:00
commit 7f7533393a
9 changed files with 250460 additions and 0 deletions

37
.gitattributes vendored Normal file

@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
boris-125M.png filter=lfs diff=lfs merge=lfs -text
boris-superlight-cubscout-125M-F16.gguf filter=lfs diff=lfs merge=lfs -text

54
README.md Normal file

@@ -0,0 +1,54 @@
---
license: apache-2.0
datasets:
- roneneldan/TinyStories
language:
- en
library_name: transformers
tags:
- base-model
---
![boris](boris-125M.png)
# boris-125M-superlight-cubscout
boris-125M-superlight-cubscout (Boris) is a lightweight, ~125M parameter text generation model trained entirely on the roneneldan/TinyStories dataset.
It was developed entirely on one NVIDIA RTX 3060 in ~2.5 days. Boris's primary use case is generating bad children's short stories.
---
## Training Details:
- Trained on TinyStories (43,395 steps)
- Trained using one NVIDIA RTX 3060 (12GB VRAM)
- Precision: FP16
- Final Training Loss: ~1.66
---
## Advice:
2. This is a **base model** and does not know when to stop generating. Add stop sequences such as "the end." or `###`.
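
The stop-sequence advice can also be applied after generation, by cutting the raw output at the first stop sequence. The following is a minimal sketch, not part of the original model card: the helper name `truncate_at_stop` and the case-insensitive matching are assumptions made for illustration, and the sample string is invented.

```python
import re


def truncate_at_stop(text, stop_sequences=("the end.", "###")):
    """Return text cut just before the earliest stop sequence, if any.

    Hypothetical helper: matching is case-insensitive so that
    "The End." is caught as well as "the end.".
    """
    if not stop_sequences:
        return text
    # Escape each sequence so characters like "." are matched literally.
    pattern = "|".join(re.escape(s) for s in stop_sequences)
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return text
    return text[:match.start()].rstrip()


# Invented sample output from a model that keeps going past the ending.
raw = "Once upon a time, a cat found a ball. The End. Once upon a time"
print(truncate_at_stop(raw))
```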
---
## Evaluation Results:
**Final Training Loss: ~1.66** on TinyStories (Train)
---
## Copyright & License:
*Copyright 2026 Joseph Jones*
This project and all associated files (the "Work") are licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

3
boris-125M.png Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66eca5a165acb9e58252a34ceef0f6e7d34d2ac1c29b574a5d21b4be6d470193
size 286159

3
boris-superlight-cubscout-125M-F16.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:56f1daaf820b71e8ed48f9e19a5749b62128144c3575befa1021b97466335491
size 248987904

32
config.json Normal file

@@ -0,0 +1,32 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 1,
"dtype": "float32",
"eos_token_id": 2,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 2048,
"max_position_embeddings": 1024,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"num_key_value_heads": 12,
"pad_token_id": null,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_parameters": {
"rope_theta": 10000.0,
"rope_type": "default"
},
"tie_word_embeddings": true,
"transformers_version": "5.6.2",
"use_cache": true,
"vocab_size": 50304
}
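
The "~125M" figure can be sanity-checked from the fields above. This is a rough sketch that assumes the standard bias-free `LlamaForCausalLM` layout (q/k/v/o projections, a gated three-matrix MLP, two RMSNorm vectors per layer, one final norm) and counts the tied embedding matrix once; the function name `count_llama_params` is hypothetical.

```python
def count_llama_params(cfg):
    """Count weights of a bias-free Llama-style decoder described by cfg."""
    hidden = cfg["hidden_size"]
    q_out = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_out = cfg["num_key_value_heads"] * cfg["head_dim"]

    embed = cfg["vocab_size"] * hidden
    # Tied embeddings reuse the input matrix as the output projection.
    lm_head = 0 if cfg["tie_word_embeddings"] else embed

    attn = hidden * q_out + 2 * hidden * kv_out + q_out * hidden  # q, k, v, o
    mlp = 3 * hidden * cfg["intermediate_size"]                   # gate, up, down
    norms = 2 * hidden                                            # two RMSNorms per layer

    per_layer = attn + mlp + norms
    # The trailing "+ hidden" is the final RMSNorm weight vector.
    return embed + lm_head + cfg["num_hidden_layers"] * per_layer + hidden


# Values copied from config.json.
config = {
    "head_dim": 64,
    "hidden_size": 768,
    "intermediate_size": 2048,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "num_key_value_heads": 12,
    "tie_word_embeddings": True,
    "vocab_size": 50304,
}

total = count_llama_params(config)
print(f"{total:,} parameters")          # 123,587,328 parameters
print(f"{total * 4:,} bytes as float32")  # 494,349,312 bytes as float32
```

Under these assumptions the float32 estimate is about 12 KB below the 494361544-byte size recorded for `model.safetensors`, which would be consistent with `"dtype": "float32"` plus a small file header.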

9
generation_config.json Normal file

@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"output_attentions": false,
"output_hidden_states": false,
"transformers_version": "5.6.2",
"use_cache": true
}

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9b0efc64395303c19136f178dcd18ce4eab6338eee89cc0b7da3fb93f5772427
size 494361544
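
The three lines above are a Git LFS pointer, not the weights themselves: each line is a key, a space, and a value. A small sketch of reading such a pointer follows; the function name `parse_lfs_pointer` and the returned dictionary layout are illustrative choices, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content copied from the model.safetensors entry.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9b0efc64395303c19136f178dcd18ce4eab6338eee89cc0b7da3fb93f5772427
size 494361544
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```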

250306
tokenizer.json Normal file

File diff suppressed because it is too large

13
tokenizer_config.json Normal file

@@ -0,0 +1,13 @@
{
"add_prefix_space": false,
"backend": "tokenizers",
"bos_token": "<|endoftext|>",
"eos_token": "<|endoftext|>",
"errors": "replace",
"is_local": false,
"local_files_only": false,
"model_max_length": 1024,
"pad_token": "<|endoftext|>",
"tokenizer_class": "GPT2Tokenizer",
"unk_token": "<|endoftext|>"
}