---
license: apache-2.0
datasets:
- roneneldan/TinyStories
language:
- en
library_name: transformers
tags:
- base-model
---

# boris-125M-superlight-cubscout

boris-125M-superlight-cubscout (Boris) is a lightweight, ~125M-parameter text generation model trained entirely on the roneneldan/TinyStories dataset.

It was developed entirely on a single NVIDIA RTX 3060 in ~2.5 days. Boris's primary use case is generating bad short stories for children.

---

## Training Details:

- Trained on TinyStories (43,395 steps)
- Trained using one NVIDIA RTX 3060 (12GB VRAM)
- Precision: FP16
- Final Training Loss: ~1.66
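
For a rough sense of scale, the figures above can be turned into a few derived numbers. This is a back-of-the-envelope sketch, not a measurement: it assumes the whole ~2.5 days went into the 43,395 training steps, counts FP16 weights only (gradients, optimizer state and activations are not included), and assumes the reported loss is a mean per-token cross-entropy in nats. Under those assumptions it works out to roughly 5 seconds per step, about 238 MiB of FP16 weights, and a training perplexity near 5.3.

```python
import math

# Figures quoted in this model card (all approximate).
TRAIN_STEPS = 43_395
TRAIN_DAYS = 2.5
PARAM_COUNT = 125_000_000
FINAL_TRAIN_LOSS = 1.66

# Assumption: the whole ~2.5 days was spent on training steps.
seconds_per_step = TRAIN_DAYS * 24 * 3600 / TRAIN_STEPS

# FP16 stores each parameter in 2 bytes; this counts weights only.
fp16_weights_mib = PARAM_COUNT * 2 / 2**20

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
train_perplexity = math.exp(FINAL_TRAIN_LOSS)

print(f"{seconds_per_step:.1f} s/step")
print(f"{fp16_weights_mib:.0f} MiB of FP16 weights")
print(f"train perplexity ~{train_perplexity:.2f}")
```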

---

## Advice:

2. This is a **base model** and does not know when to stop. Add stop sequences such as "the end." or `###`.
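
One way to apply stop sequences is to cut the generated text at the first one that appears. The sketch below is a minimal illustration, not the author's own inference code. `truncate_at_stop` and `generate_story` are hypothetical helper names, `MODEL_ID` is a placeholder (this card does not state the Hub namespace), and matching case-insensitively is an assumption made so that "The End." is caught as well. Calling `generate_story("Once upon a time")` would download the model, so the import of `transformers` is kept inside the function.

```python
import re

# Placeholder: replace <namespace> with the actual Hub owner of the model.
MODEL_ID = "<namespace>/boris-125M-superlight-cubscout"


def truncate_at_stop(text, stop_sequences=("the end.", "###")):
    """Return text cut just before the earliest stop sequence, if any."""
    if not stop_sequences:
        return text
    # Escape each stop sequence so characters like "." are matched literally.
    pattern = "|".join(re.escape(stop) for stop in stop_sequences)
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return text
    return text[: match.start()].rstrip()


def generate_story(prompt, model_id=MODEL_ID, max_new_tokens=200):
    """Sample a continuation and trim it at the first stop sequence."""
    # Imported lazily so truncate_at_stop works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    # By default the pipeline returns the prompt followed by the continuation.
    return truncate_at_stop(outputs[0]["generated_text"])
```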

---

## Evaluation Results:

**Final Training Loss: ~1.66** (TinyStories, train split)

---

## Copyright & License:

*Copyright 2026 Joseph Jones*

This project and all associated files (the "Work") are licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.