license: apache-2.0
datasets: roneneldan/TinyStories
language: en
library_name: transformers
tags: base-model

boris

boris-125M-superlight-cubscout

boris-125M-superlight-cubscout (Boris) is a lightweight, ~125M-parameter text generation model trained entirely on the roneneldan/TinyStories dataset.

It was developed entirely on one NVIDIA RTX 3060 in ~2.5 days. Boris's primary use case is generating bad children's short stories.


Training Details:

  • Trained on TinyStories (43,395 steps)
  • Trained using one NVIDIA RTX 3060 (12GB VRAM)
  • Precision: FP16
  • Final Training Loss: ~1.66
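
The figures above can be turned into two rough sanity checks: average time per training step and the perplexity implied by the final loss. This is a back-of-the-envelope sketch, not a reported result. It assumes the ~2.5 days was all spent training, and that the loss is a mean per-token cross-entropy measured in nats; neither is stated in this card.

```python
import math

# Figures taken from this card (all approximate).
steps = 43_395
train_seconds = 2.5 * 24 * 60 * 60   # ~2.5 days, assumed to be pure training time
final_loss = 1.66                    # assumed: mean per-token cross-entropy in nats

# Average wall-clock time spent on one optimizer step.
seconds_per_step = train_seconds / steps

# Perplexity is exp(loss) when the loss is cross-entropy in nats.
perplexity = math.exp(final_loss)

print(f"{seconds_per_step:.1f} s/step")
print(f"perplexity ~ {perplexity:.2f}")
```

Under those assumptions this works out to roughly 5 seconds per step and a training perplexity of about 5.3. Because it is a training-set figure, it says little about held-out quality.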

Advice:

  1. This is a base model and does not know when to stop. Add stop sequences such as "the end." or "###".
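
One way to apply this advice is to cap the number of new tokens and then cut the output at the first stop sequence. The sketch below is illustrative only: `truncate_at_stop` and `generate_story` are hypothetical helper names, and it assumes the checkpoint loads through the standard `transformers` Auto classes under the repository ID listed in this card, which has not been verified here.

```python
import re


def truncate_at_stop(text, stop_sequences=("the end.", "###")):
    """Return `text` cut just before the earliest stop sequence (case-insensitive)."""
    if not stop_sequences:
        return text
    # Escape each stop sequence so characters like "." are matched literally.
    pattern = "|".join(re.escape(stop) for stop in stop_sequences)
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return text  # no stop sequence found; leave the text untouched
    return text[:match.start()].rstrip()


def generate_story(prompt,
                   model_id="KSP-NMAI/boris-125M-superlight-cubscout",
                   max_new_tokens=200):
    """Hypothetical helper: sample a continuation and trim it at a stop sequence."""
    # Imported lazily so truncate_at_stop works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository is loadable with the Auto classes.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True)

    # Decode only the newly generated tokens, then trim them.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    continuation = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return prompt + truncate_at_stop(continuation)
```

Calling `generate_story("Once upon a time, ")` would download the weights on first use. The stop sequence itself is dropped from the result; if the story should keep its closing "The end.", slice at `match.end()` instead of `match.start()`.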

Evaluation Results:

Final Training Loss: ~1.66 on TinyStories (Train)


Copyright 2026 Joseph Jones

This project and all associated files (the "Work") are licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Description
Model synced from source: KSP-NMAI/boris-125M-superlight-cubscout