Initialize project; model provided by the ModelHub XC community
Model: KSP-NMAI/boris-125M-superlight-cubscout Source: Original Platform
54
README.md
Normal file
@@ -0,0 +1,54 @@
---
license: apache-2.0
datasets:
- roneneldan/TinyStories
language:
- en
library_name: transformers
tags:
- base-model
---

# boris-125M-superlight-cubscout

boris-125M-superlight-cubscout (Boris) is a lightweight, ~125M-parameter text generation model trained entirely on the roneneldan/TinyStories dataset.

It was developed entirely on one NVIDIA RTX 3060 in ~2.5 days. Boris's primary use case is generating bad children's short stories.

---

## Training Details:

- Trained on TinyStories (43,395 steps)
- Trained using one NVIDIA RTX 3060 (12GB VRAM)
- Precision: FP16
- Final Training Loss: ~1.66
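
To put the reported loss in more familiar terms, it can be converted to a perplexity. The sketch below assumes the loss is the mean per-token cross-entropy measured in nats, which this card does not state explicitly; `perplexity_from_loss` is a hypothetical helper name used only for illustration. Under that assumption, a loss of ~1.66 corresponds to a training perplexity of roughly 5.26.

```python
import math


def perplexity_from_loss(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss uses the natural logarithm, which is the usual
    convention in PyTorch-style training loops but is not confirmed here.
    """
    return math.exp(mean_cross_entropy_nats)


# Final training loss reported in the list above.
final_training_loss = 1.66
print(f"Approximate training perplexity: {perplexity_from_loss(final_training_loss):.2f}")
```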

---

## Advice:

2. This is a **base model** and does not know when to stop generating. Add stop sequences such as "the end." or "###".
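
One way to apply stop sequences is to post-process the generated text, as in the minimal sketch below. It assumes the model output is already available as a plain string; `truncate_at_stop` is a hypothetical helper, not part of transformers, and the sample text is made up for illustration. Recent versions of transformers may also let you pass stop strings directly to `generate`; check the documentation for your installed version.

```python
import re
from typing import Iterable, Optional


def truncate_at_stop(text: str, stop_sequences: Iterable[str], keep_stop: bool = True) -> str:
    """Cut `text` at the earliest stop sequence, matched case-insensitively.

    If `keep_stop` is True the stop sequence itself is kept (useful for
    "the end."); otherwise it is dropped (useful for markers like "###").
    Text with no stop sequence is returned unchanged.
    """
    earliest: Optional[re.Match] = None
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty stop strings
        match = re.search(re.escape(stop), text, flags=re.IGNORECASE)
        if match and (earliest is None or match.start() < earliest.start()):
            earliest = match
    if earliest is None:
        return text
    cut = earliest.end() if keep_stop else earliest.start()
    return text[:cut].rstrip()


# Made-up sample of a base model running past the end of its story.
raw_output = "Once upon a time, a cub found a hat. The end. Once upon a time, there was"
print(truncate_at_stop(raw_output, ["the end.", "###"]))
```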

---

## Evaluation Results:

**Final Training Loss: ~1.66** on TinyStories (train split)

---

## Copyright & License:

*Copyright 2026 Joseph Jones*

This project and all associated files (the "Work") are licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License. You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.