Initialize project; model provided by the ModelHub XC community
Model: Josephgflowers/Tinyllama-1.5B-Cinder-Test-1 Source: Original Platform
8
README.md
Normal file
@@ -0,0 +1,8 @@
---
license: mit
---
This is a depth up-scaled model built from the 616M Cinder model and Cinder v2. It still needs further training; I'm putting it up for testing.
More information coming.
Maybe. Lol.
Here is a brief description of the project:
I'm mixing a lot of techniques that I found interesting and have been testing. HF Cosmo is not great but decent, and it was fully trained in 4 days using a mix of more fine-tuned, directed datasets and some synthetic textbook-style datasets. So I used pruning and a data mix similar to Cosmo's on TinyLlama (which was trained on a ton of data for an extended time for its size) to keep the TinyLlama model coherent during pruning. Now I am trying to depth up-scale it using my pruned model and an original: taking a majority of the layers of each and combining them to create a larger model. Then it needs more training, then fine-tuning. Then, theoretically, it will be a well-performing 1.5B model (that didn't need full-scale training).
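
The "take a majority of each and combine them" step can be pictured with a toy sketch. This is not the actual merge script: the function name `depth_upscale`, the layer counts, and the choice to keep the first layers of one donor and the last layers of the other are all assumptions made for illustration, and layers are stood in for by plain strings rather than real weights.

```python
from typing import List, Sequence


def depth_upscale(donor_a: Sequence[str], donor_b: Sequence[str],
                  keep_a: int, keep_b: int) -> List[str]:
    """Stack the first `keep_a` layers of donor_a under the last `keep_b` layers of donor_b.

    Hypothetical helper: which layers to keep, and in what order, is an
    assumption of this sketch and is not stated in the description above.
    """
    if not 0 < keep_a <= len(donor_a):
        raise ValueError("keep_a must be between 1 and len(donor_a)")
    if not 0 < keep_b <= len(donor_b):
        raise ValueError("keep_b must be between 1 and len(donor_b)")
    bottom = list(donor_a[:keep_a])                    # early layers from donor A
    top = list(donor_b[len(donor_b) - keep_b:])        # late layers from donor B
    return bottom + top


# Placeholder layer counts, NOT the real models' sizes.
pruned = [f"pruned.layer{i}" for i in range(12)]
original = [f"original.layer{i}" for i in range(22)]

# Keep a majority of each donor: 8 of 12 and 16 of 22.
merged = depth_upscale(pruned, original, keep_a=8, keep_b=16)
print(len(merged), merged[0], merged[-1])
```

The stacked model is deeper than either donor, which is why it needs more training afterwards: the layers at the seam were never trained to feed into each other.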