| library_name | tags | license | datasets | language |
|---|---|---|---|---|
| transformers | | odc-by | | |
Model Card for AICrossSim/clm-200m
A 200M-parameter language model trained on 22 × 200M tokens from the FineWeb-Edu dataset.
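The token budget stated above can be worked out with a few lines of arithmetic. This is only a sketch: reading the factor 22 as "tokens per parameter" is an assumption, and the variable names are illustrative rather than taken from the training code.

```python
# Nominal parameter count from the model card (embedding parameters excluded).
params = 200_000_000

# Assumption: the "22" in "22 * 200M" is a tokens-per-parameter ratio.
tokens_per_param = 22

# Total number of training tokens implied by the card.
total_tokens = tokens_per_param * params

print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f}B)")
```

Under that reading, the training run covers roughly 4.4 billion tokens.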
Model Details
aixsim-200M is a transformer-based language model with approximately 200 million parameters (excluding embedding-layer parameters). It uses RMSNorm for normalization and was trained on the FineWeb-Edu dataset.
- Developed by: AICrossSim
- Funded by: ARIA
- Model type: Transformer Language Model
- Language(s) (NLP): English
- Tokenizer: HuggingFaceTB/cosmo2-tokenizer
- Repository: AICrossSim/NewComputeBench
Training Details
Experiment setup and training logs are available in the wandb run.
Usage
import transformers
model_name = "AICrossSim/clm-200m"
model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
lm-evaluation-harness
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| wikitext | 2 | none | 0 | bits_per_byte | ↓ | 1.0994 | ± | N/A |
| | | none | 0 | byte_perplexity | ↓ | 2.1427 | ± | N/A |
| | | none | 0 | word_perplexity | ↓ | 58.8531 | ± | N/A |
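The three wikitext metrics in the table are related, and a short sketch can make that concrete. It assumes the usual definitions, where byte perplexity is 2 raised to bits per byte and word perplexity is 2 raised to bits per word. The function and variable names are illustrative and are not part of lm-evaluation-harness.

```python
import math

# Values copied from the wikitext rows of the table above.
bits_per_byte = 1.0994
byte_perplexity = 2.1427
word_perplexity = 58.8531


def byte_perplexity_from_bits(bpb: float) -> float:
    """Convert bits per byte to byte-level perplexity (2 ** bpb)."""
    return 2.0 ** bpb


# Recompute byte perplexity from bits per byte as a consistency check.
recomputed = byte_perplexity_from_bits(bits_per_byte)
print(f"recomputed byte_perplexity: {recomputed:.4f}")

# Bits per word divided by bits per byte gives the average number of
# bytes per word implied by the two reported metrics (an inferred
# quantity, not a value reported by the evaluation).
implied_bytes_per_word = math.log2(word_perplexity) / bits_per_byte
print(f"implied bytes per word: {implied_bytes_per_word:.2f}")
```

The recomputed byte perplexity agrees with the reported 2.1427 after rounding, which suggests the two byte-level rows are the same measurement expressed in different units.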