Initialize project; model provided by the ModelHub XC community
Model: AICrossSim/clm-200m Source: Original Platform
48
README.md
Normal file
@@ -0,0 +1,48 @@
---
library_name: transformers
tags:
- language-model
license: odc-by
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---

# Model Card for AICrossSim/clm-200m

A 200M-parameter language model trained on `22 * 200M` tokens from the FineWeb-Edu dataset.
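
The token count is given as a product rather than a single figure. As a quick worked check, the arithmetic can be sketched as follows. It assumes `200M` means 200 × 10^6; the card does not say what the factor 22 represents, so the variable names below are only illustrative.

```python
# Worked arithmetic for the `22 * 200M` token budget quoted in the card.
# Assumption: "200M" means 200 * 10**6.
PARAMS_NOMINAL = 200 * 10**6  # nominal model size used in the product
MULTIPLIER = 22               # factor quoted in the card (meaning not stated)

total_tokens = MULTIPLIER * PARAMS_NOMINAL
print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f}B)")
```

Under that reading, the training set comes to 4.4 billion tokens.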

## Model Details

aixsim-200M is a transformer-based language model with approximately 200 million parameters (embedding-layer parameters excluded).
It uses RMSNorm for normalization and is trained on the FineWeb-Edu dataset.

- **Developed by:** AICrossSim
- **Funded by:** [ARIA](https://www.aria.org.uk/)
- **Model type:** Transformer Language Model
- **Language(s) (NLP):** English
- **Tokenizer:** [HuggingFaceTB/cosmo2-tokenizer](https://huggingface.co/HuggingFaceTB/cosmo2-tokenizer)
- **Repository:** [AICrossSim/NewComputeBench](https://github.com/AICrossSim/NewComputeBench)
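
The "approximately 200 million parameters (embedding layer params excluded)" figure can be checked against a loaded checkpoint. The sketch below is one possible way to do that, not the authors' method: `count_parameters` and `count_for_checkpoint` are hypothetical helpers, and filtering by the substring `"embed"` in the parameter name is an assumption about how the embedding weights are named in this checkpoint.

```python
def count_parameters(model, exclude_substrings=("embed",)):
    """Return (total, kept) parameter counts for a PyTorch-style model.

    `model` only needs a `named_parameters()` method yielding (name, tensor)
    pairs whose tensors expose `numel()`, as torch.nn.Module does.
    """
    total = 0
    kept = 0
    for name, param in model.named_parameters():
        n = param.numel()
        total += n
        # Name-based filter: an assumption, not taken from the model card.
        if not any(s in name for s in exclude_substrings):
            kept += n
    return total, kept


def count_for_checkpoint(model_name="AICrossSim/clm-200m"):
    """Load the checkpoint (downloads weights) and count its parameters."""
    import transformers  # lazy import keeps count_parameters dependency-free

    model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
    return count_parameters(model)
```

Calling `count_for_checkpoint()` returns the total and the non-embedding count. If the output projection is stored under a separate name and is not tied to the input embedding, it will be counted unless its name is added to `exclude_substrings`.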

## Training Details

The experiment setup and training logs are available in this [wandb run](https://wandb.ai/cz98/torchtitan/runs/uhnlw6k8?nw=nwusercz98).

## Usage

```python
import transformers

model_name = "AICrossSim/clm-200m"
# Load the causal language model and its tokenizer from the Hub
model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
```
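
Once the model and tokenizer load, generating text follows the usual `transformers` pattern. The following is a minimal sketch rather than a recommended configuration: `generate_text` is a hypothetical helper, and the greedy decoding and the `max_new_tokens=32` default are illustrative choices, not settings from the card.

```python
def generate_text(prompt, model_name="AICrossSim/clm-200m", max_new_tokens=32):
    """Greedy-decode a short continuation of `prompt` (downloads the checkpoint)."""
    import transformers  # lazy import: nothing is loaded until the function is called

    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
    model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    # Tokenize the prompt into PyTorch tensors and extend it greedily.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `print(generate_text("Photosynthesis is the process by which"))` prints the prompt followed by the model's continuation. The card describes a plain language model, so expect text continuation rather than instruction following.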

## lm-evaluation-harness

| Tasks  |Version|Filter|n-shot|    Metric     |   | Value |   |Stderr|
|--------|------:|------|-----:|---------------|---|------:|---|------|
|wikitext|      2|none  |     0|bits_per_byte  |↓  | 1.0994|±  |   N/A|
|        |       |none  |     0|byte_perplexity|↓  | 2.1427|±  |   N/A|
|        |       |none  |     0|word_perplexity|↓  |58.8531|±  |   N/A|
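
The three wikitext metrics in the table are related, which gives a quick sanity check on the reported values. The sketch below assumes that all three are derived from the same total log-likelihood, normalized per byte or per word, with `bits_per_byte` expressed in base 2. This matches how lm-evaluation-harness computes them as far as is known, but treat it as an assumption.

```python
import math

# Values copied from the wikitext rows of the table above.
bits_per_byte = 1.0994
byte_perplexity = 2.1427
word_perplexity = 58.8531

# bits_per_byte is the per-byte negative log-likelihood in base 2,
# so 2 ** bits_per_byte should reproduce byte_perplexity up to rounding.
reconstructed_byte_ppl = 2 ** bits_per_byte

# Both perplexities share one total log-likelihood, normalized by bytes or
# by words, so the ratio of their logs estimates the average bytes per word.
bytes_per_word = math.log(word_perplexity) / math.log(byte_perplexity)

print(f"2 ** bits_per_byte = {reconstructed_byte_ppl:.4f}")
print(f"implied bytes per word = {bytes_per_word:.2f}")
```

The reconstructed byte perplexity agrees with the reported one to four decimal places, and the implied average of roughly 5.3 bytes per word is plausible for English text.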
|
||||