---
base_model: meta-llama/Llama-2-7b-hf
datasets:
- cerebras/SlimPajama-627B
---
Checkpoint of a [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b-hf) model in which 50% of the parameters were pruned in one shot with SparseGPT, after which the model was retrained on 40B tokens from SlimPajama while maintaining sparsity.
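
One way to sanity-check the 50% sparsity claim is to count the fraction of exactly-zero entries per weight matrix. The sketch below is illustrative only: the helper names (`zero_fraction`, `sparsity_report`) and the toy layer names and values are made up, and it uses plain Python lists in place of real tensors so that it runs without any dependencies.

```python
from typing import Dict, Iterable


def zero_fraction(values: Iterable[float]) -> float:
    """Return the fraction of entries that are exactly zero."""
    total = 0
    zeros = 0
    for v in values:
        total += 1
        if v == 0:
            zeros += 1
    if total == 0:
        raise ValueError("cannot measure sparsity of an empty tensor")
    return zeros / total


def sparsity_report(weights: Dict[str, Iterable[float]]) -> Dict[str, float]:
    """Map each layer name to its fraction of zero weights."""
    return {name: zero_fraction(vals) for name, vals in weights.items()}


# Toy stand-in for flattened weight matrices; names and values are hypothetical.
toy_weights = {
    "layer0.q_proj": [0.0, 0.12, 0.0, -0.4, 0.0, 0.7, 0.0, 0.05],
    "layer0.k_proj": [0.3, 0.0, 0.0, -0.1],
}

report = sparsity_report(toy_weights)
for name, frac in report.items():
    print(f"{name}: {frac:.0%} zeros")
```

With PyTorch tensors, the same quantity could be computed per parameter as `(param == 0).float().mean().item()`. Which layers were actually pruned is not stated here, so some tensors (for example embeddings) may well be dense.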

* Model: Llama 2
* Size: 7B
* LR: 3.00E-4
* Dataset: SlimPajama
* Retrained tokens: 40B
* Notes: no warmup + decay to 0.0
* Eval Harness:
  * CommonSense Reasoning: 62.2 (97.65%)
  * Reading Comprehension: 57.7 (98.30%)
  * World Knowledge: 42.4 (97.65%)
  * Math: 6.1 (74.39%)
  * Code: 16.2 (98.78%)
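
The "no warmup + decay to 0.0" note can be read as a schedule that starts at the peak learning rate of 3.00E-4 on the first step and falls to zero by the end of retraining. The shape of the decay is not stated above, so the sketch below is an assumption: it offers a linear and a cosine variant, and the function name `learning_rate` and the `total_steps` value are hypothetical.

```python
import math


def learning_rate(step: int, total_steps: int, peak_lr: float = 3.0e-4,
                  shape: str = "linear") -> float:
    """LR at `step` with no warmup, decaying from peak_lr to 0.0.

    `shape` is an assumption: the decay curve actually used is not specified.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps past the end stay at 0.0.
    progress = min(max(step / total_steps, 0.0), 1.0)
    if shape == "linear":
        return peak_lr * (1.0 - progress)
    if shape == "cosine":
        return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
    raise ValueError(f"unknown shape: {shape!r}")


# Hypothetical run length, chosen only to make the printout readable.
total_steps = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, learning_rate(step, total_steps))
```

Because there is no warmup, step 0 already uses the full 3.00E-4. Both variants reach 0.0 at `total_steps`.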