Llama-2-7b-pruned40-retrained/README.md

---
base_model: meta-llama/Llama-2-7b-hf
datasets:
  - cerebras/SlimPajama-627B
---
Checkpoint of a [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b-hf) model with 50% of its parameters pruned in one shot with SparseGPT, then retrained on 40B tokens from SlimPajama while maintaining sparsity.
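
To make "retrained while maintaining sparsity" concrete, here is a minimal sketch of the general idea: prune once, then reapply the same mask after every update so pruned weights stay at zero. It is not the procedure used for this checkpoint. Magnitude pruning stands in for SparseGPT, the update is plain SGD on a toy weight list, and the function names and gradients are hypothetical.

```python
# Illustrative only: magnitude pruning stands in for SparseGPT, and the
# "retraining" is plain SGD on a toy weight vector with made-up gradients.

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude fraction of weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else 1.0 for i in range(len(weights))]


def masked_sgd_step(weights, grads, mask, lr):
    """One SGD update that forces pruned positions back to zero."""
    return [(w - lr * g) * m for w, g, m in zip(weights, grads, mask)]


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.2, -0.02, 0.1]

# One-shot pruning: compute the mask once and apply it.
mask = magnitude_mask(weights, sparsity=0.5)
weights = [w * m for w, m in zip(weights, mask)]

# Toy retraining loop: the mask is reapplied on every step.
for _ in range(3):
    grads = [0.1] * len(weights)  # hypothetical gradients
    weights = masked_sgd_step(weights, grads, mask, lr=3e-4)

print(sparsity_of(weights))  # 0.5
```

Because the mask is fixed after pruning, the sparsity pattern is identical before and after retraining; only the surviving weights change.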

* Model: Llama 2
* Size: 7B
* LR: 3.00E-4
* Dataset: SlimPajama
* Retrained tokens: 40B
* Notes: no warmup; learning rate decays to 0.0
* Eval Harness:
  * Commonsense Reasoning: 62.2 (97.65%)
  * Reading Comprehension: 57.7 (98.30%)
  * World Knowledge: 42.4 (97.65%)
  * Math: 6.1 (74.39%)
  * Code: 16.2 (98.78%)
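
Two entries in the list above can be turned into quick calculations. The sketch below rests on two assumptions that this card does not state: that the learning rate decays linearly (the card gives only the peak value, no warmup, and a final value of 0.0), and that the percentages in parentheses are recovery relative to a dense baseline. The helper names are hypothetical.

```python
PEAK_LR = 3.00e-4  # peak learning rate listed in the card


def lr_at(step, total_steps, peak_lr=PEAK_LR):
    """Learning rate with no warmup and decay to 0.0.

    ASSUMPTION: the decay is linear; the card does not name the shape.
    """
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    return peak_lr * (1.0 - step / total_steps)


def implied_baseline(score, recovery_pct):
    """Baseline score implied by a score and its recovery percentage.

    ASSUMPTION: recovery_pct = 100 * score / baseline.
    """
    return score / (recovery_pct / 100.0)


# (score, percentage) pairs copied from the Eval Harness list.
results = {
    "Commonsense Reasoning": (62.2, 97.65),
    "Reading Comprehension": (57.7, 98.30),
    "World Knowledge": (42.4, 97.65),
    "Math": (6.1, 74.39),
    "Code": (16.2, 98.78),
}

for name, (score, pct) in results.items():
    print(f"{name}: implied baseline {implied_baseline(score, pct):.1f}")

# Schedule at the start, midpoint, and end of a hypothetical 1000-step run.
for step in (0, 500, 1000):
    print(step, lr_at(step, total_steps=1000))
```

Under the recovery assumption, the Math and Code rows imply baselines of about 8.2 and 16.4, which is consistent with the percentages being ratios of one-decimal scores.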