---
base_model: meta-llama/Llama-2-7b-hf
datasets:
- cerebras/SlimPajama-627B
---

Checkpoint of a [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b-hf) model that has had 50% of the parameters pruned in one-shot with SparseGPT, then retrained for 40B tokens from SlimPajama while maintaining sparsity.

* Model: Llama 2
* Size: 7B
* LR: 3.00E-4
* Dataset: SlimPajama
* Retrained tokens: 40B
* Notes: no warmup + decay to 0.0
* Eval Harness:
  * CommonSense Reasoning: 62.2 (97.65%)
  * Reading Comprehension: 57.7 (98.30%)
  * World Knowledge: 42.4 (97.65%)
  * Math: 6.1 (74.39%)
  * Code: 16.2 (98.78%)
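
Retraining "while maintaining sparsity" generally means that weights zeroed by pruning stay at zero through every update. The toy sketch below shows one way to do that with a fixed mask. It is not the training code used for this checkpoint: the plain SGD update, the tiny weight matrix, and the helper names `sparsity`, `build_mask`, and `masked_sgd_step` are all assumptions made for illustration (the card does not state the optimizer).

```python
def sparsity(weights):
    """Return the fraction of entries that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


def build_mask(weights):
    """Record which entries survived pruning (True = keep training)."""
    return [[w != 0.0 for w in row] for row in weights]


def masked_sgd_step(weights, grads, mask, lr):
    """Apply one SGD step, forcing pruned entries to remain zero.

    Hypothetical helper: plain SGD stands in for whatever optimizer
    was actually used, which the model card does not specify.
    """
    return [
        [w - lr * g if keep else 0.0 for w, g, keep in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


# Toy 2x4 weight matrix with half of its entries already pruned.
weights = [[0.5, 0.0, -0.3, 0.0],
           [0.0, 0.2, 0.0, 0.7]]
# Gradients are dense: pruned positions receive gradient too.
grads = [[0.1, 0.4, -0.2, 0.3],
         [0.5, -0.1, 0.2, 0.1]]

mask = build_mask(weights)  # computed once, right after pruning
updated = masked_sgd_step(weights, grads, mask, lr=3.00e-4)

print(f"sparsity before: {sparsity(weights):.2f}")
print(f"sparsity after:  {sparsity(updated):.2f}")
```

The mask is built once from the pruned weights and reused at every step, so a surviving weight that happens to pass through zero is not mistaken for a pruned one.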