Update README.md

Pavlo Molchanov
2025-02-14 19:04:05 +00:00
committed by system
parent 70fa5997af
commit e2c30f2d4c

@@ -3,8 +3,15 @@ license: other
license_name: nvidia-open-model-license
license_link: >-
  https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
library_name: transformers
pipeline_tag: text-generation
language:
- en
tags:
- nvidia
- llama-3
- pytorch
---
# Model Overview
Minitron-8B-Base is a large language model (LLM) obtained by pruning Nemotron-4 15B. Specifically, we prune the model's embedding size, number of attention heads, and MLP intermediate dimension. After pruning, we continue training with distillation on 94 billion tokens to arrive at the final model, using the continuous pre-training data corpus used for Nemotron-4 15B.
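
To make the two steps above concrete, here is a toy sketch in plain Python of pruning an MLP's intermediate dimension and then scoring a student against a teacher. It is an illustration only, not the procedure used to build Minitron-8B-Base: the importance criterion (mean absolute activation), the KL-divergence loss, and the helper names `prune_mlp_neurons`, `softmax`, and `distillation_loss` are all assumptions introduced for this example.

```python
import math


def prune_mlp_neurons(w_in, w_out, activations, keep):
    """Keep the `keep` hidden neurons with the largest mean |activation|.

    w_in:        hidden x d_model weights (one row per hidden neuron)
    w_out:       d_model x hidden weights (one column per hidden neuron)
    activations: calibration samples, each a list of hidden activations
    NOTE: mean absolute activation is an assumed importance score.
    """
    hidden = len(w_in)
    scores = [
        sum(abs(sample[j]) for sample in activations) / len(activations)
        for j in range(hidden)
    ]
    # Rank neurons by score, keep the top ones, restore original order.
    ranked = sorted(range(hidden), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])
    pruned_in = [w_in[j] for j in kept]                      # drop rows
    pruned_out = [[row[j] for j in kept] for row in w_out]   # drop columns
    return kept, pruned_in, pruned_out


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on softened distributions (assumed loss)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy layer: d_model = 2, hidden = 4, pruned down to 2 hidden neurons.
w_in = [[1, 0], [0, 1], [1, 1], [2, 2]]
w_out = [[1, 2, 3, 4], [5, 6, 7, 8]]
calibration = [[0.1, 2.0, 0.0, 1.0], [0.3, -1.0, 0.1, 3.0]]

kept, small_in, small_out = prune_mlp_neurons(w_in, w_out, calibration, keep=2)
loss = distillation_loss([1.0, 0.5, 0.1], [2.0, 0.5, -1.0])
```

In this toy setup the pruned layer keeps matching rows of `w_in` and columns of `w_out`, so its shapes stay consistent. The distillation loss is zero when student and teacher logits agree and positive otherwise, which is the signal continued training would minimize.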