Update README.md
@@ -3,8 +3,15 @@ license: other
 license_name: nvidia-open-model-license
 license_link: >-
   https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
+library_name: transformers
+pipeline_tag: text-generation
+language:
+- en
+tags:
+- nvidia
+- llama-3
+- pytorch
 ---
 
 # Model Overview
 
 Minitron-8B-Base is a large language model (LLM) obtained by pruning Nemotron-4 15B; specifically, we prune model embedding size, number of attention heads, and MLP intermediate dimension. Following pruning, we perform continued training with distillation using 94 billion tokens to arrive at the final model; we use the continuous pre-training data corpus used in Nemotron-4 15B for this purpose.
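The added `library_name: transformers` and `pipeline_tag: text-generation` fields advertise that the model loads through the transformers library as a causal text-generation model. A minimal loading sketch, assuming the Hub repository id is `nvidia/Minitron-8B-Base` (the id is not shown in this diff):

```python
# Minimal sketch of what the added front-matter advertises
# (library_name: transformers, pipeline_tag: text-generation).
# Assumption: the repository id is "nvidia/Minitron-8B-Base";
# the diff itself does not name the repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Minitron-8B-Base"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # dtype not specified by the card; bf16 is a common choice
    device_map="auto",           # requires the accelerate package
)

prompt = "Complete the sentence: large language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```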
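The overview paragraph describes continued training with distillation from the 15B teacher. As a rough illustration only, not the authors' recipe, here is a generic logit-distillation step in PyTorch, where a pruned student matches a frozen teacher's softened output distribution; the function name, loop structure, and temperature value are all illustrative assumptions:

```python
# Generic knowledge-distillation step (illustrative only; not the
# Minitron training code). A frozen teacher's softened logits supervise
# a pruned student via KL divergence.
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, input_ids, optimizer, temperature=2.0):
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits  # teacher stays frozen
    student_logits = student(input_ids).logits

    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```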