---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
library_name: transformers
pipeline_tag: text-generation
language:
- en
tags:
- nvidia
- llama-3
- pytorch
---

# Model Overview
Minitron-8B-Base is a large language model (LLM) obtained by pruning Nemotron-4 15B; specifically, we prune the model's embedding size, number of attention heads, and MLP intermediate dimension. Following pruning, we perform continued training with distillation over 94 billion tokens to arrive at the final model, reusing the continued pre-training data corpus from Nemotron-4 15B for this purpose.
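To illustrate the kind of width pruning described above, here is a minimal PyTorch sketch that shrinks the MLP intermediate dimension of a two-layer feed-forward block by keeping the neurons with the largest weight norm. This is a toy example only: the scoring function, layer names, and sizes are illustrative assumptions, not NVIDIA's actual importance criterion or architecture.

```python
import torch
import torch.nn as nn

def prune_mlp_intermediate(fc1: nn.Linear, fc2: nn.Linear, keep: int):
    """Toy width pruning: keep the `keep` intermediate neurons of an
    fc1 (hidden -> intermediate) / fc2 (intermediate -> hidden) MLP
    whose fc1 rows have the largest L2 norm."""
    # Score each intermediate neuron by the norm of its input weights.
    scores = fc1.weight.norm(dim=1)
    idx = torch.topk(scores, keep).indices.sort().values

    new_fc1 = nn.Linear(fc1.in_features, keep, bias=fc1.bias is not None)
    new_fc2 = nn.Linear(keep, fc2.out_features, bias=fc2.bias is not None)
    with torch.no_grad():
        # Select the surviving rows of fc1 and columns of fc2.
        new_fc1.weight.copy_(fc1.weight[idx])
        if fc1.bias is not None:
            new_fc1.bias.copy_(fc1.bias[idx])
        new_fc2.weight.copy_(fc2.weight[:, idx])
        if fc2.bias is not None:
            new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2

# Example: shrink a 16 -> 64 -> 16 MLP to intermediate width 32.
fc1, fc2 = nn.Linear(16, 64), nn.Linear(64, 16)
p1, p2 = prune_mlp_intermediate(fc1, fc2, keep=32)
print(p1.out_features, p2.in_features)  # 32 32
```

In practice the surviving neurons would be chosen with an importance metric computed on calibration data, and the pruned model would then be recovered via the distillation step described above.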