From 019a9d026e2f0bdd34d3a3d81758b8c719ed4502 Mon Sep 17 00:00:00 2001
From: Saurav Muralidharan
Date: Tue, 23 Jul 2024 10:36:45 -0700
Subject: [PATCH] Add evaluation preview

---
 README.md | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/README.md b/README.md
index e3f0265..4f30bba 100644
--- a/README.md
+++ b/README.md
@@ -53,6 +53,29 @@ print(output_text)
 
 Minitron is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
 
+## Evaluation Results
+
+*5-shot performance.* Language understanding evaluated using [Massive Multitask Language Understanding](https://arxiv.org/abs/2009.03300):
+
+| Average |
+| :---- |
+| 63.8 |
+
+*Zero-shot performance.* Evaluated using select datasets from the [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) with additions:
+
+| HellaSwag | Winogrande | GSM8K | ARC-C | XLSum |
+| :------------- | :------------- | :------------- | :------------- | :------------- |
+| 80.7 | 79.0 | 51.3 | 52.6 | 31.2 |
+
+
+*Code generation performance.* Evaluated using [HumanEval](https://github.com/openai/human-eval):
+
+| p@1, 0-Shot |
+| :------------- |
+| 31.6 |
+
+Please refer to our [paper](https://arxiv.org/abs/2407.14679) for the full set of results.
+
 ## Citation
 
 If you find our work helpful, please consider citing our paper: