diff --git a/README.md b/README.md
index 9252ff9..f2cf49e 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,13 @@
 ---
-library_name: transformers
-license: mit
 datasets:
 - HuggingFaceH4/ultrafeedback_binarized
 language:
 - en
+library_name: transformers
+license: mit
+pipeline_tag: text-generation
 ---
+
 # Zephyr-7B-DICE-Iter2
 
 This model was developed using [Bootstrapping Language Models with DPO Implicit Rewards](https://arxiv.org/abs/2406.09760) (DICE) at iteration 2, based on the [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) as the starting point.
@@ -29,6 +31,9 @@ This model was developed using [Bootstrapping Language Models with DPO Implicit
 |[Zephyr-7B-DICE-Iter1](https://huggingface.co/sail/Zephyr-7B-DICE-Iter1) |19.03 |17.67
 |[Zephyr-7B-DICE-Iter2](https://huggingface.co/sail/Zephyr-7B-DICE-Iter2) |**20.71** |**20.16**
 
+## Code
+https://github.com/sail-sg/dice
+
 ## Citation
 
 ```bibtex