---
datasets:
- HuggingFaceH4/ultrafeedback_binarized
language:
- en
library_name: transformers
license: mit
pipeline_tag: text-generation
---

# Zephyr-7B-DICE-Iter2

This model was developed with Bootstrapping Language Models with DPO Implicit Rewards (DICE) at iteration 2, starting from HuggingFaceH4/zephyr-7b-beta.
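The core of DICE is to score the model's own generations with the implicit reward that DPO training induces, r(x, y) = β(log π_θ(y|x) − log π_ref(y|x)), and to build new preference pairs from those scores. Below is a minimal, illustrative sketch of that scoring step in plain Python; the per-token log-probabilities are made-up stand-ins for real model outputs, and `BETA` is a placeholder rather than the value used in the paper:

```python
import math

BETA = 0.1  # DPO temperature; placeholder, not necessarily the DICE setting

def implicit_reward(policy_logprobs, ref_logprobs, beta=BETA):
    """DPO implicit reward: beta * (log pi_theta(y|x) - log pi_ref(y|x)).

    Each argument is a list of per-token log-probabilities for the same
    response y; summing them gives the sequence log-probability.
    """
    return beta * (sum(policy_logprobs) - sum(ref_logprobs))

# Hypothetical per-token log-probs for two candidate responses to one prompt.
resp_a_policy, resp_a_ref = [-0.5, -1.0, -0.3], [-0.9, -1.4, -0.8]
resp_b_policy, resp_b_ref = [-2.0, -1.5, -1.1], [-1.8, -1.2, -1.0]

rewards = {
    "a": implicit_reward(resp_a_policy, resp_a_ref),
    "b": implicit_reward(resp_b_policy, resp_b_ref),
}
# The higher-reward response becomes the "chosen" side of a new DPO pair.
chosen = max(rewards, key=rewards.get)
```

In DICE this ranking feeds the next DPO iteration, which is why the model card reports results per iteration.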

## Model Description

  • Model type: A 7B parameter GPT-like model fine-tuned on synthetic datasets.
  • Language(s) (NLP): Primarily English
  • License: MIT
  • Fine-tuned from model: HuggingFaceH4/zephyr-7b-beta
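Since the model is fine-tuned from zephyr-7b-beta, it should accept the same chat format as its base model. A hedged sketch of assembling such a prompt by hand is below; in practice, prefer `tokenizer.apply_chat_template` from `transformers`, and verify the template against this model's tokenizer before relying on it:

```python
def build_zephyr_prompt(system: str, user: str) -> str:
    """Assemble a prompt in the Zephyr chat format used by zephyr-7b-beta.

    Assumption: Zephyr-7B-DICE-Iter2 inherits this template from its base
    model; check the tokenizer's chat_template to confirm.
    """
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"
    )

prompt = build_zephyr_prompt(
    "You are a helpful assistant.",
    "What is DPO?",
)
```

The string ends with the `<|assistant|>` turn marker so that generation continues as the assistant's reply.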

## AlpacaEval Leaderboard Evaluation Results

| Model                | LC Win Rate | Win Rate |
|----------------------|-------------|----------|
| Zephyr-7b-beta       | 12.69       | 10.71    |
| Zephyr-7B-DICE-Iter1 | 19.03       | 17.67    |
| Zephyr-7B-DICE-Iter2 | 20.71       | 20.16    |

## Code

https://github.com/sail-sg/dice

## Citation

```bibtex
@article{chen2024bootstrapping,
  title={Bootstrapping Language Models with DPO Implicit Rewards},
  author={Chen, Changyu and Liu, Zichen and Du, Chao and Pang, Tianyu and Liu, Qian and Sinha, Arunesh and Varakantham, Pradeep and Lin, Min},
  journal={arXiv preprint arXiv:2406.09760},
  year={2024}
}
```