Files
Llama-3.1-8B-Instruct-MoAA-SFT/README.md
ModelHub XC fb4d7a4bb7 初始化项目,由ModelHub XC社区提供模型
Model: togethercomputer/Llama-3.1-8B-Instruct-MoAA-SFT
Source: Original Platform
2026-05-29 02:46:13 +08:00

74 lines
2.8 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
library_name: transformers
tags: []
---
## Model Description
This is the SFT model in our Mixture of Agents Alignment (MoAA) pipeline. This model is tuned on the Llama-3.1-8b-Instruct. MoAA is an approach that leverages collective intelligence from opensource LLMs to advance alignment.
Two mains stages are involved in our MoAA method. In the first stage, we employ MoA to produce high-quality synthetic data for supervised fine-tuning. In the second stage, we combines multiple LLMs as a reward model to provide preference annotations.
Some key takeaways of our work:
- 📈**Alignment pipeline that actually works** Our MoAA method sends Llama3.18BInstructs ArenaHard **1948** and Gemma-2-9B-it **42→56**, handily beating GPT4olabeled sets at the time.
- 🏆**Ensembled rewards > single critics** An MoA reward model with dynamic criteria filtering edges out competitive ArmoRM on MTBench & ArenaHard—all while staying 100% open source.
- 🚀**Selfimprovement unlocked** Finetune the strongest model inside the ensemble on MoAA data and it *surpasses its own teachers*—evidence that open models can push past proprietary ceilings without external supervision.
## Model Sources
For more details refer to
- **[Paper](https://arxiv.org/abs/2505.03059)**
<!-- - **[twitter](https://arxiv.org/abs/2505.03059)**
- **[blgopost](https://arxiv.org/abs/2505.03059)** -->
## How to Get Started with the Model
Use the code below to get started with the model.
Run inference like this:
```
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/Llama-3.1-8B-Instruct-MoAA-SFT")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/Llama-3.1-8B-Instruct-MoAA-SFT")
```
## Training Data
Training data are located here: https://huggingface.co/datasets/togethercomputer/MoAA-SFT.
We subsample from two widely-used open-source instruction tuning datasets: UltraFeedback and UltraChat. Our subsampling strategy involves utilizing the entire UltraFeedback dataset and randomly selecting 5,000 samples from UltraChat.
We use MoA to generate responses. The proposers used in our study are WizardLM-2-8x22b, Gemma-2-7b-it, Qwen-2-72b-Instruct, and Llama-3.1-70b-Instruct, while Qwen-1.5-110b-Instruct serves as the aggregator.
## Evaluation & Performance
Refer to [Paper](https://arxiv.org/abs/2505.03059) for metrics.
## Citation
```
@article{wang2025improving,
title = {Improving Model Alignment Through Collective Intelligence of Open-Source LLMS},
author = {Junlin Wang and Roy Xie and Shang Zhu and Jue Wang and Ben Athiwaratkun and Bhuwan Dhingra and Shuaiwen Leon Song and Ce Zhang and James Zou},
year = {2025},
journal = {arXiv preprint arXiv: 2505.03059}
}
```