102 lines
5.1 KiB
Markdown
102 lines
5.1 KiB
Markdown
---
|
|
base_model: meta-llama/Llama-3.1-8B
|
|
library_name: transformers
|
|
model_name: notHumpback-M1-Rw-F-8b
|
|
tags:
|
|
- generated_from_trainer
|
|
- trl
|
|
- sft
|
|
license: apache-2.0
|
|
datasets:
|
|
- OpenAssistant/oasst1
|
|
- allenai/c4
|
|
---
|
|
|
|
# notHumpback-M1-Rw-F-8b
|
|
|
|
This model follows roughly follows the Humpback architecture, proposed in the paper [Self-Alignment with Instruction Backtranslation](https://arxiv.org/pdf/2308.06259)
|
|
by Li et al. An additional improvement, primarily inspired by the paper [Better Alignment with Instruction Back-and-Forth Translation](https://arxiv.org/abs/2408.04614) by Nguyen et al.,
|
|
is added at the end of the original pipeline.
|
|
|
|
The original Humpback uses instruction backtranslation on a web corpus to generate input-output pairs (self-augmentation),
|
|
creating a richer dataset for fine-tuning models without the need for additional manual annotation. For this, the documents from the web corpus are treated as theoretical responses,
|
|
for which then matching instructions are generated.
|
|
A copy of the base model, instruction-tuned on a small amount of "gold" instruction-response pairs, then iteratively curates the created dataset, scoring the pairs by quality, and is then finetuned on the resulting subset
|
|
of all pairs with the highest possible score (self-curation).
|
|
The pipeline by Nguyen et al. adds a third step called "Rewriting". During this step an already aligned LLM (e.g. LLaMa-2-70B-chat) is employed to rewrite those responses
|
|
that have passed the filtering at the self-curation step. The rewriting improves the linguistic quality of the responses, due to the nature of web-sourced texts, often containing colloquialisms
|
|
and stylistic noise. The final model is then finetuned on the rewritten dataset.
|
|
|
|
This approach inspired me to also add a rewriting step, performed not by an already aligned external LLM, but by the
|
|
["seed model"](https://huggingface.co/Alepach/notHumpback-M0), that also performs the filtering (self-curation). This approach intends to bring back the idea of
|
|
"Self-Alignment", since using an external model for rewriting deviates from the "self" aspect. In my pipeline the "self-rewriting" step is performed before self-curation,
|
|
so that the quality of the pairs is ensured after rewriting, allowing for more candidate pairs to be taken into consideration during filtering. This can be important for
|
|
leveraging the amount of data used, since some web documents have messy structure and would get filtered out when performing filtering first. The rewriting could potentially
|
|
restructure the response and thereby increase its quality and chance to be included in the final training data, potentially allowing for a greater, more diverse
|
|
final training dataset.
|
|
|
|
This model represents the resulting model after the first iteration of the pipeline, which is trained on a small amount of gold data
|
|
and a set of generated data rewritten and curated by the ["seed model"](https://huggingface.co/Alepach/notHumpback-M0).
|
|
|
|
This model can be used for instruction-following.
|
|
It may also be used to, again, rewrite and score the instruction-response pairs
|
|
generated by the ["backward model"](https://huggingface.co/Alepach/notHumpback-Myx) for a second iteration of the pipeline.
|
|
|
|
|
|
Varying from the original paper, this model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B).
|
|
It has been trained using [TRL](https://github.com/huggingface/trl).
|
|
|
|
The dataset used to train this model is a combination of data sampled from the [oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1)
|
|
dataset and the synthetic dataset which was mentioned above. The latter has been created by applying self-augmentation, self-rewriting and self-curation
|
|
on 502k entries from the english subset ("en") of the [c4](https://huggingface.co/datasets/allenai/c4) dataset.
|
|
|
|
### Framework versions
|
|
|
|
- TRL: 0.12.1
|
|
- Transformers: 4.46.3
|
|
- Pytorch: 2.5.1
|
|
- Datasets: 3.1.0
|
|
- Tokenizers: 0.20.3
|
|
|
|
## Citations
|
|
|
|
Original paper:
|
|
|
|
```bibtex
|
|
@misc{li2023selfalignment,
|
|
title={Self-Alignment with Instruction Backtranslation},
|
|
author={Xian Li and Ping Yu and Chunting Zhou and Timo Schick and Luke Zettlemoyer and Omer Levy and Jason Weston and Mike Lewis},
|
|
year={2023},
|
|
eprint={2308.06259},
|
|
archivePrefix={arXiv},
|
|
primaryClass={cs.CL}
|
|
}
|
|
```
|
|
|
|
Inspiration:
|
|
|
|
```bibtex
|
|
@misc{nguyen2024betteralignmentinstructionbackandforth,
|
|
title={Better Alignment with Instruction Back-and-Forth Translation},
|
|
author={Thao Nguyen and Jeffrey Li and Sewoong Oh and Ludwig Schmidt and Jason Weston and Luke Zettlemoyer and Xian Li},
|
|
year={2024},
|
|
eprint={2408.04614},
|
|
archivePrefix={arXiv},
|
|
primaryClass={cs.CL},
|
|
url={https://arxiv.org/abs/2408.04614},
|
|
}
|
|
```
|
|
|
|
Cite TRL as:
|
|
|
|
```bibtex
|
|
@misc{vonwerra2022trl,
|
|
title = {{TRL: Transformer Reinforcement Learning}},
|
|
author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
|
|
year = 2020,
|
|
journal = {GitHub repository},
|
|
publisher = {GitHub},
|
|
howpublished = {\url{https://github.com/huggingface/trl}}
|
|
}
|
|
```
|