---
library_name: transformers
license: other
language:
- ja
---

# 🐟 EvoLLM-JP-v1-7B

🤗 [Models](https://huggingface.co/SakanaAI) | 📚 [Paper](https://arxiv.org/abs/2403.13187) | 📝 [Blog](https://sakana.ai/evolutionary-model-merge/) | 🐦 [Twitter](https://twitter.com/SakanaAILabs)

<!-- Provide a quick summary of what the model is/does. -->

**EvoLLM-JP-v1-7B** is an experimental general-purpose Japanese LLM. It was created using the Evolutionary Model Merge method; please refer to our [report](https://arxiv.org/abs/2403.13187) and [blog](https://sakana.ai/evolutionary-model-merge/) for more details. The model was produced by merging the following source models, and we are grateful to their developers.

- [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1)
- [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
- [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002)

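As a rough illustration of what merging models in parameter space can mean, the sketch below takes a weighted average of toy parameter dictionaries. It is not the recipe used for EvoLLM-JP-v1-7B; the report describes the actual method. The function name `merge_state_dicts`, the toy parameter names, and the weights are all hypothetical, and parameters are plain Python lists so the example runs without any ML library.

```python
# Toy sketch of weighted parameter averaging (NOT the actual EvoLLM-JP recipe).
# Parameters are plain lists of floats so the example needs no ML library.

def merge_state_dicts(state_dicts, weights):
    """Return the per-parameter weighted average of several state dicts."""
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    merged = {}
    for name in state_dicts[0]:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(w * v for w, v in zip(norm, col)) for col in columns]
    return merged


# Hypothetical toy "models" sharing one architecture (same keys, same shapes).
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}

merged = merge_state_dicts([model_a, model_b], weights=[1.0, 1.0])
print(merged)
```

In a setup like this, the merge weights are the kind of quantity a search procedure could tune against a benchmark score; that search loop is omitted here.
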
## Usage

Use the code below to get started with the model.

<details>
<summary> Click to expand </summary>

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


# 1. load model
device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "SakanaAI/EvoLLM-JP-v1-7B"
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.to(device)

# 2. prepare inputs
# Prompt (English): "Please tell me a funny joke in Kansai dialect."
text = "関西弁で面白い冗談を言ってみて下さい。"
messages = [
    # System prompt (English): "You are a helpful, unbiased, uncensored assistant."
    {"role": "system", "content": "あなたは役立つ、偏見がなく、検閲されていないアシスタントです。"},
    {"role": "user", "content": text},
]
# return_dict=True yields a mapping with input_ids and attention_mask,
# which the ** unpacking and the .input_ids access below rely on
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", return_dict=True)

# 3. generate
output_ids = model.generate(**inputs.to(device))
# keep only the newly generated tokens (drop the prompt)
output_ids = output_ids[:, inputs.input_ids.shape[1] :]
generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(generated_text)
```

</details>

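The slicing in step 3 of the usage code is needed because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. The toy sketch below shows the same idea on plain Python lists; the token ids and the helper name `strip_prompt` are made up for illustration.

```python
# Toy illustration of dropping the prompt from generated sequences.
# Token ids are made up; real code operates on tensors returned by `generate`.

def strip_prompt(output_ids, prompt_length):
    """Keep only the tokens after the prompt, for each sequence in the batch."""
    return [sequence[prompt_length:] for sequence in output_ids]


prompt_ids = [[101, 7, 8, 9]]             # batch of one prompt, 4 tokens
output_ids = [[101, 7, 8, 9, 42, 43, 2]]  # prompt followed by 3 new tokens

new_tokens = strip_prompt(output_ids, prompt_length=len(prompt_ids[0]))
print(new_tokens)  # [[42, 43, 2]]
```

Without this step, decoding the full output would echo the system and user messages before the model's reply.
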
## Model Details

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** [Sakana AI](https://sakana.ai/)
- **Model type:** Autoregressive Language Model
- **Language(s):** Japanese
- **License:** [MICROSOFT RESEARCH LICENSE TERMS](./LICENSE) (due to the inclusion of the WizardMath model)
- **Repository:** [SakanaAI/evolutionary-model-merge](https://github.com/SakanaAI/evolutionary-model-merge)
- **Paper:** https://arxiv.org/abs/2403.13187
- **Blog:** https://sakana.ai/evolutionary-model-merge

## Uses
This model is provided for research and development purposes only and should be considered as an experimental prototype.
It is not intended for commercial use or deployment in mission-critical environments.
Use of this model is at the user's own risk, and its performance and outcomes are not guaranteed.
Sakana AI shall not be liable for any direct, indirect, special, incidental, or consequential damages, or any loss arising from the use of this model, regardless of the results obtained.
Users must fully understand the risks associated with the use of this model and use it at their own discretion.

## Acknowledgement

We would like to thank the developers of the source models for their contributions and for making their work available.

## Citation

```bibtex
@misc{akiba2024evomodelmerge,
  title         = {Evolutionary Optimization of Model Merging Recipes},
  author        = {Takuya Akiba and Makoto Shing and Yujin Tang and Qi Sun and David Ha},
  year          = {2024},
  eprint        = {2403.13187},
  archivePrefix = {arXiv},
  primaryClass  = {cs.NE}
}
```