ModelHub XC f83ffda50b Initialize project; model provided by the ModelHub XC community
Model: allenai/WildLlama-7b-assistant-only
Source: Original Platform
2026-04-10 11:49:07 +08:00

---
datasets:
- allenai/WildChat-1M
- allenai/WildChat-1M-Full
- allenai/WildChat
extra_gated_prompt: >-
  Access to this model is automatically granted upon accepting the [**AI2
  ImpACT License - Medium Risk Artifacts (“MR
  Agreement”)**](https://allenai.org/licenses/impact-mr) and completing all
  fields below.
extra_gated_fields:
  Your full name: text
  Organization or entity you are affiliated with: text
  State or country you are located in: text
  Contact email: text
  Please describe your intended use of the medium risk artifact(s): text
  I UNDERSTAND that the model is intended for research purposes and not for real-world use-cases: checkbox
  I AGREE to the terms and conditions of the MR Agreement above: checkbox
  I AGREE to AI2's use of my information for legal notices and administrative matters: checkbox
  I CERTIFY that the information I have provided is true and accurate: checkbox
---
# Model Card for WildLlama-7b-assistant-only
## Model Description
The WildLlama-7b-assistant-only model is a chatbot derived from the [Llama-2 model by Meta](https://huggingface.co/meta-llama/Llama-2-7b-hf), which is licensed under the [Llama 2 License](https://ai.meta.com/resources/models-and-libraries/llama-downloads/), and fine-tuned on the user-ChatGPT interactions in the [WildChat Dataset](https://huggingface.co/datasets/allenai/WildChat). WildLlama-7b-assistant-only is trained to predict **only assistant responses**. For a model that predicts both user prompts and assistant responses, see [WildLlama-7b-user-assistant](https://huggingface.co/allenai/WildLlama-7b-user-assistant).
- **Model type:** Language model
- **Language(s) (NLP):** multi-lingual
- **License:** [**AI2 ImpACT License - Medium Risk Artifacts ("MR Agreement")**](https://allenai.org/licenses/impact-mr)
- **Parent Model:** https://huggingface.co/meta-llama/Llama-2-7b-hf
- **Paper:** https://arxiv.org/abs/2405.01470
- **Visualization Tool:** https://wildvisualizer.com
- **Visualization Paper:** https://arxiv.org/abs/2409.03753
# Bias, Risks, and Limitations
Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
## Recommendations
We recommend that this model not be used for any high-impact or human-facing purposes as its biases and limitations need to be further explored.
We intend this to be a research artifact to advance AI's ability to better serve human needs.
# Citation
**BibTeX:**
```
@inproceedings{
zhao2024wildchat,
title={WildChat: 1M Chat{GPT} Interaction Logs in the Wild},
author={Wenting Zhao and Xiang Ren and Jack Hessel and Claire Cardie and Yejin Choi and Yuntian Deng},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=Bl8u7ZRlbM}
}
```
```
@misc{deng2024wildvisopensourcevisualizer,
title={WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild},
author={Yuntian Deng and Wenting Zhao and Jack Hessel and Xiang Ren and Claire Cardie and Yejin Choi},
year={2024},
eprint={2409.03753},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2409.03753},
}
```
# How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = 'allenai/WildLlama-7b-assistant-only'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

# Prompt format. Notice the </s> after each assistant turn! The format is
# slightly different from allenai/WildLlama-7b-user-assistant.
# Format: A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: abc ASSISTANT: def</s>USER: def ASSISTANT: adfs</s>USER: asdf

# Generate an assistant response
user_prompt = 'Write a story about a dinosaur on an airplane.'
prompt = f"A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {user_prompt} ASSISTANT:"
model_inputs = tokenizer(prompt, return_tensors='pt', add_special_tokens=False).to(device)
# max_new_tokens=256 is an illustrative value, not part of the original card;
# without it, generate() falls back to the length limit in the generation config.
output = model.generate(**model_inputs, max_new_tokens=256)
print("Output:\n" + 100 * '-')
# The decoded text contains the prompt followed by the generated response
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
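
For multi-turn conversations, the prompt format described in the comments above can be assembled with a small helper. The following is a minimal sketch, not code from the model authors: `SYSTEM_PROMPT` and `build_prompt` are hypothetical names introduced here, and the exact spacing is inferred from the single format example in the snippet above.

```python
# Hypothetical helper (not part of the original model card): builds a prompt in
# the format shown above, with </s> closing every completed assistant turn.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(history, user_message):
    """Return a prompt string ending in 'ASSISTANT:' so the model writes the next reply.

    history: list of (user_text, assistant_text) pairs for completed turns.
    user_message: the new user message to be answered.
    """
    prompt = SYSTEM_PROMPT + " "
    for user_text, assistant_text in history:
        # Each completed exchange ends with </s>, with no space before the next USER:
        prompt += f"USER: {user_text} ASSISTANT: {assistant_text}</s>"
    prompt += f"USER: {user_message} ASSISTANT:"
    return prompt


if __name__ == "__main__":
    history = [("abc", "def"), ("def", "adfs")]
    print(build_prompt(history, "asdf"))
```

With an empty `history`, `build_prompt` yields the same string as the single-turn `prompt` in the snippet above, so its result can be passed to the tokenizer in the same way (again with `add_special_tokens=False`).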