---
base_model:
- meta-llama/Llama-3.1-8B
language:
- en
license: cc-by-nc-4.0
pipeline_tag: feature-extraction
library_name: transformers
tags:
- sentence-transformers
---

## Model Summary

ReasonIR-8B is the first retriever specifically trained for general reasoning tasks, achieving state-of-the-art retrieval performance on BRIGHT (reasoning-intensive retrieval). When used for retrieval-augmented generation (RAG), ReasonIR-8B also brings substantial gains on MMLU and GPQA.

- Paper: https://arxiv.org/abs/2504.20595
- Repository: https://github.com/facebookresearch/ReasonIR
- Data: https://huggingface.co/datasets/reasonir/reasonir-data

## Usage

Make sure to install `transformers>=4.47.0` first!

### Transformers

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("reasonir/ReasonIR-8B", torch_dtype="auto", trust_remote_code=True)
model = model.to("cuda")
model.eval()

query = "The quick brown fox jumps over the lazy dog."
document = "The quick brown fox jumps over the lazy dog."
query_instruction = ""
doc_instruction = ""

query_emb = model.encode(query, instruction=query_instruction)
doc_emb = model.encode(document, instruction=doc_instruction)

sim = query_emb @ doc_emb.T
```

When using `AutoModel`, it is important to:

1. Include `trust_remote_code=True` to make sure our custom bidirectional encoding architecture is used.
2. Use `torch_dtype="auto"` so that `bf16` is activated (by default torch will use `fp32`).

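The dot product in `sim = query_emb @ doc_emb.T` acts as a cosine similarity score, assuming the returned embeddings are L2-normalized. As a minimal sketch of ranking several documents against one query (using hypothetical stand-in vectors, not real model outputs):

```python
import numpy as np

def rank_documents(query_emb, doc_embs):
    """Return document indices sorted by descending dot-product score."""
    scores = doc_embs @ query_emb  # dot product == cosine for unit vectors
    return np.argsort(-scores), scores

# Toy unit vectors standing in for model.encode(...) outputs.
query_emb = np.array([1.0, 0.0])
doc_embs = np.array([
    [0.6, 0.8],   # partially related
    [1.0, 0.0],   # identical direction
    [0.0, 1.0],   # orthogonal
])

order, scores = rank_documents(query_emb, doc_embs)
print(order)  # best match first: [1 0 2]
```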
### Sentence Transformers

In addition to Transformers, you can also use this model with Sentence Transformers:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model_kwargs = {"torch_dtype": "auto"}
model = SentenceTransformer("reasonir/ReasonIR-8B", trust_remote_code=True, model_kwargs=model_kwargs)

query = "The quick brown fox jumps over the lazy dog."
document = "The quick brown fox jumps over the lazy dog."
query_instruction = ""
doc_instruction = ""

query_emb = model.encode(query, prompt=query_instruction)
doc_emb = model.encode(document, prompt=doc_instruction)

sim = model.similarity(query_emb, doc_emb)
```

It is important to also include `trust_remote_code=True` and `torch_dtype="auto"` as discussed earlier.

> [!NOTE]
> There are some very slight floating point discrepancies when using the model via SentenceTransformer, caused by how the models are cast to the `bfloat16` dtype, though they should not affect the results in general.

We thank [@tomaarsen](https://huggingface.co/tomaarsen) for improving the SentenceTransformer integration and analyzing the cause of the floating point discrepancies!

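For reference, `model.similarity(query_emb, doc_emb)` produces a pairwise score matrix of shape `(num_queries, num_docs)`; assuming the default cosine scoring, the computation corresponds to the following plain-NumPy sketch (an illustration, not the Sentence Transformers implementation):

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    # Normalize each row to unit length, then take pairwise dot products.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy embeddings: 2 queries and 2 documents in 2 dimensions.
queries = np.array([[1.0, 0.0], [0.0, 2.0]])
docs = np.array([[2.0, 0.0], [1.0, 1.0]])

sim = cosine_similarity_matrix(queries, docs)
# sim has shape (2, 2); sim[0, 0] == 1.0 (same direction)
```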
## Citation

```bibtex
@article{shao2025reasonir,
  title={ReasonIR: Training Retrievers for Reasoning Tasks},
  author={Rulin Shao and Rui Qiao and Varsha Kishore and Niklas Muennighoff and Xi Victoria Lin and Daniela Rus and Bryan Kian Hsiang Low and Sewon Min and Wen-tau Yih and Pang Wei Koh and Luke Zettlemoyer},
  year={2025},
  journal={arXiv preprint arXiv:2504.20595},
  url={https://arxiv.org/abs/2504.20595},
}
```