---
base_model:
- Qwen/Qwen2.5-7B-Instruct
language:
- en
- zh
license: cc-by-nc-nd-4.0
tags:
- instruction-finetuning
library_name: transformers
pipeline_tag: text-generation
inference: false
---

<h1 align="center">
🔍 xVerify-7B-I
</h1>

<p align="center">
  <div style="display: flex; justify-content: center; gap: 10px;">
    <a href="https://github.com/IAAR-Shanghai/xVerify">
      <img src="https://img.shields.io/badge/GitHub-Repository-blue?logo=github" alt="GitHub"/>
    </a>
    <a href="https://huggingface.co/IAAR-Shanghai/xVerify-7B-I">
      <img src="https://img.shields.io/badge/🤗%20Hugging%20Face-xVerify--7B--I-yellow" alt="Hugging Face"/>
    </a>
  </div>
</p>

xVerify is an answer verifier for objective questions that have a single correct answer, built by fine-tuning a pre-trained large language model. This checkpoint, xVerify-7B-I, is fine-tuned from Qwen2.5-7B-Instruct. It is presented in the paper [xVerify: Efficient Answer Verifier for Reasoning Model Evaluations](https://huggingface.co/papers/2504.10481).

Given a question, a model response, and the reference answer, it extracts the final answer from a lengthy reasoning process and judges whether that answer is equivalent to the reference, even when the two are expressed in different forms.

---

## ✨ Key Features

### 📊 Broad Applicability
Suitable for various objective question evaluation scenarios including math problems, multiple-choice questions, classification tasks, and short-answer questions.

### ⛓️ Handles Long Reasoning Chains
Extracts the final answer from responses that contain long, multi-step reasoning.

### 🌐 Multilingual Support
Primarily handles Chinese and English responses while remaining compatible with other languages.

### 🔄 Powerful Equivalence Judgment
- ✓ Recognizes basic transformations like letter case changes and Greek letter conversions
- ✓ Identifies equivalent mathematical expressions across formats (LaTeX, fractions, scientific notation)
- ✓ Determines semantic equivalence in natural language answers
- ✓ Matches multiple-choice responses by content rather than just option identifiers
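
To make the equivalence cases above concrete, here is a toy rule-based checker built only on Python's standard library. It is not part of xVerify and does not reflect how the model works; `naive_equivalent` is a hypothetical helper written for this illustration. It copes with letter case and plain numeric formats but fails on LaTeX and Greek-letter variants, which are the kinds of cases the list above says xVerify recognizes.

```python
from fractions import Fraction


def naive_equivalent(a: str, b: str) -> bool:
    """Toy check: case-insensitive text match, or equal numeric value."""
    a, b = a.strip(), b.strip()
    if a.casefold() == b.casefold():
        return True
    try:
        # Fraction parses "1/2", "0.5", and "5e-1" style strings exactly.
        return Fraction(a) == Fraction(b)
    except (ValueError, ZeroDivisionError):
        # Anything that is not a plain number (LaTeX, words, symbols) ends up here.
        return False


pairs = [
    ("Business", "business"),   # letter case: handled
    ("1/2", "0.5"),             # fraction vs decimal: handled
    ("5e-1", "0.5"),            # scientific notation: handled
    (r"\frac{1}{2}", "0.5"),    # LaTeX: not handled by these rules
    ("α", "alpha"),             # Greek letter vs name: not handled
]
for left, right in pairs:
    print(f"{left!r} vs {right!r}: {naive_equivalent(left, right)}")
```

The last two pairs print `False` even though a human grader would accept them, which is the gap a learned verifier is meant to close.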

---

## 🚀 Sample Usage

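This snippet demonstrates single-sample evaluation with the `Model` and `Evaluator` classes from the [official repository](https://github.com/IAAR-Shanghai/xVerify). The `src.xVerify` imports assume the code is run from the root of a cloned copy of that repository.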

```python
from src.xVerify.model import Model
from src.xVerify.eval import Evaluator

# Initialization
model_name = 'xVerify-7B-I'
model_path = 'IAAR-Shanghai/xVerify-7B-I'  # Hugging Face repository ID
inference_mode = 'local'  # run the model locally

model = Model(
    model_name=model_name,
    model_path_or_url=model_path,
    inference_mode=inference_mode,
)
evaluator = Evaluator(model=model)

# Evaluation inputs: the question, the model's response, and the reference answer
question = (
    "New steel giant includes Lackawanna site A major change is coming to the global steel industry and a galvanized mill in Lackawanna that formerly belonged to Bethlehem Steel Corp.\n"
    "Classify the topic of the above sentence as World, Sports, Business, or Sci/Tech."
)
llm_output = "The answer is Business."
correct_answer = "Business"

# evaluation
result = evaluator.single_evaluate(
    question=question,
    llm_output=llm_output,
    correct_answer=correct_answer
)
print(result)
```
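
For more than one sample, the same call can be wrapped in a loop. The sketch below is a helper written for illustration, not an API from the repository: `accuracy` and `stub_judge` are hypothetical names, and the `"Correct"` label is an assumed return value, since this README does not document what `single_evaluate` returns. The stub stands in for the model so the block runs without loading any weights.

```python
from typing import Callable, Iterable, Mapping


def accuracy(
    samples: Iterable[Mapping[str, str]],
    judge: Callable[..., object],
    correct_label: str = "Correct",  # assumed label; adjust to the real output
) -> float:
    """Fraction of samples for which `judge` returns `correct_label`."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples to evaluate")
    hits = 0
    for sample in samples:
        # Same keyword arguments as the single_evaluate call shown above.
        verdict = judge(
            question=sample["question"],
            llm_output=sample["llm_output"],
            correct_answer=sample["correct_answer"],
        )
        if str(verdict).strip().lower() == correct_label.lower():
            hits += 1
    return hits / len(samples)


def stub_judge(question: str, llm_output: str, correct_answer: str) -> str:
    """Stand-in for the verifier: naive substring match, for demonstration only."""
    return "Correct" if correct_answer.lower() in llm_output.lower() else "Incorrect"


samples = [
    {"question": "What is 2 + 2?",
     "llm_output": "2 + 2 = 4, so the answer is 4.",
     "correct_answer": "4"},
    {"question": "What is the capital of France?",
     "llm_output": "The answer is Lyon.",
     "correct_answer": "Paris"},
]
print(accuracy(samples, stub_judge))  # 0.5 with the stub judge
```

With the real model, `evaluator.single_evaluate` would be passed as `judge` once its actual return value has been checked against `correct_label`.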

---

## 📚 Citation

```bibtex
@article{xVerify,
      title={xVerify: Efficient Answer Verifier for Reasoning Model Evaluations}, 
      author={Ding Chen and Qingchen Yu and Pengyuan Wang and Wentao Zhang and Bo Tang and Feiyu Xiong and Xinchi Li and Minchuan Yang and Zhiyu Li},
      journal={arXiv preprint arXiv:2504.10481},
      year={2025},
}
```