Files
WeThink-Qwen2.5VL-7B/README.md
ModelHub XC 1b08b3ebfb 初始化项目,由ModelHub XC社区提供模型
Model: yangjie-cv/WeThink-Qwen2.5VL-7B
Source: Original Platform
2026-05-23 10:16:39 +08:00

47 lines
1.5 KiB
Markdown
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
license: apache-2.0
tags:
- Reinforcement Learning
- Visual-langauge Reasoning
---
# Model Card for WeThink-Qwen2.5VL-7B
Repository: https://github.com/yangjie-cv/WeThink
Paper: https://arxiv.org/abs/2506.07905
## 🏆 Performance Highlights
**WeThink-Qwen2.5VL-7B** achieves:
- 🥇 **1st place** on [OpenCompass Multimodal Reasoning Leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal-reasoning/?m=REALTIME)
- 🏅 **5th place** on [OpenCompass Multi-modal Academic Leaderboard](https://rank.opencompass.org.cn/leaderboard-multimodal/?m=REALTIME)
*(As of May 30th, 2025)*
## 🚀 Quick Start
### Inference
```bash
git clone https://github.com/yangjie-cv/WeThink
cd WeThink
python inference.py
```
💡 Note: System prompt is required during inference.
### 📊 Evaluation
We have integrated WeThink-Qwen2.5VL-7B into the [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). Please follow its [Quickstart guide](https://github.com/open-compass/VLMEvalKit/blob/main/docs/en/Quickstart.md) to evaluate WeThink-Qwen2.5VL-7B on various benchmarks.
## Citation
```
@misc{yang2025wethink,
title={WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning},
author={Jie Yang and Feipeng Ma and Zitian Wang and Dacheng Yin and Kang Rong and Fengyun Rao and Ruimao Zhang},
year={2025},
eprint={2506.07905},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.07905},
}
```