PaTaRM-8B/README.md

---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-8B
tags:
- reward-model
- rlhf
- qwen3
---

# PaTaRM-8B

[![arXiv](https://img.shields.io/badge/arXiv-2510.24235-b31b1b.svg)](https://arxiv.org/abs/2510.24235)
[![GitHub](https://img.shields.io/badge/GitHub-PaTaRM-black?logo=github)](https://github.com/JaneEyre0530/PaTaRM)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)

This is the **PaTaRM-8B** model, part of the PaTaRM series. For full details including overview, usage examples, training data, and citation, please refer to the main collection README:

👉 **[AIJian/PaTaRM — Main README](https://huggingface.co/AIJian/PaTaRM)**

## Models

| Model | Base | Link |
|-------|------|------|
| PaTaRM-8B | Qwen3-8B | [AIJian/PaTaRM-8B](https://huggingface.co/AIJian/PaTaRM-8B) |
| PaTaRM-14B | Qwen3-14B | [AIJian/PaTaRM-14B](https://huggingface.co/AIJian/PaTaRM-14B) |

## Citation

```bibtex
@misc{jian2026patarmbridgingpairwisepointwise,
      title={PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling}, 
      author={Ai Jian and Jingqing Ruan and Xing Ma and Dailin Li and Weipeng Zhang and Ke Zeng and Xunliang Cai},
      year={2026},
      eprint={2510.24235},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.24235}, 
}
```
初始化项目，由ModelHub XC社区提供模型 Model: AIJian/PaTaRM-8B Source: Original Platform 2026-05-05 10:42:44 +08:00			`---`
			`library_name: transformers`
			`license: apache-2.0`
			`pipeline_tag: text-generation`
			`base_model:`
			`- Qwen/Qwen3-8B`
			`tags:`
			`- reward-model`
			`- rlhf`
			`- qwen3`
			`---`

			`# PaTaRM-8B`

			`[![arXiv](https://img.shields.io/badge/arXiv-2510.24235-b31b1b.svg)](https://arxiv.org/abs/2510.24235)`
			`[![GitHub](https://img.shields.io/badge/GitHub-PaTaRM-black?logo=github)](https://github.com/JaneEyre0530/PaTaRM)`
			`[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)`

			`This is the PaTaRM-8B model, part of the PaTaRM series. For full details including overview, usage examples, training data, and citation, please refer to the main collection README:`

			`👉 [AIJian/PaTaRM — Main README](https://huggingface.co/AIJian/PaTaRM)`

			`## Models`

			`\| Model \| Base \| Link \|`
			`\|-------\|------\|------\|`
			`\| PaTaRM-8B \| Qwen3-8B \| [AIJian/PaTaRM-8B](https://huggingface.co/AIJian/PaTaRM-8B) \|`
			`\| PaTaRM-14B \| Qwen3-14B \| [AIJian/PaTaRM-14B](https://huggingface.co/AIJian/PaTaRM-14B) \|`

			`## Citation`

			```bibtex
			`@misc{jian2026patarmbridgingpairwisepointwise,`
			`title={PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling},`
			`author={Ai Jian and Jingqing Ruan and Xing Ma and Dailin Li and Weipeng Zhang and Ke Zeng and Xunliang Cai},`
			`year={2026},`
			`eprint={2510.24235},`
			`archivePrefix={arXiv},`
			`primaryClass={cs.LG},`
			`url={https://arxiv.org/abs/2510.24235},`
			`}`
			```