初始化项目,由ModelHub XC社区提供模型

Model: AIJian/PaTaRM-8B
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-05 10:42:44 +08:00
commit f7a0c40e8d
19 changed files with 152368 additions and 0 deletions

42
README.md Normal file
View File

@@ -0,0 +1,42 @@
---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-8B
tags:
- reward-model
- rlhf
- qwen3
---
# PaTaRM-8B
[![arXiv](https://img.shields.io/badge/arXiv-2510.24235-b31b1b.svg)](https://arxiv.org/abs/2510.24235)
[![GitHub](https://img.shields.io/badge/GitHub-PaTaRM-black?logo=github)](https://github.com/JaneEyre0530/PaTaRM)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
This is the **PaTaRM-8B** model, part of the PaTaRM series. For full details including overview, usage examples, training data, and citation, please refer to the main collection README:
👉 **[AIJian/PaTaRM — Main README](https://huggingface.co/AIJian/PaTaRM)**
## Models
| Model | Base | Link |
|-------|------|------|
| PaTaRM-8B | Qwen3-8B | [AIJian/PaTaRM-8B](https://huggingface.co/AIJian/PaTaRM-8B) |
| PaTaRM-14B | Qwen3-14B | [AIJian/PaTaRM-14B](https://huggingface.co/AIJian/PaTaRM-14B) |
## Citation
```bibtex
@misc{jian2026patarmbridgingpairwisepointwise,
title={PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling},
author={Ai Jian and Jingqing Ruan and Xing Ma and Dailin Li and Weipeng Zhang and Ke Zeng and Xunliang Cai},
year={2026},
eprint={2510.24235},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2510.24235},
}
```