43 lines
1.4 KiB
Markdown
43 lines
1.4 KiB
Markdown
|
|
---
|
||
|
|
library_name: transformers
|
||
|
|
license: apache-2.0
|
||
|
|
pipeline_tag: text-generation
|
||
|
|
base_model:
|
||
|
|
- Qwen/Qwen3-8B
|
||
|
|
tags:
|
||
|
|
- reward-model
|
||
|
|
- rlhf
|
||
|
|
- qwen3
|
||
|
|
---
|
||
|
|
|
||
|
|
# PaTaRM-8B
|
||
|
|
|
||
|
|
[](https://arxiv.org/abs/2510.24235)
|
||
|
|
[](https://github.com/JaneEyre0530/PaTaRM)
|
||
|
|
[](LICENSE)
|
||
|
|
|
||
|
|
This is the **PaTaRM-8B** model, part of the PaTaRM series. For full details including overview, usage examples, training data, and citation, please refer to the main collection README:
|
||
|
|
|
||
|
|
👉 **[AIJian/PaTaRM — Main README](https://huggingface.co/AIJian/PaTaRM)**
|
||
|
|
|
||
|
|
## Models
|
||
|
|
|
||
|
|
| Model | Base | Link |
|
||
|
|
|-------|------|------|
|
||
|
|
| PaTaRM-8B | Qwen3-8B | [AIJian/PaTaRM-8B](https://huggingface.co/AIJian/PaTaRM-8B) |
|
||
|
|
| PaTaRM-14B | Qwen3-14B | [AIJian/PaTaRM-14B](https://huggingface.co/AIJian/PaTaRM-14B) |
|
||
|
|
|
||
|
|
## Citation
|
||
|
|
|
||
|
|
```bibtex
|
||
|
|
@misc{jian2026patarmbridgingpairwisepointwise,
|
||
|
|
title={PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling},
|
||
|
|
author={Ai Jian and Jingqing Ruan and Xing Ma and Dailin Li and Weipeng Zhang and Ke Zeng and Xunliang Cai},
|
||
|
|
year={2026},
|
||
|
|
eprint={2510.24235},
|
||
|
|
archivePrefix={arXiv},
|
||
|
|
primaryClass={cs.LG},
|
||
|
|
url={https://arxiv.org/abs/2510.24235},
|
||
|
|
}
|
||
|
|
```
|