初始化项目,由ModelHub XC社区提供模型
Model: AIJian/PaTaRM-8B Source: Original Platform
This commit is contained in:
42
README.md
Normal file
42
README.md
Normal file
@@ -0,0 +1,42 @@
|
||||
---
|
||||
library_name: transformers
|
||||
license: apache-2.0
|
||||
pipeline_tag: text-generation
|
||||
base_model:
|
||||
- Qwen/Qwen3-8B
|
||||
tags:
|
||||
- reward-model
|
||||
- rlhf
|
||||
- qwen3
|
||||
---
|
||||
|
||||
# PaTaRM-8B
|
||||
|
||||
[](https://arxiv.org/abs/2510.24235)
|
||||
[](https://github.com/JaneEyre0530/PaTaRM)
|
||||
[](LICENSE)
|
||||
|
||||
This is the **PaTaRM-8B** model, part of the PaTaRM series. For full details including overview, usage examples, training data, and citation, please refer to the main collection README:
|
||||
|
||||
👉 **[AIJian/PaTaRM — Main README](https://huggingface.co/AIJian/PaTaRM)**
|
||||
|
||||
## Models
|
||||
|
||||
| Model | Base | Link |
|
||||
|-------|------|------|
|
||||
| PaTaRM-8B | Qwen3-8B | [AIJian/PaTaRM-8B](https://huggingface.co/AIJian/PaTaRM-8B) |
|
||||
| PaTaRM-14B | Qwen3-14B | [AIJian/PaTaRM-14B](https://huggingface.co/AIJian/PaTaRM-14B) |
|
||||
|
||||
## Citation
|
||||
|
||||
```bibtex
|
||||
@misc{jian2026patarmbridgingpairwisepointwise,
|
||||
title={PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling},
|
||||
author={Ai Jian and Jingqing Ruan and Xing Ma and Dailin Li and Weipeng Zhang and Ke Zeng and Xunliang Cai},
|
||||
year={2026},
|
||||
eprint={2510.24235},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.LG},
|
||||
url={https://arxiv.org/abs/2510.24235},
|
||||
}
|
||||
```
|
||||
Reference in New Issue
Block a user