

---
base_model:
- meta-llama/Meta-Llama-3-8B
language:
- en
library_name: transformers
pipeline_tag: text-generation
license: llama3
---
## haoranli-ml/Llama-3-8B-CoPE-64k-Instruct
[![Paper](https://img.shields.io/badge/CoPE_paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2602.05258)
[![GitHub](https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/hrlics/CoPE)
### ✨ Overview
**CoPE** is a plug-and-play enhancement of RoPE that *softly* clips the unstable low-frequency components, delivering consistent gains both **within the training context** and during **long-context extrapolation**.
With a simple yet effective soft clipping strategy, CoPE:
1. **Eliminates severe OOD outliers**, the frequency components whose periods exceed the pre-training context window and are the primary cause of out-of-distribution (OOD) failures during extrapolation.
2. **Refines long-range semantic signals** by alleviating the hidden *long-term decay of semantic attention* introduced by RoPE.
3. **Prevents spectral leakage** induced by hard frequency truncation, which otherwise causes long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
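
To make the three points above concrete, here is a minimal sketch that finds which RoPE components have periods longer than the pre-training window and compares a hard cutoff with a smooth taper. It is an illustration under stated assumptions, not the CoPE implementation: the raised-cosine taper, the `span` parameter, and all function names are hypothetical, and the values `head_dim=128`, `base=500000`, and `context_len=8192` are assumed for the Llama-3-8B base model. Please refer to the paper and repository for the actual clipping function.

```python
import math


def rope_inv_freqs(head_dim: int, base: float) -> list[float]:
    """Standard RoPE inverse frequencies: theta_i = base^(-2i / head_dim)."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def rope_periods(inv_freqs: list[float]) -> list[float]:
    """Period (in tokens) of each rotary component: 2*pi / theta_i."""
    return [2.0 * math.pi / f for f in inv_freqs]


def hard_clip_weights(periods: list[float], context_len: int) -> list[float]:
    """Hard truncation: keep a component fully or drop it entirely."""
    return [1.0 if p <= context_len else 0.0 for p in periods]


def soft_clip_weights(periods: list[float], context_len: int,
                      span: float = 8.0) -> list[float]:
    """Hypothetical soft clip: a raised-cosine taper in log-period.

    Components with period <= context_len keep weight 1. The weight then
    falls smoothly to 0 as the period grows from context_len to
    span * context_len. This taper is an assumption for illustration only.
    """
    weights = []
    for p in periods:
        if p <= context_len:
            weights.append(1.0)
            continue
        x = math.log(p / context_len) / math.log(span)  # 0 at cutoff, 1 at end
        weights.append(0.5 * (1.0 + math.cos(math.pi * x)) if x < 1.0 else 0.0)
    return weights


# Assumed Llama-3-8B settings (not stated in this model card).
periods = rope_periods(rope_inv_freqs(head_dim=128, base=500000.0))
hard = hard_clip_weights(periods, context_len=8192)
soft = soft_clip_weights(periods, context_len=8192)

for i, (p, h, s) in enumerate(zip(periods, hard, soft)):
    if p > 8192:  # components whose period exceeds the training window
        print(f"component {i:2d}  period {p:12.0f}  hard {h:.0f}  soft {s:.3f}")
```

The hard weights jump from 1 to 0 at the cutoff, which is the abrupt truncation that point 3 associates with spectral leakage, while the tapered weights decrease gradually across neighboring components.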
For more details on training and evaluation, please refer to the [official GitHub repository](https://github.com/hrlics/CoPE).
### 📖 Citation
```bibtex
@article{li2026cope,
  title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
  author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
  journal={arXiv preprint arXiv:2602.05258},
  year={2026}
}
```