ModelHub XC faea67b324 Initialize project; model provided by the ModelHub XC community
Model: haoranli-ml/Llama-3-8B-CoPE-64k-Instruct
Source: Original Platform
2026-06-03 03:31:19 +08:00
---
base_model:
- meta-llama/Meta-Llama-3-8B
language:
- en
library_name: transformers
pipeline_tag: text-generation
license: llama3
---
## haoranli-ml/Llama-3-8B-CoPE-64k-Instruct
[![Paper](https://img.shields.io/badge/CoPE_paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2602.05258)
[![GitHub](https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/hrlics/CoPE)
### ✨ Overview
**CoPE** is a plug-and-play enhancement of RoPE that *softly* clips the unstable low-frequency components, delivering consistent gains both **within the training context** and during **long-context extrapolation**.
With a simple yet effective soft clipping strategy, CoPE:
1. **Eliminates severe OOD outliers**, whose periods exceed the pre-training context window and are the primary cause of OOD extrapolation.
2. **Refines Long-range Semantic Signals** by alleviating the secret *long-term decay of semantic attention* introduced by RoPE.
3. **Prevents Spectral Leakage** induced by hard frequency truncation, which otherwise leads to long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
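
To make the first and third points concrete, here is a minimal, illustrative sketch in plain Python. It computes the period of each RoPE frequency pair, finds the pairs whose period exceeds the pre-training window, and contrasts a hard cutoff with a smooth taper. The numbers (`head_dim = 128`, RoPE base `500000`, 8192-token training window) are assumed values for a Llama-3-8B-style model, and the raised-cosine taper in `soft_clip_weights`, along with all function names, is a hypothetical stand-in chosen for illustration. It is not the clipping function defined in the CoPE paper; see the official repository for the actual method.

```python
import math


def rope_periods(head_dim: int, base: float) -> list[float]:
    """Period, in tokens, of each RoPE frequency pair.

    Pair i rotates by theta_i = base ** (-2 * i / head_dim) radians per token,
    so it completes one full turn every 2 * pi / theta_i tokens.
    """
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


def first_ood_index(periods: list[float], train_context: int) -> int:
    """Index of the first pair whose period exceeds the training window.

    Periods grow with the pair index (for base > 1), so every later pair
    is out of distribution as well. Returns len(periods) if none exceed it.
    """
    for i, period in enumerate(periods):
        if period > train_context:
            return i
    return len(periods)


def hard_clip_weights(periods: list[float], train_context: int) -> list[float]:
    """Hard truncation: keep in-window pairs, zero every out-of-window pair."""
    cutoff = first_ood_index(periods, train_context)
    return [1.0 if i < cutoff else 0.0 for i in range(len(periods))]


def soft_clip_weights(
    periods: list[float], train_context: int, width: int = 8
) -> list[float]:
    """Hypothetical soft clip: a raised-cosine taper over `width` pairs.

    Illustrative only, NOT the formula from the CoPE paper. Weights fall
    smoothly from 1 to 0 over the `width` pairs just below the cutoff
    instead of dropping in a single step.
    """
    cutoff = first_ood_index(periods, train_context)
    weights = []
    for i in range(len(periods)):
        t = (cutoff - i) / width  # 0 at the cutoff, 1 at `width` pairs below it
        if t <= 0:
            weights.append(0.0)
        elif t >= 1:
            weights.append(1.0)
        else:
            weights.append(0.5 * (1 - math.cos(math.pi * t)))
    return weights


# Assumed Llama-3-8B-style settings, used here only for illustration.
HEAD_DIM, ROPE_BASE, TRAIN_CONTEXT = 128, 500_000.0, 8192

periods = rope_periods(HEAD_DIM, ROPE_BASE)
cutoff = first_ood_index(periods, TRAIN_CONTEXT)
print(f"{len(periods) - cutoff} of {len(periods)} pairs exceed {TRAIN_CONTEXT} tokens")
```

Under these assumed settings, the low-frequency pairs at the end of the list never complete a full rotation within the training window, which is why they behave as out-of-distribution components at longer contexts. The hard mask changes from 1 to 0 in a single step, the kind of abrupt truncation the third point associates with ringing, while the tapered weights change gradually.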
For more details on training and evaluation, please refer to the [official GitHub repository](https://github.com/hrlics/CoPE).
### 📖 Citation
```
@article{li2026cope,
title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
journal={arXiv preprint arXiv:2602.05258},
year={2026}
}
```