| base_model | language | library_name | pipeline_tag | license |
|---|---|---|---|---|
| | | transformers | text-generation | llama3 |
haoranli-ml/Llama-3-8B-CoPE-64k-Instruct
✨ Overview
CoPE is a plug-and-play enhancement of RoPE that softly clips the unstable low-frequency components, delivering consistent gains both within the training context and during long-context extrapolation.
With a simple yet effective soft clipping strategy, CoPE:
1️⃣ Eliminates severe OOD outliers: the components whose periods exceed the pre-training context window, which are the primary cause of failures in OOD extrapolation.
2️⃣ Refines Long-range Semantic Signals by alleviating the hidden long-term decay of semantic attention introduced by RoPE.
3️⃣ Prevents Spectral Leakage induced by hard frequency truncation, which otherwise leads to long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
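The contrast between hard truncation and soft clipping can be illustrated with a minimal sketch. It is not the CoPE implementation: the cosine taper, the `width` parameter, and the idea of expressing the clip as a per-frequency weight are all assumptions made for illustration, and the values `base=500000.0` and `train_ctx=8192` are assumed example settings rather than figures taken from this card. The official repository has the actual formulation.

```python
import math

def rope_inv_freqs(head_dim, base=500000.0):
    """Standard RoPE inverse frequencies: theta_i = base^(-2i/d)."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def hard_clip_weights(inv_freqs, train_ctx):
    """Hard truncation: drop every component whose period exceeds train_ctx."""
    return [1.0 if 2.0 * math.pi / f <= train_ctx else 0.0 for f in inv_freqs]

def soft_clip_weights(inv_freqs, train_ctx, width=2.0):
    """Hypothetical soft clip (an assumption, not CoPE's formula).

    Weight is 1 for periods up to train_ctx, then follows a cosine taper in
    log-period space and reaches 0 at width * train_ctx.
    """
    weights = []
    for f in inv_freqs:
        period = 2.0 * math.pi / f
        # Position of this period inside the taper band, clamped to [0, 1].
        t = math.log(period / train_ctx) / math.log(width)
        t = min(max(t, 0.0), 1.0)
        weights.append(0.5 * (1.0 + math.cos(math.pi * t)))
    return weights

if __name__ == "__main__":
    freqs = rope_inv_freqs(head_dim=128)
    hard = hard_clip_weights(freqs, train_ctx=8192)
    soft = soft_clip_weights(freqs, train_ctx=8192)
    for i, (f, h, s) in enumerate(zip(freqs, hard, soft)):
        print(f"{i:3d}  period={2.0 * math.pi / f:12.1f}  hard={h:.0f}  soft={s:.3f}")
```

In this sketch the hard weights jump from 1 to 0 at a single frequency index, while the soft weights pass through intermediate values over a few indices. That abrupt edge is the kind of discontinuity the third point associates with spectral leakage.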
For more details on training and evaluation, please refer to the official GitHub repository.
📖 Citation
@article{li2026cope,
title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
journal={arXiv preprint arXiv:2602.05258},
year={2026}
}