Initialize project; model provided by the ModelHub XC community
Model: KDEGroup/SWE-AGILE-RL-8B Source: Original Platform
48
README.md
Normal file
@@ -0,0 +1,48 @@
---
tags:
- agent
pipeline_tag: text-generation
library_name: transformers
---

# SWE-AGILE

## 📣 News

[2026/02/23] SWE-AGILE has been accepted to Findings of ACL 2026.

<font size=4><div align='center' > [[📖 Paper](https://huggingface.co/papers/2604.11716)] [[🤗 Checkpoints](https://huggingface.co/KDEGroup)] [[🤗 Daily Paper](https://huggingface.co/papers/2604.11716)] [[🚀 Github](https://github.com/KDEGroup/SWE-AGILE)]</div></font>

## 🔥 Overview

Prior approaches typically lack the **explicit System-2 reasoning** required for deep analysis. Recent reasoning models demonstrate the potential of extended Chain-of-Thought (CoT), but applying them to multi-turn tasks creates a dilemma: retaining the full reasoning history leads to **context explosion**, while discarding it causes **redundant re-reasoning**.

We propose SWE-AGILE, a novel software agent framework designed to balance reasoning depth, efficiency, and context constraints. SWE-AGILE introduces a Dynamic Reasoning Context strategy: it maintains a “sliding window” of detailed reasoning for immediate continuity, which avoids redundant re-analysis, and compresses older reasoning content into concise Reasoning Digests. The strategy is realized through **Backfilling Data Synthesis**, **Trajectory Snapshot Training**, and **Compression-Aware Optimization**.
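
To make the sliding-window idea concrete, here is a minimal sketch of how a context could be assembled from past turns. It is an illustration under stated assumptions, not the SWE-AGILE implementation: the `Turn` class, its field names, the `build_context` function, and the `window` parameter are all hypothetical, and the digests are assumed to already exist for each turn.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One agent step. All field names are hypothetical, for illustration only."""
    reasoning: str    # full, detailed reasoning produced at this step
    digest: str       # short summary of that reasoning (assumed precomputed)
    action: str       # tool call or command issued by the agent
    observation: str  # feedback returned by the environment


def build_context(turns: list[Turn], window: int = 1) -> list[str]:
    """Render the history, keeping full reasoning only for the last `window` turns.

    Older turns contribute their digest instead, so the context grows slowly
    while the most recent reasoning stays available in full detail.
    """
    cutoff = max(len(turns) - window, 0)
    rendered = []
    for i, turn in enumerate(turns):
        thought = turn.reasoning if i >= cutoff else turn.digest
        rendered.append(
            f"[thought] {thought}\n"
            f"[action] {turn.action}\n"
            f"[observation] {turn.observation}"
        )
    return rendered
```

With `window=1`, only the latest turn keeps its detailed reasoning and every earlier turn is represented by its digest; a larger window trades context length for more retained detail.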

Our current paradigm reduces redundant state reconstruction only implicitly. A promising direction for enforcing this efficiency more strictly is to monitor the reasoning content quantitatively. By computing the embedding similarity between consecutive reasoning steps, or by employing an LLM-as-a-Judge, future iterations could explicitly filter out repetitive SFT trajectories or design targeted RLVR penalties, improving cognitive efficiency further.
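
As a rough illustration of the similarity-based monitoring mentioned above, the sketch below flags steps whose reasoning embedding is nearly identical to that of the previous step. Everything here is an assumption made for the example: the function names, the `0.95` threshold, and the premise that each reasoning step has already been embedded by some sentence-embedding model of your choice. It describes a possible future direction, not something the released model does.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_repetitive_steps(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.95
) -> list[int]:
    """Return indices of steps that closely repeat the step just before them.

    `threshold` is an arbitrary illustrative value, not a tuned setting.
    """
    return [
        i
        for i in range(1, len(embeddings))
        if cosine_similarity(embeddings[i - 1], embeddings[i]) >= threshold
    ]
```

The flagged indices could then feed a trajectory filter for SFT data or a penalty term during RL training; how to weight such a signal is left open here.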

## ⭐️ Citation

If you find this project useful, please cite our work:

```bibtex
@misc{lian2026sweagilesoftwareagentframework,
      title={SWE-AGILE: A Software Agent Framework for Efficiently Managing Dynamic Reasoning Context},
      author={Shuquan Lian and Juncheng Liu and Yazhe Chen and Yuhong Chen and Hui Li},
      year={2026},
      eprint={2604.11716},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.11716},
}
```

## 🤝 Acknowledgements

We sincerely thank the projects [R2E-Gym/R2E-Gym](https://github.com/R2E-Gym/R2E-Gym) and [rllm-org/rllm](https://github.com/rllm-org/rllm) for providing their open-source resources.