# Lingma SWE-GPT: Software Engineering Large Language Model

## Overview

Lingma SWE-GPT is an open-source large language model designed for software engineering tasks. Built on the Qwen series base models, it has undergone additional training on software engineering development process data to strengthen its ability to solve complex software engineering tasks.

## Model Introduction

Lingma SWE-GPT is a specialized model that addresses the challenges specific to software engineering. By combining the capabilities of the Qwen base models with domain-specific knowledge, it aims to provide intelligent assistance across various aspects of software development.

## Model Performance

Lingma SWE-GPT has been evaluated on software engineering tasks with the following results:

- Achieved solution rates of **30.20% (72B) and 18.20% (7B) on the SWE-bench Verified** leaderboard for software engineering agents.
- Outperforms other open-source models of similar scale on software engineering-specific tasks.
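
To put the percentages in concrete terms, the sketch below converts a solution rate into an approximate number of resolved instances. It assumes that the SWE-bench Verified split contains 500 task instances, which this README does not state, and the helper name `resolved_count` is hypothetical.

```python
# Assumption: SWE-bench Verified contains 500 task instances (not stated in this README).
SWE_BENCH_VERIFIED_SIZE = 500

# Solution rates reported above, in percent.
REPORTED_RATES = {"72B": 30.20, "7B": 18.20}


def resolved_count(rate_percent: float, total: int = SWE_BENCH_VERIFIED_SIZE) -> int:
    """Convert a solution rate in percent into a rounded number of resolved instances."""
    return round(rate_percent / 100 * total)


counts = {size: resolved_count(rate) for size, rate in REPORTED_RATES.items()}
for size, n in counts.items():
    print(f"{size}: about {n} of {SWE_BENCH_VERIFIED_SIZE} instances resolved")
```

Under that assumption, the two reported rates correspond to roughly 151 and 91 resolved instances respectively.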

## How to use

### Run on SWE-bench

Refer to https://github.com/LingmaTongyi/Lingma-SWE-GPT

### Quick Start

```
from modelscope import AutoModelForCausalLM, AutoTokenizer

model_name = "Lingma/Lingma-SWE-GPT-7B"

# Load the model and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat-formatted prompt from the system and user messages.
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are Lingma, created by Tongyi Lingma team. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate up to 512 new tokens.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode the generated tokens back into text.
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
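
The list comprehension near the end of the Quick Start removes the prompt tokens from each generated sequence, which suggests that `generate` returns the prompt followed by the new tokens. The sketch below illustrates that slicing with plain Python lists and made-up token IDs, so it runs without downloading a model. `strip_prompt_tokens` is a hypothetical helper, not part of modelscope.

```python
from typing import List, Sequence


def strip_prompt_tokens(
    input_ids: Sequence[Sequence[int]],
    output_ids: Sequence[Sequence[int]],
) -> List[List[int]]:
    """Drop the leading prompt tokens from each generated sequence.

    Hypothetical helper mirroring the list comprehension in the Quick Start.
    """
    return [
        list(generated[len(prompt):])
        for prompt, generated in zip(input_ids, output_ids)
    ]


# Made-up token IDs for illustration only.
prompt_batch = [[101, 7592, 2088]]
generated_batch = [[101, 7592, 2088, 2023, 2003, 102]]

new_tokens = strip_prompt_tokens(prompt_batch, generated_batch)
print(new_tokens)  # [[2023, 2003, 102]]
```

The same slicing applies row by row to real tensors, which is why the Quick Start zips `model_inputs.input_ids` with `generated_ids` before decoding.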

## TODO

Currently, only Python is supported. Future updates will add support for Java, JS/TS, and other languages.

## License

This project is licensed under the GNU General Public License v2.0 (GPL-2.0).

## Contact

For any questions or feedback regarding Lingma SWE-GPT, please contact:

mayingwei.myw@alibaba-inc.com

## Acknowledgments

We would like to thank the Qwen team for their foundational work, which has been instrumental in the development of Lingma SWE-GPT.

## Citation

```
@article{ma2024understand,
  title={How to Understand Whole Software Repository?},
  author={Ma, Yingwei and Yang, Qingping and Cao, Rongyu and Li, Binhua and Huang, Fei and Li, Yongbin},
  journal={arXiv preprint arXiv:2406.01422},
  year={2024}
}
```