Related papers: Improving Code Localization with Repository Memory

200 papers

A prerequisite for coding agents to perform tasks on large repositories is code localization: the identification of relevant files, classes, and functions to work on. While repository-level code localization has been performed using…

Large language model (LLM)-based coding agents achieve impressive results on controlled benchmarks yet routinely produce pull requests that real maintainers reject. The root cause is not functional incorrectness but a lack of organicity:…

Software Engineering · Computer Science 2026-03-30 Mo Li, L. H. Xu, Qitai Tan, Ting Cao, Yunxin Liu

Code localization, identifying precisely where in a codebase changes need to be made, is a fundamental yet challenging task in software maintenance. Existing approaches struggle to efficiently navigate complex codebases when identifying…

Software Engineering · Computer Science 2025-04-30 Zhaoling Chen, Xiangru Tang, Gangda Deng, Fang Wu, Jialong Wu, Zhiwei Jiang, Viktor Prasanna, Arman Cohan, Xingyao Wang

Bug localization remains a critical yet time-consuming challenge in large-scale software repositories. Traditional information retrieval-based bug localization (IRBL) methods rely on unchanged bug descriptions, which often contain noisy…

Software Engineering · Computer Science 2025-12-09 Genevieve Caumartin, Glaucia Melo

Optimizing the performance of large-scale software repositories demands expertise in code reasoning and software engineering (SWE) to reduce runtime while preserving program correctness. However, most benchmarks emphasize what to fix rather…

Code Search is a key task that many programmers often have to perform while developing solutions to problems. Current methodologies suffer from an inability to perform accurately on prompts that contain some ambiguity or ones that require…

Software Engineering · Computer Science 2024-08-22 Sarthak Jain, Aditya Dora, Ka Seng Sam, Prabhat Singh

Large language models (LLMs) exhibit strong performance on self-contained programming tasks. However, they still struggle with repository-level software engineering (SWE), which demands (1) deep codebase navigation with effective context…

Software Engineering · Computer Science 2026-05-27 Kang He, Kaushik Roy

The ultimate goal of code agents is to solve complex tasks autonomously. Although large language models (LLMs) have made substantial progress in code generation, real-world tasks typically demand full-fledged code repositories rather than…

Software Engineering · Computer Science 2025-08-26 Huacan Wang, Ziyi Ni, Shuo Zhang, Shuo Lu, Sen Hu, Ziyang He, Chen Hu, Jiaye Lin, Yifu Guo, Ronghao Chen, Xin Li, Daxin Jiang, Yuntao Du, Pin Lyu

Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase, involve pervasively editing the entire repository of code.…

Large Language Model (LLM) systems have been at the forefront of applied Artificial Intelligence (AI) research in a multitude of domains. One such domain is software development, where researchers have pushed the automation of a number of…

Software Engineering · Computer Science 2025-08-08 Vali Tawosi, Salwa Alamir, Xiaomo Liu, Manuela Veloso

Automated program repair (APR) has recently shifted toward large language models and agent-based systems, yet most systems rely on local snapshot context, overlooking repository history. Prior work shows that repository history helps repair…

Software Engineering · Computer Science 2026-04-03 Yu Shi, Hao Li, Bram Adams, Ahmed E. Hassan

Repository-level coding agents must first localize the files and symbols relevant to a task; failures at this stage can cascade across downstream objectives ranging from patch generation to test writing and codebase question answering.…

Information Retrieval · Computer Science 2026-05-19 Yuntong Hu, Tongli Su, Liang Zhao, Bowen Zhu, Hasibul Haque

Issue localization, which identifies faulty code elements such as files or functions, is critical for effective bug fixing. While recent LLM-based and LLM-agent-based approaches improve accuracy, they struggle in large-scale repositories…

Software Engineering · Computer Science 2025-10-07 Ying Wang, Wenjun Mao, Chong Wang, Zhenhao Zhou, Yicheng Zhou, Wenyun Zhao, Yiling Lou, Xin Peng

Large Language Model (LLM)-based coding agents have shown promising results on coding benchmarks, but their effectiveness on systems code remains underexplored. Due to the size and complexities of systems code, making changes to a systems…

Software Engineering · Computer Science 2026-05-21 Ramneet Singh, Sathvik Joel, Abhav Mehrotra, Nalin Wadhwa, Ramakrishna B Bairi, Aditya Kanade, Nagarajan Natarajan

Software issue localization, the task of identifying the precise code locations (files, classes, or functions) relevant to a natural language issue description (e.g., bug report, feature request), is a critical yet time-consuming aspect of…

Software Engineering · Computer Science 2026-04-23 Revanth Gangi Reddy, Tarun Suresh, JaeHyeok Doo, Ye Liu, Xuan Phi Nguyen, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Heng Ji, Shafiq Joty

Code agents increasingly help developers work with unfamiliar repositories, but every such task depends on a costly prerequisite: bootstrapping the repository into a usable development state. This process requires substantial…

Software Engineering · Computer Science 2026-05-18 Sihan Fu, Oucheng Liu, Shiyuan Wang, Jin Shi, Chengkun Wei

Repository-aware code translation is critical for modernizing legacy systems, enhancing maintainability, and enabling interoperability across diverse programming languages. While recent advances in large language models (LLMs) have improved…

Software Engineering · Computer Science 2025-08-26 Ziqi Guan, Xin Yin, Zhiyuan Peng, Chao Ni

Generative models have demonstrated considerable potential in software engineering, particularly in tasks such as code generation and debugging. However, their utilization in the domain of code documentation generation remains…

Computation and Language · Computer Science 2024-02-27 Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Yujia Qin, Yaxi Lu, Yesai Wu, Xin Cong, Yankai Lin, Yingli Zhang, Xiaoyin Che, Zhiyuan Liu, Maosong Sun

Large Language Models (LLMs) have enabled intelligent agents that autonomously interact with environments and invoke external tools. Recently, agent-based software repair has drawn wide attention, as repair agents can localize bugs,…

Software Engineering · Computer Science 2026-05-26 Quanjun Zhang, Chengyu Gao, Yu Han, Ye Shang, Chunrong Fang, Zhenyu Chen, Liang Xiao

This paper presents a multi-stage reranking system for repository-level code search, which leverages the vastly available commit histories of large open-source repositories to aid in bug fixing. We define the task of repository-level code…

Information Retrieval · Computer Science 2025-02-12 Siddharth Gandhi, Luyu Gao, Jamie Callan