Related papers: LocAgent: Graph-Guided LLM Agents for Code Localiz…

GraphCodeAgent: Dual Graph-Guided LLM Agent for Retrieval-Augmented Repo-Level Code Generation

Writing code requires significant time and effort in software development. To automate this process, researchers have made substantial progress in code generation. Recently, large language models (LLMs) have demonstrated remarkable…

Software Engineering · Computer Science 2025-11-19 Jia Li, Xianjie Shi, Kechi Zhang, Ge Li, Zhi Jin, Lei Li, Huangzhao Zhang, Jia Li, Fang Liu, Yuwei Zhang, Zhengwei Tao, Yihong Dong, Yuqi Zhu, Chongyang Tao

CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases

Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories. This challenge has prompted research on enhancing LLM-codebase interaction at a repository scale.…

Software Engineering · Computer Science 2024-08-13 Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou

TransAgent: Enhancing LLM-Based Code Translation via Fine-Grained Execution Alignment

Code translation transforms code between programming languages while preserving functionality, which is critical in software development and maintenance. While traditional learning-based code translation methods have limited effectiveness…

Software Engineering · Computer Science 2026-04-08 Zhiqiang Yuan, Weitong Chen, Hanlin Wang, Xin Peng, Zhenpeng Chen, Yiling Lou

LARGER: Lexically Anchored Repository Graph Exploration and Retrieval

Repository-level coding agents must first localize the files and symbols relevant to a task; failures at this stage can cascade across downstream objectives ranging from patch generation to test writing and codebase question answering.…

Information Retrieval · Computer Science 2026-05-19 Yuntong Hu, Tongli Su, Liang Zhao, Bowen Zhu, Hasibul Haque

DomAgent: Leveraging Knowledge Graphs and Case-Based Reasoning for Domain-Specific Code Generation

Large language models (LLMs) have shown impressive capabilities in code generation. However, because most LLMs are trained on public domain corpora, directly applying them to real-world software development often yields low success rates,…

Artificial Intelligence · Computer Science 2026-03-26 Shuai Wang, Dhasarathy Parthasarathy, Robert Feldt, Yinan Yu

An LLM Agent for Automatic Geospatial Data Analysis

Large language models (LLMs) are being used in data science code generation tasks, but they often struggle with complex sequential tasks, leading to logical errors. Their application to geospatial data processing is particularly challenging…

Computers and Society · Computer Science 2024-10-28 Yuxing Chen, Weijie Wang, Sylvain Lobry, Camille Kurtz

SweRank: Software Issue Localization with Code Ranking

Software issue localization, the task of identifying the precise code locations (files, classes, or functions) relevant to a natural language issue description (e.g., bug report, feature request), is a critical yet time-consuming aspect of…

Software Engineering · Computer Science 2026-04-23 Revanth Gangi Reddy, Tarun Suresh, JaeHyeok Doo, Ye Liu, Xuan Phi Nguyen, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Heng Ji, Shafiq Joty

CodeAgent: Autonomous Communicative Agents for Code Review

Code review, which aims at ensuring the overall quality and reliability of software, is a cornerstone of software development. Unfortunately, while crucial, code review is a labor-intensive process that the research community is looking to…

Software Engineering · Computer Science 2024-09-26 Xunzhu Tang, Kisub Kim, Yewei Song, Cedric Lothritz, Bei Li, Saad Ezzini, Haoye Tian, Jacques Klein, Tegawende F. Bissyande

Improving Code Localization with Repository Memory

Code localization is a fundamental challenge in repository-level software engineering tasks such as bug fixing. While existing methods equip language agents with comprehensive tools/interfaces to fetch information from the repository, they…

Software Engineering · Computer Science 2026-02-10 Boshi Wang, Weijian Xu, Yunsheng Li, Mei Gao, Yujia Xie, Huan Sun, Dongdong Chen

ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies

In this paper we introduce ResearchCodeAgent, a novel multi-agent system leveraging large language model (LLM) agents to automate the codification of research methodologies described in machine learning literature. The system bridges the…

Software Engineering · Computer Science 2025-05-06 Shubham Gandhi, Dhruv Shah, Manasi Patwardhan, Lovekesh Vig, Gautam Shroff

Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks

Recent advances in Large Language Models (LLMs) have shown promise in function-level code generation, yet repository-level software engineering tasks remain challenging. Current solutions predominantly rely on proprietary LLM agents, which…

Software Engineering · Computer Science 2025-06-25 Hongyuan Tao, Ying Zhang, Zhenhao Tang, Hongen Peng, Xukun Zhu, Bingchang Liu, Yingguang Yang, Ziyin Zhang, Zhaogui Xu, Haipeng Zhang, Linchao Zhu, Rui Wang, Hang Yu, Jianguo Li, Peng Di

Multi-CoLoR: Context-Aware Localization and Reasoning across Multi-Language Codebases

Large language models demonstrate strong capabilities in code generation but struggle to navigate complex, multi-language repositories to locate relevant code. Effective code localization requires understanding both organizational context…

Software Engineering · Computer Science 2026-02-24 Indira Vats, Sanjukta De, Subhayan Roy, Saurabh Bodhe, Lejin Varghese, Max Kiehn, Yonas Bedasso, Marsha Chechik

Understanding Codebase like a Professional! Human-AI Collaboration for Code Comprehension

Understanding an unfamiliar codebase is an essential task for developers in various scenarios, such as during the onboarding process. Especially when the codebase is large and time is limited, achieving a decent level of comprehension…

Human-Computer Interaction · Computer Science 2026-02-16 Jie Gao, Yue Xue, Xiaofei Xie, SoeMin Thant, Erika Lee, Bowen Xu

CodeNav: Beyond tool-use to using real-world codebases with LLM agents

We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. In contrast to tool-use LLM agents that require "registration" of all relevant tools via manual descriptions within…

Artificial Intelligence · Computer Science 2024-06-19 Tanmay Gupta, Luca Weihs, Aniruddha Kembhavi

CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents

A prerequisite for coding agents to perform tasks on large repositories is code localization: the identification of relevant files, classes, and functions to work on. While repository-level code localization has been performed using…

Software Engineering · Computer Science 2026-03-19 Lintang Sutawika, Aditya Bharat Soni, Bharath Sriraam R R, Apurva Gandhi, Taha Yassine, Sanidhya Vijayvargiya, Yuchen Li, Xuhui Zhou, Yilin Zhang, Leander Melroy Maben, Graham Neubig

Meta-RAG on Large Codebases Using Code Summarization

Large Language Model (LLM) systems have been at the forefront of applied Artificial Intelligence (AI) research in a multitude of domains. One such domain is software development, where researchers have pushed the automation of a number of…

Software Engineering · Computer Science 2025-08-08 Vali Tawosi, Salwa Alamir, Xiaomo Liu, Manuela Veloso

LogicLens: Leveraging Semantic Code Graph to explore Multi Repository large systems

Understanding large software systems is a challenging task, especially when code is distributed across multiple repositories and microservices. Developers often need to reason not only about the structure of the code, but also about its…

Software Engineering · Computer Science 2026-01-19 Niko Usai, Dario Montagnini, Kristian Ilianov Iliev, Raffaele Camanzo

GraphCogent: Mitigating LLMs' Working Memory Constraints via Multi-Agent Collaboration in Complex Graph Understanding

Large language models (LLMs) show promising performance on small-scale graph reasoning tasks but fail when handling real-world graphs with complex queries. This phenomenon arises from LLMs' working memory constraints, which result in their…

Artificial Intelligence · Computer Science 2025-10-01 Rongzheng Wang, Shuang Liang, Qizhi Chen, Yihong Huang, Muquan Li, Yizhuo Ma, Dongyang Zhang, Ke Qin, Man-Fai Leung

GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration

Graphs are widely used for modeling relational data in real-world scenarios, such as social networks and urban computing. Existing LLM-based graph analysis approaches either integrate graph neural networks (GNNs) for specific machine…

Artificial Intelligence · Computer Science 2025-11-04 Xin Li, Qizhi Chu, Yubin Chen, Yang Liu, Yaoqi Liu, Zekai Yu, Weize Chen, Chen Qian, Chuan Shi, Cheng Yang

FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation

Assisting non-expert users to develop complex interactive websites has become a popular task for LLM-powered code agents. However, existing code agents tend to only generate frontend web pages, masking the lack of real full-stack data…

Software Engineering · Computer Science 2026-02-04 Zimu Lu, Houxing Ren, Yunqiao Yang, Ke Wang, Zhuofan Zong, Mingjie Zhan, Hongsheng Li