Related papers: CodePlan: Repository-level Coding using LLMs and P…

Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning

Despite the remarkable success of large language models (LLMs) on traditional natural language processing tasks, their planning ability remains a critical bottleneck in tackling complex multi-step reasoning tasks. Existing approaches mainly…

Computation and Language · Computer Science · 2024-10-07 · Jiaxin Wen, Jian Guan, Hongning Wang, Wei Wu, Minlie Huang

A Review of Repository Level Prompting for LLMs

As coding challenges become more complex, recent advancements in Large Language Models (LLMs) have led to notable successes, such as achieving a 94.6% solve rate on the HumanEval benchmark. Concurrently, there is an increasing commercial…

Software Engineering · Computer Science · 2023-12-19 · Douglas Schonholtz

CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion

Repository-level code completion automatically predicts the unfinished code based on the broader information from the repository. Recent strides in Code Large Language Models (code LLMs) have spurred the development of repository-level code…

Computation and Language · Computer Science · 2025-09-22 · Sheng Zhang, Yifan Ding, Shuquan Lian, Shun Song, Hui Li

Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs

Large Language Models (LLMs) have recently made significant advances in code generation through the 'Chain-of-Thought' prompting technique. This technique empowers the model to autonomously devise "solution plans" to tackle intricate…

Software Engineering · Computer Science · 2024-03-21 · Zhihong Sun, Chen Lyu, Bolun Li, Yao Wan, Hongyu Zhang, Ge Li, Zhi Jin

Self-planning Code Generation with Large Language Models

Although large language models (LLMs) have demonstrated impressive ability in code generation, they are still struggling to address the complicated intent provided by humans. It is widely acknowledged that humans typically employ planning…

Software Engineering · Computer Science · 2025-10-21 · Xue Jiang, Yihong Dong, Lecheng Wang, Zheng Fang, Qiwei Shang, Ge Li, Zhi Jin, Wenpin Jiao

Teaching Code LLMs to Use Autocompletion Tools in Repository-Level Code Generation

Code large language models (LLMs) face limitations in repository-level code generation due to their lack of awareness of repository-level dependencies (e.g., user-defined attributes), resulting in dependency errors such as…

Software Engineering · Computer Science · 2024-07-19 · Chong Wang, Jian Zhang, Yebo Feng, Tianlin Li, Weisong Sun, Yang Liu, Xin Peng

CoreCodeBench: Decoupling Code Intelligence via Fine-Grained Repository-Level Tasks

The evaluation of Large Language Models (LLMs) for software engineering has shifted towards complex, repository-level tasks. However, existing benchmarks predominantly rely on coarse-grained pass rates that treat programming proficiency as…

Software Engineering · Computer Science · 2026-01-08 · Lingyue Fu, Hao Guan, Bolun Zhang, Haowei Yuan, Yaoming Zhu, Jun Xu, Zongyu Wang, Lin Qiu, Xunliang Cai, Xuezhi Cao, Weiwen Liu, Weinan Zhang, Yong Yu

RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices

Writing code requires significant time and effort in software development. To automate this process, researchers have made substantial progress using Large Language Models (LLMs) for code generation. Many benchmarks like HumanEval and…

Software Engineering · Computer Science · 2026-04-27 · Jia Li, Hongyi Deng, Yiran Zhang, Kechi Zhang, Tianqi Shao, Tiankuo Zhao, Weinan Wang, Zhi Jin, Ge Li, Yang Liu, Yingtao Fang, Yihong Dong

Repository-level Code Search with Neural Retrieval Methods

This paper presents a multi-stage reranking system for repository-level code search, which leverages the vastly available commit histories of large open-source repositories to aid in bug fixing. We define the task of repository-level code…

Information Retrieval · Computer Science · 2025-02-12 · Siddharth Gandhi, Luyu Gao, Jamie Callan

RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models

Large Language Models (LLMs) have exhibited significant proficiency in code debugging, especially in automatic program repair, which may substantially reduce the time consumption of developers and enhance their efficiency. Significant…

Software Engineering · Computer Science · 2025-09-09 · Jingjing Liu, Zeming Liu, Zihao Cheng, Mengliang He, Xiaoming Shi, Yuhang Guo, Xiangrong Zhu, Yuanfang Guo, Yunhong Wang, Haifeng Wang

RepoTransBench: A Real-World Multilingual Benchmark for Repository-Level Code Translation

Repository-level code translation refers to translating an entire code repository from one programming language to another while preserving the functionality of the source repository. Many benchmarks have been proposed to evaluate the…

Software Engineering · Computer Science · 2025-12-17 · Yanli Wang, Yanlin Wang, Suiquan Wang, Daya Guo, Jiachi Chen, John Grundy, Xilin Liu, Yuchi Ma, Mingzhi Mao, Hongyu Zhang, Zibin Zheng

Interactive and Expressive Code-Augmented Planning with Large Language Models

Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making, but often struggle with complex, long-horizon planning tasks. Recent techniques have sought to structure LLM outputs using…

Computation and Language · Computer Science · 2024-11-22 · Anthony Z. Liu, Xinhe Wang, Jacob Sansom, Yao Fu, Jongwook Choi, Sungryull Sohn, Jaekyeom Kim, Honglak Lee

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Large Language Models (LLMs) for code are rapidly evolving, with code editing emerging as a critical capability. We introduce CodeEditorBench, an evaluation framework designed to rigorously assess the performance of LLMs in code editing…

Software Engineering · Computer Science · 2025-04-09 · Jiawei Guo, Ziming Li, Xueling Liu, Kaijing Ma, Tianyu Zheng, Zhouliang Yu, Ding Pan, Yizhi LI, Ruibo Liu, Yue Wang, Shuyue Guo, Xingwei Qu, Xiang Yue, Ge Zhang, Wenhu Chen, Jie Fu

CATCODER: Repository-Level Code Generation with Relevant Code and Type Context

Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, repository-level code generation presents unique challenges, particularly due to the need to utilize information spread across…

Software Engineering · Computer Science · 2025-11-24 · Zhiyuan Pan, Xing Hu, Xin Xia, Xiaohu Yang

On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing

Recent advancements in code-fluent Large Language Models (LLMs) enabled the research on repository-level code editing. In such tasks, the model navigates and modifies the entire codebase of a project according to request. Hence, such tasks…

Software Engineering · Computer Science · 2024-06-10 · Alexander Kovrigin, Aleksandra Eliseeva, Yaroslav Zharov, Timofey Bryksin

A Survey on Large Language Models for Code Generation

Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks, known as Code LLMs, particularly in code generation that generates source code with LLM from natural language descriptions. This…

Computation and Language · Computer Science · 2025-10-28 · Juyong Jiang, Fan Wang, Jiasi Shen, Sungju Kim, Sunghun Kim

CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules

Large Language Models (LLMs) have already become quite proficient at solving simpler programming tasks like those in HumanEval or MBPP benchmarks. However, solving more complex and competitive programming tasks is still quite challenging…

Artificial Intelligence · Computer Science · 2024-03-15 · Hung Le, Hailin Chen, Amrita Saha, Akash Gokul, Doyen Sahoo, Shafiq Joty

RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing

Code auditing is the process of reviewing code with the aim of identifying bugs. Large Language Models (LLMs) have demonstrated promising capabilities for this task without requiring compilation, while also supporting user-friendly…

Software Engineering · Computer Science · 2025-06-02 · Jinyao Guo, Chengpeng Wang, Xiangzhe Xu, Zian Su, Xiangyu Zhang

CodeTaste: Can LLMs Generate Human-Level Code Refactorings?

Large language model (LLM) coding agents can generate working code, but their solutions often accumulate complexity, duplication, and architectural debt. Human developers address such issues through refactoring: behavior-preserving program…

Software Engineering · Computer Science · 2026-03-05 · Alex Thillen, Niels Mündler, Veselin Raychev, Martin Vechev

AlignCoder: Aligning Retrieval with Target Intent for Repository-Level Code Completion

Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation…

Software Engineering · Computer Science · 2026-01-28 · Tianyue Jiang, Yanli Wang, Yanlin Wang, Daya Guo, Ensheng Shi, Yuchi Ma, Jiachi Chen, Zibin Zheng