Related papers: RLCoder: Reinforcement Learning for Repository-Lev…

200 papers

Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation…

Software Engineering · Computer Science 2026-01-28 Tianyue Jiang, Yanli Wang, Yanlin Wang, Daya Guo, Ensheng Shi, Yuchi Ma, Jiachi Chen, Zibin Zheng

The task of repository-level code completion is to continue writing the unfinished code based on a broader context of the repository. While for automated code completion tools, it is difficult to utilize the useful information scattered in…

Computation and Language · Computer Science 2023-10-23 Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen

Code completion models have made significant progress in recent years. Recently, repository-level code completion has drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed.…

The performance of repository-level code completion depends upon the effective leverage of both general and repository-specific knowledge. Despite the impressive capability of code LLMs in general code completion tasks, they often exhibit…

Software Engineering · Computer Science 2024-09-16 Wei Liu, Ailun Yu, Daoguang Zan, Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Qianxiang Wang

Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a…

Software Engineering · Computer Science 2024-06-05 Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma

Repository-level code generation has attracted growing attention in recent years. Unlike function-level code generation, it requires the model to understand the entire repository, reasoning over complex dependencies across functions,…

Software Engineering · Computer Science 2026-05-07 Chao Hu, Wenhao Zeng, Yuling Shi, Beijun Shen, Xiaodong Gu

Repository-level code completion automatically predicts the unfinished code based on the broader information from the repository. Recent strides in Code Large Language Models (code LLMs) have spurred the development of repository-level code…

Computation and Language · Computer Science 2025-09-22 Sheng Zhang, Yifan Ding, Shuquan Lian, Shun Song, Hui Li

Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, repository-level code generation presents unique challenges, particularly due to the need to utilize information spread across…

Software Engineering · Computer Science 2025-11-24 Zhiyuan Pan, Xing Hu, Xin Xia, Xiaohu Yang

While Large Language Models (LLMs) have revolutionized code generation, standard "System 1" approaches that generate solutions in a single forward pass often hit a performance ceiling on complex algorithmic tasks. Existing iterative…

Computation and Language · Computer Science 2026-04-21 Juyong Jiang, Jiasi Shen, Sunghun Kim, Kang Min Yoo, Jeonghoon Kim, Sungju Kim

Large Language Models (LLMs) exhibit remarkable code generation capabilities but falter when adapting to frequent updates in external library APIs. This critical limitation, stemming from reliance on outdated API knowledge from their…

Computation and Language · Computer Science 2025-11-25 Haoze Wu, Yunzhi Yao, Wenhao Yu, Ningyu Zhang

The utilization of programming language (PL) models, pre-trained on large-scale code corpora, as a means of automating software engineering processes has demonstrated considerable potential in streamlining various code generation tasks such…

Machine Learning · Computer Science 2023-07-21 Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy

Large language models (LLMs) have achieved substantial progress in repository-level code generation. However, solving the same repository-level task often requires multiple attempts, while existing methods still optimize each attempt in…

Software Engineering · Computer Science 2026-04-07 Ruwei Pan, Jiangshuai Wang, Qisheng Zhang, Yueheng Zhu, Linhao Wu, Zixiong Yang, Yakun Zhang, Lu Zhang, Hongyu Zhang

Training LLMs for code-related tasks typically depends on high-quality code-documentation pairs, which are costly to curate and often scarce for niche programming languages. We introduce BatCoder, a self-supervised reinforcement learning…

The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code…

Despite Retrieval-Augmented Generation improving code completion, traditional retrieval methods struggle with information redundancy and a lack of diversity within limited context windows. To solve this, we propose a resource-optimized…

Software Engineering · Computer Science 2025-10-14 Xiaohan Chen, Zhongying Pan, Quan Feng, Yu Tian, Shuqun Yang, Mengru Wang, Lina Gong, Yuxia Geng, Piji Li, Xiang Chen

Large Language Models (LLMs) have achieved remarkable success in code completion, as evidenced by their essential roles in developing code assistant services such as Copilot. Being trained on in-file contexts, current LLMs are quite…

Software Engineering · Computer Science 2024-02-20 Yichen Li, Yun Peng, Yintong Huo, Michael R. Lyu

Multimodal document retrieval systems enable information access across text, images, and layouts, benefiting various domains like document-based question answering, report analysis, and interactive content summarization. Rerankers improve…

Artificial Intelligence · Computer Science 2025-06-24 Mingjun Xu, Jinhan Dong, Jue Hou, Zehui Wang, Sihang Li, Zhifeng Gao, Renxin Zhong, Hengxing Cai

Large Language Models (LLMs) excel at general code generation, but their performance drops sharply in enterprise settings that rely on internal private libraries absent from public pre-training corpora. While Retrieval-Augmented Generation…

Software Engineering · Computer Science 2026-04-28 Mofei Li, Taozhi Chen, Guowei Yang, Jia Li

Code generation is a latency-sensitive task that demands high timeliness. However, with the growing interest and inherent difficulty in repository-level code generation, most existing code generation studies focus on improving the…

Artificial Intelligence · Computer Science 2025-10-01 Qianhui Zhao, Li Zhang, Fang Liu, Xiaoli Lian, Qiaoyuanhe Meng, Ziqian Jiao, Zetong Zhou, Jia Li, Lin Shi

Recently, the use of large language models (LLMs) for software code generation, e.g., C/C++ and Python, has proven a great success. However, LLMs still suffer from low syntactic and functional correctness when it comes to the generation of…

Hardware Architecture · Computer Science 2024-07-29 Mingzhe Gao, Jieru Zhao, Zhe Lin, Wenchao Ding, Xiaofeng Hou, Yu Feng, Chao Li, Minyi Guo