Related papers: Completion by Comprehension: Guiding Code Generati…

ReACC: A Retrieval-Augmented Code Completion Framework

Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has proved that statistical language modeling with transformers can greatly…

Software Engineering · Computer Science 2022-03-16 Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, Alexey Svyatkovskiy

Impact-driven Context Filtering For Cross-file Code Completion

Retrieval-augmented generation (RAG) has recently demonstrated considerable potential for repository-level code completion, as it integrates cross-file knowledge with in-file preceding code to provide comprehensive contexts for generation.…

Software Engineering · Computer Science 2025-08-11 Yanzhou Li, Shangqing Liu, Kangjie Chen, Tianwei Zhang, Yang Liu

Repoformer: Selective Retrieval for Repository-Level Code Completion

Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a…

Software Engineering · Computer Science 2024-06-05 Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma

Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion

Recent years have witnessed the deployment of code language models (LMs) in various code intelligence tasks such as code completion. Yet, it is challenging for pre-trained LMs to generate correct completions in private repositories.…

Software Engineering · Computer Science 2024-05-31 Wei Cheng, Yuhan Wu, Wei Hu

CoFE-RAG: A Comprehensive Full-chain Evaluation Framework for Retrieval-Augmented Generation with Enhanced Data Diversity

Retrieval-Augmented Generation (RAG) aims to enhance large language models (LLMs) to generate more accurate and reliable answers with the help of the retrieved context from external knowledge sources, thereby reducing the incidence of…

Computation and Language · Computer Science 2024-10-17 Jintao Liu, Ruixue Ding, Linhao Zhang, Pengjun Xie, Fie Huang

GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model

The performance of repository-level code completion depends upon the effective leverage of both general and repository-specific knowledge. Despite the impressive capability of code LLMs in general code completion tasks, they often exhibit…

Software Engineering · Computer Science 2024-09-16 Wei Liu, Ailun Yu, Daoguang Zan, Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Qianxiang Wang

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation

The task of repository-level code completion is to continue writing the unfinished code based on a broader context of the repository. While for automated code completion tools, it is difficult to utilize the useful information scattered in…

Computation and Language · Computer Science 2023-10-23 Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen

CodeKGC: Code Language Model for Generative Knowledge Graph Construction

Current generative knowledge graph construction approaches usually fail to capture structural knowledge by simply flattening natural language into serialized texts or a specification language. However, large generative language model…

Computation and Language · Computer Science 2024-01-19 Zhen Bi, Jing Chen, Yinuo Jiang, Feiyu Xiong, Wei Guo, Huajun Chen, Ningyu Zhang

CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion

Repository-level code completion automatically predicts the unfinished code based on the broader information from the repository. Recent strides in Code Large Language Models (code LLMs) have spurred the development of repository-level code…

Computation and Language · Computer Science 2025-09-22 Sheng Zhang, Yifan Ding, Shuquan Lian, Shun Song, Hui Li

CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing

Large Language Models have revolutionized code generation ability by converting natural language descriptions into executable code. However, generating complex code within real-world scenarios remains challenging due to intricate…

Software Engineering · Computer Science 2024-10-15 Xinyi He, Jiaru Zou, Yun Lin, Mengyu Zhou, Shi Han, Zejian Yuan, Dongmei Zhang

Prompt-based Code Completion via Multi-Retrieval Augmented Generation

Automated code completion, which aims to generate subsequent tokens from unfinished code, has benefited significantly from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence…

Software Engineering · Computer Science 2024-05-14 Hanzhuo Tan, Qi Luo, Ling Jiang, Zizheng Zhan, Jing Li, Haotian Zhang, Yuqun Zhang

CoCR-RAG: Enhancing Retrieval-Augmented Generation in Web Q&A via Concept-oriented Context Reconstruction

Retrieval-augmented generation (RAG) has shown promising results in enhancing Q&A by incorporating information from the web and other external sources. However, the supporting documents retrieved from the heterogeneous web often originate…

Computation and Language · Computer Science 2026-03-26 Kaize Shi, Xueyao Sun, Qika Lin, Firoj Alam, Qing Li, Xiaohui Tao, Guandong Xu

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Recent advancements in Unified Multimodal Models (UMMs) have significantly advanced text-to-image (T2I) generation, particularly through the integration of Chain-of-Thought (CoT) reasoning. However, existing CoT-based T2I methods largely…

Artificial Intelligence · Computer Science 2026-03-10 Haodong Li, Chunmei Qing, Huanyu Zhang, Dongzhi Jiang, Yihang Zou, Hongbo Peng, Dingming Li, Yuhong Dai, ZePeng Lin, Juanxi Tian, Yi Zhou, Siqi Dai, Jingwei Wu

RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion

Code Large Language Models (CodeLLMs) have demonstrated impressive proficiency in code completion tasks. However, they often fall short of fully understanding the extensive context of a project repository, such as the intricacies of…

Software Engineering · Computer Science 2024-08-15 Huy N. Phan, Hoang N. Phan, Tien N. Nguyen, Nghi D. Q. Bui

GrepRAG: An Empirical Study and Optimization of Grep-Like Retrieval for Code Completion

Repository-level code completion remains challenging for large language models (LLMs) due to cross-file dependencies and limited context windows. Prior work addresses this challenge using Retrieval-Augmented Generation (RAG) frameworks…

Software Engineering · Computer Science 2026-02-10 Baoyi Wang, Xingliang Wang, Guochang Li, Chen Zhi, Junxiao Han, Xinkui Zhao, Nan Wang, Shuiguang Deng, Jianwei Yin

Context-Augmented Code Generation Using Programming Knowledge Graphs

Large Language Models (LLMs) excel at code generation but struggle with complex problems. Retrieval-Augmented Generation (RAG) mitigates this issue by integrating external knowledge, yet retrieval models often miss relevant context, and…

Software Engineering · Computer Science 2026-01-29 Shahd Seddik, Fahd Seddik, Iman Saberi, Fatemeh Fard, Minh Hieu Huynh, Patanamon Thongtanunam

CoSe-Co: Text Conditioned Generative CommonSense Contextualizer

Pre-trained Language Models (PTLMs) have been shown to perform well on natural language tasks. Many prior works have leveraged structured commonsense present in the form of entities linked through labeled relations in Knowledge Graphs (KGs)…

Computation and Language · Computer Science 2022-06-20 Rachit Bansal, Milan Aggarwal, Sumit Bhatia, Jivat Neet Kaur, Balaji Krishnamurthy

Synergistic Integration and Discrepancy Resolution of Contextualized Knowledge for Personalized Recommendation

The integration of large language models (LLMs) into recommendation systems has revealed promising potential through their capacity to extract world knowledge for enhanced reasoning capabilities. However, current methodologies that adopt…

Information Retrieval · Computer Science 2025-10-17 Lingyu Mu, Hao Deng, Haibo Xing, Kaican Lin, Zhitong Zhu, Yu Zhang, Xiaoyi Zeng, Zhengxiao Liu, Zheng Lin, Jinxin Hu

Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search

In code search, the Generation-Augmented Retrieval (GAR) framework, which generates exemplar code snippets to augment queries, has emerged as a promising strategy to address the principal challenge of modality misalignment between code…

Software Engineering · Computer Science 2024-06-04 Haochen Li, Xin Zhou, Zhiqi Shen

Context-Augmented Code Generation Using Programming Knowledge Graphs

Large Language Models (LLMs) and Code-LLMs (CLLMs) have significantly improved code generation, but they frequently face difficulties when dealing with challenging and complex problems. Retrieval-Augmented Generation (RAG) addresses this…

Software Engineering · Computer Science 2025-06-17 Iman Saberi, Fatemeh Fard