Related papers: STALL+: Boosting LLM-based Repository-level Code C…

Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements

This study examined code issue detection and revision automation by integrating Large Language Models (LLMs) such as OpenAI's GPT-3.5 Turbo and GPT-4o into software development workflows. A static code analysis framework detects issues such…

Software Engineering · Computer Science 2025-06-13 Seyed Moein Abtahi, Akramul Azim

CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion

Repository-level code completion automatically predicts the unfinished code based on the broader information from the repository. Recent strides in Code Large Language Models (code LLMs) have spurred the development of repository-level code…

Computation and Language · Computer Science 2025-09-22 Sheng Zhang, Yifan Ding, Shuquan Lian, Shun Song, Hui Li

Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches

Recent advances in large language models (LLMs) have significantly improved automated code generation. While existing approaches have achieved strong performance at the function and file levels, real-world software engineering requires…

Software Engineering · Computer Science 2026-05-21 Yicheng Tao, Yuante Li, Yao Qin, Yepang Liu

Combining Static Code Analysis and Large Language Models Improves Correctness and Performance of Algorithm Recognition

Context: Since it is well-established that developers spend a substantial portion of their time understanding source code, the ability to automatically identify algorithms within source code presents a valuable opportunity. This capability…

Software Engineering · Computer Science 2026-04-06 Denis Neumüller, Sebastian Boll, David Schüler, Matthias Tichy

Repoformer: Selective Retrieval for Repository-Level Code Completion

Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a…

Software Engineering · Computer Science 2024-06-05 Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma

AlignCoder: Aligning Retrieval with Target Intent for Repository-Level Code Completion

Repository-level code completion remains a challenging task for existing code large language models (code LLMs) due to their limited understanding of repository-specific context and domain knowledge. While retrieval-augmented generation…

Software Engineering · Computer Science 2026-01-28 Tianyue Jiang, Yanli Wang, Yanlin Wang, Daya Guo, Ensheng Shi, Yuchi Ma, Jiachi Chen, Zibin Zheng

When Retrieval Hurts Code Completion: A Diagnostic Study of Stale Repository Context

Context: Retrieval-augmented code generation relies on cross-file repository context, but retrieved snippets may come from obsolete project states. Objectives: We study whether temporally stale repository snippets act as harmless noise or…

Software Engineering · Computer Science 2026-05-15 Haojun Weng, Qianqian Yang, Hao Fu, Haobin Pan, Xinwei Lv

LLM-Based Static Verification of Code Against Natural-Language Requirements: An Industrial Experience Report

Large language models (LLMs) are increasingly used to generate requirements specifications, design documents, code, and test cases. In contrast, much less attention has been given to a more difficult assurance problem: statically verifying…

Software Engineering · Computer Science 2026-05-19 Zhi Quan Zhou, Dave Towey, Tsong Yueh Chen

GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model

The performance of repository-level code completion depends upon the effective leverage of both general and repository-specific knowledge. Despite the impressive capability of code LLMs in general code completion tasks, they often exhibit…

Software Engineering · Computer Science 2024-09-16 Wei Liu, Ailun Yu, Daoguang Zan, Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Qianxiang Wang

Frustrated with Code Quality Issues? LLMs can Help!

As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality…

Artificial Intelligence · Computer Science 2023-09-25 Nalin Wadhwa, Jui Pradhan, Atharv Sonwane, Surya Prakash Sahu, Nagarajan Natarajan, Aditya Kanade, Suresh Parthasarathy, Sriram Rajamani

Do Code LLMs Do Static Analysis?

This paper investigates code LLMs' capability of static analysis during code intelligence tasks such as code summarization and generation. Code LLMs are now household names for their abilities to do some programming tasks that have…

Software Engineering · Computer Science 2026-03-27 Chia-Yi Su, Collin McMillan

Static Analysis as a Feedback Loop: Enhancing LLM-Generated Code Beyond Correctness

Large language models (LLMs) have demonstrated impressive capabilities in code generation, achieving high scores on benchmarks such as HumanEval and MBPP. However, these benchmarks primarily assess functional correctness and neglect broader…

Software Engineering · Computer Science 2025-08-21 Scott Blyth, Sherlock A. Licorish, Christoph Treude, Markus Wagner

Combining Code Embedding with Static Analysis for Function-Call Completion

Code completion is an important feature of integrated development environments (IDEs). It allows developers to produce code faster, especially novice ones who are not fully familiar with APIs and others' code. Previous works on code…

Software Engineering · Computer Science 2020-11-03 M. Weyssow, H. Sahraoui, B. Frénay, B. Vanderose

A Deep Dive into Retrieval-Augmented Generation for Code Completion: Experience on WeChat

Code completion, a crucial task in software engineering that enhances developer productivity, has seen substantial improvements with the rapid advancement of large language models (LLMs). In recent years, retrieval-augmented generation…

Software Engineering · Computer Science 2025-07-25 Zezhou Yang, Ting Peng, Cuiyun Gao, Chaozheng Wang, Hailiang Huang, Yuetang Deng

Retrieval-augmented code completion for local projects using large language models

The use of large language models (LLMs) is becoming increasingly widespread among software developers. However, privacy and computational requirements are problematic with commercial solutions and the use of LLMs. In this work, we focus on…

Software Engineering · Computer Science 2025-06-17 Marko Hostnik, Marko Robnik-Šikonja

Exploring Code Analysis: Zero-Shot Insights on Syntax and Semantics with LLMs

Code analysis is fundamental in Software Engineering, supporting debugging, optimization, and security assessment. Human developers approach it through syntax parsing, static semantics inference, and dynamic reasoning. Traditional tools are…

Software Engineering · Computer Science 2026-05-22 Wei Ma, Zhihao Lin, Shangqing Liu, Qiang Hu, Ye Liu, Wenhan Wang, Cen Zhang, Liming Nie, Li Li, Yang Liu, Lingxiao Jiang

GraphCodeAgent: Dual Graph-Guided LLM Agent for Retrieval-Augmented Repo-Level Code Generation

Writing code requires significant time and effort in software development. To automate this process, researchers have made substantial progress for code generation. Recently, large language models (LLMs) have demonstrated remarkable…

Software Engineering · Computer Science 2025-11-19 Jia Li, Xianjie Shi, Kechi Zhang, Ge Li, Zhi Jin, Lei Li, Huangzhao Zhang, Jia Li, Fang Liu, Yuwei Zhang, Zhengwei Tao, Yihong Dong, Yuqi Zhu, Chongyang Tao

When LLMs Meet API Documentation: Can Retrieval Augmentation Aid Code Generation Just as It Helps Developers?

Retrieval-augmented generation (RAG) has increasingly shown its power in extending large language models' (LLMs') capability beyond their pre-trained knowledge. Existing works have shown that RAG can help with software development tasks…

Software Engineering · Computer Science 2025-03-20 Jingyi Chen, Songqiang Chen, Jialun Cao, Jiasi Shen, Shing-Chi Cheung

Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion

Recent years have witnessed the deployment of code language models (LMs) in various code intelligence tasks such as code completion. Yet, it is challenging for pre-trained LMs to generate correct completions in private repositories.…

Software Engineering · Computer Science 2024-05-31 Wei Cheng, Yuhan Wu, Wei Hu

Failure-Aware Enhancements for Large Language Model (LLM) Code Generation: An Empirical Study on Decision Framework

Large language models (LLMs) show promise for automating software development by translating requirements into code. However, even advanced prompting workflows like progressive prompting often leave some requirements unmet. Although methods…

Software Engineering · Computer Science 2026-02-04 Jianru Shen, Zedong Peng, Lucy Owen