Related papers: RESCUE: Retrieval Augmented Secure Code Generation

Give LLMs a Security Course: Securing Retrieval-Augmented Code Generation via Knowledge Injection

Retrieval-Augmented Code Generation (RACG) leverages external knowledge to enhance Large Language Models (LLMs) in code synthesis, improving the functional correctness of the generated code. However, existing RACG systems largely overlook…

Cryptography and Security · Computer Science 2025-04-24 Bo Lin, Shangwen Wang, Yihao Qin, Liqian Chen, Xiaoguang Mao

Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches

Recent advances in large language models (LLMs) have significantly improved automated code generation. While existing approaches have achieved strong performance at the function and file levels, real-world software engineering requires…

Software Engineering · Computer Science 2026-05-21 Yicheng Tao, Yuante Li, Yao Qin, Yepang Liu

ReCode: Improving LLM-based Code Repair with Fine-Grained Retrieval-Augmented Generation

Recent advances in large language models (LLMs) have demonstrated impressive capabilities in code-related tasks, such as code generation and automated program repair. Despite their promising performance, most existing approaches for code…

Software Engineering · Computer Science 2025-09-03 Yicong Zhao, Shisong Chen, Jiacheng Zhang, Zhixu Li

Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation

Security applications are increasingly relying on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge.…

Cryptography and Security · Computer Science 2025-11-03 Arnabh Borah, Md Tanvirul Alam, Nidhi Rastogi

EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) compensates for the static knowledge limitations of Large Language Models (LLMs) by integrating external knowledge, producing responses with enhanced factual correctness and query-specific…

Computation and Language · Computer Science 2025-05-21 Ruobing Yao, Yifei Zhang, Shuang Song, Neng Gao, Chenyang Tu

SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model

The indexing-retrieval-generation paradigm of retrieval-augmented generation (RAG) has been highly successful in solving knowledge-intensive tasks by integrating external knowledge into large language models (LLMs). However, the…

Cryptography and Security · Computer Science 2025-02-25 Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Jason Zhaoxin Fan, Bo Tang, Shichao Song, Mengwei Wang, Jiawei Yang

A Deep Dive into Retrieval-Augmented Generation for Code Completion: Experience on WeChat

Code completion, a crucial task in software engineering that enhances developer productivity, has seen substantial improvements with the rapid advancement of large language models (LLMs). In recent years, retrieval-augmented generation…

Software Engineering · Computer Science 2025-07-25 Zezhou Yang, Ting Peng, Cuiyun Gao, Chaozheng Wang, Hailiang Huang, Yuetang Deng

RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories

Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area. Existing benchmarks often fall short by relying on synthetic…

Cryptography and Security · Computer Science 2026-02-02 Yanlin Wang, Ziyao Zhang, Chong Wang, Xinyi Xu, Mingwei Liu, Yong Wang, Jiachi Chen, Zibin Zheng

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This…

Information Retrieval · Computer Science 2026-05-19 Yizheng Huang, Jimmy Huang

SACL: Understanding and Combating Textual Bias in Code Retrieval with Semantic-Augmented Reranking and Localization

Retrieval-Augmented Code Generation (RACG) is a critical technique for enhancing code generation by retrieving relevant information. In this work, we conduct an in-depth analysis of code retrieval by systematically masking specific features…

Computation and Language · Computer Science 2025-06-27 Dhruv Gupta, Gayathri Ganesh Lakshmy, Yiqing Xie

Improving LLM-Assisted Secure Code Generation through Retrieval-Augmented-Generation and Multi-Tool Feedback

Large Language Models (LLMs) can generate code but often introduce security vulnerabilities, logical inconsistencies, and compilation errors. Prior work demonstrates that LLMs benefit substantially from structured feedback, static analysis,…

Cryptography and Security · Computer Science 2026-01-05 Vidyut Sriram, Sawan Pandita, Achintya Lakshmanan, Aneesh Shamraj, Suman Saha

TrustRAG: Enhancing Robustness and Trustworthiness in Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by integrating external knowledge sources, enabling more accurate and contextually relevant responses tailored to user queries. These systems, however, remain…

Computation and Language · Computer Science 2025-05-26 Huichi Zhou, Kin-Hei Lee, Zhonghao Zhan, Yue Chen, Zhenhao Li, Zhaoyang Wang, Hamed Haddadi, Emine Yilmaz

Secure Retrieval-Augmented Generation against Poisoning Attacks

Large language models (LLMs) have transformed natural language processing (NLP), enabling applications from content generation to decision support. Retrieval-Augmented Generation (RAG) improves LLMs by incorporating external knowledge but…

Cryptography and Security · Computer Science 2025-11-11 Zirui Cheng, Jikai Sun, Anjun Gao, Yueyang Quan, Zhuqing Liu, Xiaohua Hu, Minghong Fang

Retrieval-Augmented Generation with Estimation of Source Reliability

Retrieval-Augmented Generation (RAG) is an effective approach to enhance the factual accuracy of large language models (LLMs) by retrieving information from external databases, which are typically composed of diverse sources, to supplement…

Machine Learning · Computer Science 2025-10-15 Jeongyeon Hwang, Junyoung Park, Hyejin Park, Dongwoo Kim, Sangdon Park, Jungseul Ok

Towards Reliable Retrieval in RAG Systems for Large Legal Datasets

Retrieval-Augmented Generation (RAG) is a promising approach to mitigate hallucinations in Large Language Models (LLMs) for legal applications, but its reliability is critically dependent on the accuracy of the retrieval step. This is…

Computation and Language · Computer Science 2025-10-09 Markus Reuter, Tobias Lingenberg, Rūta Liepiņa, Francesca Lagioia, Marco Lippi, Giovanni Sartor, Andrea Passerini, Burcu Sayin

Corrective Retrieval Augmented Generation

Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate. Although retrieval-augmented generation (RAG) is a practicable…

Computation and Language · Computer Science 2024-10-08 Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, Zhen-Hua Ling

SelfRACG: Enabling LLMs to Self-Express and Retrieve for Code Generation

Existing retrieval-augmented code generation (RACG) methods typically use an external retrieval module to fetch semantically similar code snippets used for generating subsequent fragments. However, even for consecutive code fragments, the…

Information Retrieval · Computer Science 2025-10-10 Qian Dong, Jia Chen, Qingyao Ai, Hongning Wang, Haitao Li, Yi Wu, Yao Hu, Yiqun Liu, Shaoping Ma

CodeRAG-Bench: Can Retrieval Augment Code Generation?

While language models (LMs) have proven remarkably adept at generating code, many programs are challenging for LMs to generate using their parametric knowledge alone. Providing external contexts such as library documentation can facilitate…

Software Engineering · Computer Science 2025-02-28 Zora Zhiruo Wang, Akari Asai, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, Daniel Fried

Secure Multifaceted-RAG for Enterprise: Hybrid Knowledge Retrieval with Security Filtering

Existing Retrieval-Augmented Generation (RAG) systems face challenges in enterprise settings due to limited retrieval scope and data security risks. When relevant internal documents are unavailable, the system struggles to generate accurate…

Computation and Language · Computer Science 2025-07-18 Grace Byun, Shinsun Lee, Nayoung Choi, Jinho D. Choi

RAGuard: A Novel Approach for in-context Safe Retrieval Augmented Generation for LLMs

Accuracy and safety are paramount in Offshore Wind (OSW) maintenance, yet conventional Large Language Models (LLMs) often fail when confronted with highly specialised or unexpected scenarios. We introduce RAGuard, an enhanced…

Artificial Intelligence · Computer Science 2025-09-05 Connor Walker, Koorosh Aslansefat, Mohammad Naveed Akram, Yiannis Papadopoulos