Related papers: RAGCache: Efficient Knowledge Caching for Retrieva…

Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks

Retrieval-augmented generation (RAG) has gained traction as a powerful approach for enhancing language models by integrating external knowledge sources. However, RAG introduces challenges such as retrieval latency, potential errors in…

Computation and Language · Computer Science 2025-02-25 Brian J Chan, Chao-Ting Chen, Jui-Hung Cheng, Hen-Hsen Huang

SubGCache: Accelerating Graph-based RAG with Subgraph-level KV Cache

Graph-based retrieval-augmented generation (RAG) enables large language models (LLMs) to incorporate structured knowledge via graph retrieval as contextual input, enhancing more accurate and context-aware reasoning. We observe that for…

Machine Learning · Computer Science 2025-05-20 Qiuyu Zhu, Liang Zhang, Qianxiong Xu, Cheng Long, Jie Zhang

RAGTrace: Understanding and Refining Retrieval-Generation Dynamics in Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) systems have emerged as a promising solution to enhance large language models (LLMs) by integrating external knowledge retrieval with generative capabilities. While significant advancements have been…

Human-Computer Interaction · Computer Science 2025-08-11 Sizhe Cheng, Jiaping Li, Huanchen Wang, Yuxin Ma

Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is often used with Large Language Models (LLMs) to infuse domain knowledge or user-specific information. In RAG, given a user query, a retriever extracts chunks of relevant text from a knowledge base.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-25 Shubham Agarwal, Sai Sundaresan, Subrata Mitra, Debabrata Mahapatra, Archit Gupta, Rounak Sharma, Nirmal Joshua Kapu, Tong Yu, Shiv Saini

RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving

Retrieval-augmented generation (RAG), which combines large language models (LLMs) with retrievals from external knowledge databases, is emerging as a popular approach for reliable LLM serving. However, efficient RAG serving remains an open…

Information Retrieval · Computer Science 2025-03-24 Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, Vidushi Dadu

When Retrieval Succeeds and Fails: Rethinking Retrieval-Augmented Generation for LLMs

Large Language Models (LLMs) have enabled a wide range of applications through their powerful capabilities in language understanding and generation. However, as LLMs are trained on static corpora, they face difficulties in addressing…

Computation and Language · Computer Science 2025-10-13 Yongjie Wang, Yue Yu, Kaisong Song, Jun Lin, Zhiqi Shen

Progressive Searching for Retrieval in RAG

Retrieval Augmented Generation (RAG) is a promising technique for mitigating two key limitations of large language models (LLMs): outdated information and hallucinations. RAG system stores documents as embedding vectors in a database. Given…

Information Retrieval · Computer Science 2026-02-10 Taehee Jeong, Xingzhe Zhao, Peizu Li, Markus Valvur, Weihua Zhao

RAG+: Enhancing Retrieval-Augmented Generation with Application-Aware Reasoning

The integration of external knowledge through Retrieval-Augmented Generation (RAG) has become foundational in enhancing large language models (LLMs) for knowledge-intensive tasks. However, existing RAG paradigms often overlook the cognitive…

Artificial Intelligence · Computer Science 2025-09-24 Yu Wang, Shiwan Zhao, Zhihu Wang, Ming Fan, Xicheng Zhang, Yubo Zhang, Zhengfan Wang, Heyuan Huang, Ting Liu

Retrieval-Augmented Generation for Large Language Models: A Survey

Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination, outdated knowledge, and non-transparent, untraceable reasoning processes. Retrieval-Augmented Generation (RAG) has emerged as a…

Computation and Language · Computer Science 2024-03-28 Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, Haofen Wang

Robust Implementation of Retrieval-Augmented Generation on Edge-based Computing-in-Memory Architectures

Large Language Models (LLMs) deployed on edge devices learn through fine-tuning and updating a certain portion of their parameters. Although such learning methods can be optimized to reduce resource utilization, the overall required…

Machine Learning · Computer Science 2024-05-09 Ruiyang Qin, Zheyu Yan, Dewen Zeng, Zhenge Jia, Dancheng Liu, Jianbo Liu, Zhi Zheng, Ningyuan Cao, Kai Ni, Jinjun Xiong, Yiyu Shi

Retrieval Feedback Memory Enhancement Large Model Retrieval Generation Method

Large Language Models (LLMs) have shown remarkable capabilities across diverse tasks, yet they face inherent limitations such as constrained parametric knowledge and high retraining costs. Retrieval-Augmented Generation (RAG) augments the…

Information Retrieval · Computer Science 2025-08-26 Leqian Li, Dianxi Shi, Jialu Zhou, Xinyu Wei, Mingyue Yang, Songchang Jin, Shaowu Yang

Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation

Large Language Models (LLMs) are smart but forgetful. Recent studies, (e.g., (Bubeck et al., 2023)) on modern LLMs have shown that they are capable of performing amazing tasks typically necessitating human-level intelligence. However,…

Computation and Language · Computer Science 2023-11-08 Eric Melz

Does RAG Really Perform Bad For Long-Context Processing?

The efficient processing of long context poses a serious challenge for large language models (LLMs). Recently, retrieval-augmented generation (RAG) has emerged as a promising strategy for this problem, as it enables LLMs to make selective…

Computation and Language · Computer Science 2025-02-18 Kun Luo, Zheng Liu, Peitian Zhang, Hongjin Qian, Jun Zhao, Kang Liu

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This…

Information Retrieval · Computer Science 2026-05-19 Yizheng Huang, Jimmy Huang

A Systematic Review of Key Retrieval-Augmented Generation (RAG) Systems: Progress, Gaps, and Future Directions

Retrieval-Augmented Generation (RAG) represents a major advancement in natural language processing (NLP), combining large language models (LLMs) with information retrieval systems to enhance factual grounding, accuracy, and contextual…

Computation and Language · Computer Science 2025-07-28 Agada Joseph Oche, Ademola Glory Folashade, Tirthankar Ghosal, Arpan Biswas

Dynamic and Parametric Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) has become a foundational paradigm for equipping large language models (LLMs) with external knowledge, playing a critical role in information retrieval and knowledge-intensive applications. However,…

Computation and Language · Computer Science 2025-06-10 Weihang Su, Qingyao Ai, Jingtao Zhan, Qian Dong, Yiqun Liu

RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems

Retrieval-Augmented Generation (RAG) mitigates key limitations of Large Language Models (LLMs), such as factual errors, outdated knowledge, and hallucinations, by dynamically retrieving external information. Recent work extends this paradigm…

Computation and Language · Computer Science 2026-05-22 Jingru Lin, Chen Zhang, Stephen Y. Liu, Haizhou Li

Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective

Retrieval-Augmented Generation (RAG) systems have shown promise in enhancing the performance of Large Language Models (LLMs). However, these systems face challenges in effectively integrating external knowledge with the LLM's internal…

Machine Learning · Computer Science 2024-11-25 Shenglai Zeng, Jiankun Zhang, Bingheng Li, Yuping Lin, Tianqi Zheng, Dante Everaert, Hanqing Lu, Hui Liu, Hui Liu, Yue Xing, Monica Xiao Cheng, Jiliang Tang

A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can offer reliable and up-to-date external knowledge, providing huge convenience for numerous tasks. Particularly in the era of AI-Generated Content (AIGC),…

Computation and Language · Computer Science 2024-06-18 Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

Single-step retrieval-augmented generation (RAG) provides an efficient way to incorporate external information for simple question answering tasks but struggles with complex questions. Agentic RAG extends this paradigm by replacing…

Computation and Language · Computer Science 2026-05-08 Yijia Zheng, Marcel Worring