Related papers: REFRAG: Rethinking RAG based Decoding

200 papers

Retrieval Augmented Generation (RAG) has emerged as a crucial technique for enhancing the accuracy of Large Language Models (LLMs) by incorporating external information. With the advent of LLMs that support increasingly longer context…

Machine Learning · Computer Science 2024-11-07 Quinn Leng, Jacob Portes, Sam Havens, Matei Zaharia, Michael Carbin

The efficient processing of long context poses a serious challenge for large language models (LLMs). Recently, retrieval-augmented generation (RAG) has emerged as a promising strategy for this problem, as it enables LLMs to make selective…

Computation and Language · Computer Science 2025-02-18 Kun Luo, Zheng Liu, Peitian Zhang, Hongjin Qian, Jun Zhao, Kang Liu

Retrieval-augmented generation (RAG) empowers large language models (LLMs) to utilize external knowledge sources. The increasing capacity of LLMs to process longer input sequences opens up avenues for providing more retrieved information,…

Computation and Language · Computer Science 2024-10-10 Bowen Jin, Jinsung Yoon, Jiawei Han, Sercan O. Arik

Large language models (LLMs) augmented with retrieval exhibit robust performance and extensive versatility by incorporating external contexts. However, the input length grows linearly in the number of retrieved documents, causing a dramatic…

Computation and Language · Computer Science 2024-05-28 Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, Jindong Chen

The emergence of long-context large language models (LLMs) offers a promising alternative to traditional retrieval-augmented generation (RAG) for processing extensive documents. However, the computational overhead of long-context inference…

Computation and Language · Computer Science 2025-06-24 Guanzheng Chen, Qilong Feng, Jinjie Ni, Xin Li, Michael Qizhe Shieh

Processing long contexts presents a significant challenge for large language models (LLMs). While recent advancements allow LLMs to handle much longer contexts than before (e.g., 32K or 128K tokens), it is computationally expensive and can…

Computation and Language · Computer Science 2025-04-10 Hongjin Qian, Zheng Liu, Peitian Zhang, Kelong Mao, Defu Lian, Zhicheng Dou, Tiejun Huang

Retrieval-augmented generation (RAG) has emerged as an approach to augment large language models (LLMs) by reducing their reliance on static knowledge and improving answer factuality. RAG retrieves relevant context snippets and generates an…

Computation and Language · Computer Science 2025-02-21 Juraj Vladika, Florian Matthes

The existing Retrieval-Augmented Generation (RAG) systems face significant challenges in terms of cost and effectiveness. On one hand, they need to encode the lengthy retrieved contexts before responding to the input tasks, which imposes…

Computation and Language · Computer Science 2024-09-25 Zheng Liu, Chenyuan Wu, Ninglu Shao, Shitao Xiao, Chaozhuo Li, Defu Lian

Retrieval-Augmented Generation (RAG) has been shown to enhance the factual accuracy of Large Language Models (LLMs), but existing methods often suffer from limited reasoning capabilities in effectively using the retrieved evidence,…

Computation and Language · Computer Science 2024-10-03 Shayekh Bin Islam, Md Asib Rahman, K S M Tozammel Hossain, Enamul Hoque, Shafiq Joty, Md Rizwan Parvez

Overcoming the limited context limitations in early-generation LLMs, retrieval-augmented generation (RAG) has been a reliable solution for context-based answer generation in the past. Recently, the emergence of long-context LLMs allows the…

Computation and Language · Computer Science 2024-09-04 Tan Yu, Anbang Xu, Rama Akkiraju

Large Language Models (LLMs) showcase remarkable abilities, yet they struggle with limitations such as hallucinations, outdated knowledge, opacity, and inexplicable reasoning. To address these challenges, Retrieval-Augmented Generation…

Computation and Language · Computer Science 2024-10-03 Sourav Verma

Retrieval-Augmented Generation (RAG) has become an essential approach for extending the reasoning and knowledge capacity of large language models (LLMs). While prior research has primarily focused on retrieval quality and prompting…

Computation and Language · Computer Science 2025-12-09 Jiamin Chen, Yuchen Li, Xinyu Ma, Xinran Chen, Xiaokun Zhang, Shuaiqiang Wang, Chen Ma, Dawei Yin

The scaling of inference computation has unlocked the potential of long-context large language models (LLMs) across diverse settings. For knowledge-intensive tasks, the increased compute is often allocated to incorporate more external…

Computation and Language · Computer Science 2025-03-04 Zhenrui Yue, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, Michael Bendersky

Effectively incorporating external knowledge into Large Language Models (LLMs) is crucial for enhancing their capabilities and addressing real-world needs. Retrieval-Augmented Generation (RAG) offers an effective method for achieving this…

Computation and Language · Computer Science 2025-03-06 Kuan Li, Liwen Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Shuai Wang, Minhao Cheng

Retrieval-augmented generation (RAG) with large language models (LLMs) has demonstrated strong performance in multilingual question-answering (QA) tasks by leveraging relevant passages retrieved from corpora. In multilingual RAG (mRAG), the…

Computation and Language · Computer Science 2025-12-12 Jirui Qi, Raquel Fernández, Arianna Bisazza

Extending context windows (i.e., Long Context, LC) and using retrievers to selectively access relevant information (i.e., Retrieval-Augmented Generation, RAG) are the two main strategies to enable LLMs to incorporate extremely long external…

Computation and Language · Computer Science 2025-01-06 Xinze Li, Yixin Cao, Yubo Ma, Aixin Sun

Large Language Models (LLMs) exhibit remarkable capabilities but are prone to generating inaccurate or hallucinatory responses. This limitation stems from their reliance on vast pretraining datasets, making them susceptible to errors in…

Computation and Language · Computer Science 2024-04-02 Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, Jie Fu

Retrieval Augmented Generation (RAG) has been a powerful tool for Large Language Models (LLMs) to efficiently process overly lengthy contexts. However, recent LLMs like Gemini-1.5 and GPT-4 show exceptional capabilities to understand long…

Computation and Language · Computer Science 2024-10-18 Zhuowan Li, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky

Retrieval-augmented generation (RAG) frameworks enable large language models (LLMs) to retrieve relevant information from a knowledge base and incorporate it into the context for generating responses. This mitigates hallucinations and…

Computation and Language · Computer Science 2024-04-09 Pouria Rouzrokh, Shahriar Faghani, Cooper U. Gamble, Moein Shariatnia, Bradley J. Erickson

Retrieval-Augmented Generation (RAG) has become a widely adopted paradigm for enhancing the reliability of large language models (LLMs). However, RAG systems are sensitive to retrieval strategies that rely on text chunking to construct…

Information Retrieval · Computer Science 2026-03-31 Sun Xu, Tongkai Xu, Baiheng Xie, Li Huang, Qiang Gao, Kunpeng Zhang