Related papers: QuOTE: Question-Oriented Text Embeddings

Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation

We introduce a novel retrieval-augmented generation (RAG) framework tailored for multihop question answering. First, our system uses a large language model (LLM) to decompose complex multihop questions into a sequence of single-hop…

Computation and Language · Computer Science 2025-08-14 Seokgi Lee

Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) enhances the factuality of Large Language Models (LLMs) by incorporating retrieved documents and/or generated context. However, LLMs often exhibit a stylistic bias when presented with mixed contexts,…

Computation and Language · Computer Science 2026-04-21 Jiaang Li, Zhendong Mao, Quan Wang, Yuning Wan, Yongdong Zhang

QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance

This work presents a novel architecture for building Retrieval-Augmented Generation (RAG) systems to improve Question Answering (QA) tasks from a target corpus. Large Language Models (LLMs) have revolutionized the analyzing and generation…

Computation and Language · Computer Science 2025-01-09 Binita Saha, Utsha Saha, Muhammad Zubair Malik

REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models

Retrieval augmented generation (RAG) pipelines are commonly used in tasks such as question-answering (QA), relying on retrieving relevant documents from a vector store computed using a pretrained embedding model. However, if the retrieved…

Computation and Language · Computer Science 2024-10-18 Ambuje Gupta, Mrinal Rawat, Andreas Stolcke, Roberto Pieraccini

Enhancing Retrieval-Augmented Generation with Topic-Enriched Embeddings: A Hybrid Approach Integrating Traditional NLP Techniques

Retrieval-augmented generation (RAG) systems rely on accurate document retrieval to ground large language models (LLMs) in external knowledge, yet retrieval quality often degrades in corpora where topics overlap and thematic variation is…

Information Retrieval · Computer Science 2026-01-06 Rodrigo Kataishi

QAEncoder: Towards Aligned Representation Learning in Question Answering Systems

Modern QA systems entail retrieval-augmented generation (RAG) for accurate and trustworthy responses. However, the inherent gap between user queries and relevant documents hinders precise matching. We introduce QAEncoder, a training-free…

Computation and Language · Computer Science 2025-07-03 Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang

Revolutionizing Retrieval-Augmented Generation with Enhanced PDF Structure Recognition

With the rapid development of Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) has become a predominant method in the field of professional knowledge-based question answering. Presently, major foundation model companies…

Artificial Intelligence · Computer Science 2024-01-24 Demiao Lin

Toward Optimal Search and Retrieval for RAG

Retrieval-augmented generation (RAG) is a promising method for addressing some of the memory-related challenges associated with Large Language Models (LLMs). Two separate systems form the RAG pipeline, the retriever and the reader, and the…

Computation and Language · Computer Science 2024-11-13 Alexandria Leto, Cecilia Aguerrebere, Ishwar Bhati, Ted Willke, Mariano Tepper, Vy Ai Vo

Augmenting Question Answering with A Hybrid RAG Approach

Retrieval-Augmented Generation (RAG) has emerged as a powerful technique for enhancing the quality of responses in Question-Answering (QA) tasks. However, existing approaches often struggle with retrieving contextually relevant information,…

Computation and Language · Computer Science 2026-01-27 Tianyi Yang, Nashrah Haque, Vaishnave Jonnalagadda, Yuya Jeremy Ong, Zhehui Chen, Yanzhao Wu, Lei Yu, Divyesh Jadav, Wenqi Wei

Enhancing Question Answering Precision with Optimized Vector Retrieval and Instructions

Question-answering (QA) is an important application of Information Retrieval (IR) and language models, and the latest trend is toward pre-trained large neural networks with embedding parameters. Augmenting QA performances with these LLMs…

Information Retrieval · Computer Science 2024-11-05 Lixiao Yang, Mengyang Xu, Weimao Ke

Chunk Twice, Embed Once: A Systematic Study of Segmentation and Representation Trade-offs in Chemistry-Aware Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) systems are increasingly vital for navigating the ever-expanding body of scientific literature, particularly in high-stakes domains such as chemistry. Despite the promise of RAG, foundational design…

Information Retrieval · Computer Science 2025-06-24 Mahmoud Amiri, Thomas Bocklitz

QuOTeS: Query-Oriented Technical Summarization

When writing an academic paper, researchers often spend considerable time reviewing and summarizing papers to extract relevant citations and data to compose the Introduction and Related Work sections. To address this problem, we…

Information Retrieval · Computer Science 2023-06-22 Juan Ramirez-Orta, Eduardo Xamena, Ana Maguitman, Axel J. Soto, Flavia P. Zanoto, Evangelos Milios

Emulating Retrieval Augmented Generation via Prompt Engineering for Enhanced Long Context Comprehension in LLMs

This paper addresses the challenge of comprehending very long contexts in Large Language Models (LLMs) by proposing a method that emulates Retrieval Augmented Generation (RAG) through specialized prompt engineering and chain-of-thought…

Computation and Language · Computer Science 2025-02-19 Joon Park, Kyohei Atarashi, Koh Takeuchi, Hisashi Kashima

KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs

Retrieval-Augmented Generation (RAG) improves factual accuracy by grounding responses in external knowledge. However, existing RAG methods either rely solely on text corpora and neglect structural knowledge, or build ad-hoc knowledge graphs…

Computation and Language · Computer Science 2025-10-21 Dingjun Wu, Yukun Yan, Zhenghao Liu, Zhiyuan Liu, Maosong Sun

LLM-Assisted Question-Answering on Technical Documents Using Structured Data-Aware Retrieval Augmented Generation

Large Language Models (LLMs) are capable of natural language understanding and generation. But they face challenges such as hallucination and outdated knowledge. Fine-tuning is one possible solution, but it is resource-intensive and must be…

Computation and Language · Computer Science 2025-07-01 Shadman Sobhan, Mohammad Ariful Haque

ELITE: Embedding-Less retrieval with Iterative Text Exploration

Large Language Models (LLMs) have achieved impressive progress in natural language processing, but their limited ability to retain long-term context constrains performance on document-level or multi-turn tasks. Retrieval-Augmented…

Computation and Language · Computer Science 2025-05-20 Zhangyu Wang, Siyuan Gao, Rong Zhou, Hao Wang, Li Ning

AccurateRAG: A Framework for Building Accurate Retrieval-Augmented Question-Answering Applications

We introduce AccurateRAG -- a novel framework for constructing high-performance question-answering applications based on retrieval-augmented generation (RAG). Our framework offers a pipeline for development efficiency with tools for raw…

Computation and Language · Computer Science 2026-03-04 Linh The Nguyen, Chi Tran, Dung Ngoc Nguyen, Van-Cuong Pham, Hoang Ngo, Dat Quoc Nguyen

Text Embeddings for Retrieval From a Large Knowledge Base

Text embedding representing natural language documents in a semantic vector space can be used for document retrieval using nearest neighbor lookup. In order to study the feasibility of neural models specialized for retrieval in a…

Information Retrieval · Computer Science 2019-05-03 Tolgahan Cakaloglu, Christian Szegedy, Xiaowei Xu

Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines

Retrieval-augmented generation (RAG) has emerged to address the knowledge-intensive visual question answering (VQA) task. Current methods mainly employ separate retrieval and generation modules to acquire external knowledge and generate…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Xinwei Long, Zhiyuan Ma, Ermo Hua, Kaiyan Zhang, Biqing Qi, Bowen Zhou

MODE: Mixture of Document Experts for RAG

Retrieval-Augmented Generation (RAG) often relies on large vector databases and cross-encoders tuned for large-scale corpora, which can be excessive for small, domain-specific collections. We present MODE (Mixture of Document Experts), a…

Artificial Intelligence · Computer Science 2025-09-03 Rahul Anand