Related papers: Reranking with Compressed Document Representation

Efficient Listwise Reranking with Compressed Document Representations

Reranking, the process of refining the output from a first-stage retriever, is often considered computationally expensive, especially when using Large Language Models (LLMs). A common approach to mitigate this cost involves utilizing…

Information Retrieval · Computer Science 2026-04-30 Hervé Déjean, Stéphane Clinchant

Drowning in Documents: Consequences of Scaling Reranker Inference

Rerankers, typically cross-encoders, are computationally intensive but are frequently used because they are widely assumed to outperform cheaper initial IR systems. We challenge this assumption by measuring reranker performance for full…

Information Retrieval · Computer Science 2025-07-14 Mathew Jacob, Erik Lindgren, Matei Zaharia, Michael Carbin, Omar Khattab, Andrew Drozdov

The Evolution of Reranking Models in Information Retrieval: From Heuristic Methods to Large Language Models

Reranking is a critical stage in contemporary information retrieval (IR) systems, improving the relevance of the user-presented final results by honing initial candidate sets. This paper is a thorough guide to examine the changing reranker…

Information Retrieval · Computer Science 2025-12-19 Tejul Pandit, Sakshi Mahendru, Meet Raval, Dhvani Upadhyay

DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) systems combine large language models (LLMs) with external knowledge retrieval, making them highly effective for knowledge-intensive tasks. A crucial but often under-explored component of these systems…

Computation and Language · Computer Science 2025-05-19 Jiashuo Sun, Xianrui Zhong, Sizhe Zhou, Jiawei Han

Layer-wise Token Compression for Efficient Document Reranking

Transformer-based document cross-encoder rerankers are a central component of modern information retrieval systems. Despite their success, these models suffer from high computational costs due to processing long query-document sequences at…

Information Retrieval · Computer Science 2026-05-22 Shengyao Zhuang, Zhichao Xu, Ivano Lauriola

ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking

Reranking is fundamental to information retrieval and retrieval-augmented generation, with recent Large Language Models (LLMs) significantly advancing reranking quality. Most current works rely on large-scale LLMs (>7B parameters),…

Information Retrieval · Computer Science 2026-04-17 Xianming Li, Aamir Shakir, Rui Huang, Tsz-fung Andrew Lee, Julius Lipp, Benjamin Clavié, Jing Li

Beyond Sequential Reranking: Reranker-Guided Search Improves Reasoning Intensive Retrieval

The widely used retrieve-and-rerank pipeline faces two critical limitations: it is constrained by the initial retrieval quality of the top-k documents, and the growing computational demands of LLM-based rerankers restrict the number of…

Information Retrieval · Computer Science 2025-09-10 Haike Xu, Tong Chen

Efficient Document Re-Ranking for Transformers by Precomputing Term Representations

Deep pretrained transformer networks are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their computational expenses deem them cost-prohibitive in practice. Our proposed approach, called…

Information Retrieval · Computer Science 2020-05-27 Sean MacAvaney, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Nazli Goharian, Ophir Frieder

Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices

Modern search systems use several large ranker models with transformer architectures. These models require large computational resources and are not suitable for usage on devices with limited computational resources. Knowledge distillation…

Machine Learning · Computer Science 2022-07-27 Amit Chaulwar, Lukas Malik, Maciej Krajewski, Felix Reichel, Leif-Nissen Lundbæk, Michael Huth, Bartlomiej Matejczyk

Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking

We present a novel approach for training small language models for reasoning-intensive document ranking that combines knowledge distillation with reinforcement learning optimization. While existing methods often rely on expensive human…

Information Retrieval · Computer Science 2025-07-01 Chris Samarinas, Hamed Zamani

RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation

Retrieving documents and prepending them in-context at inference time improves the performance of language models (LMs) on a wide range of tasks. However, these documents, often spanning hundreds of words, make inference substantially more…

Computation and Language · Computer Science 2023-10-09 Fangyuan Xu, Weijia Shi, Eunsol Choi

HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for enhancing the performance of large language models (LLMs) by integrating external knowledge into the generation process. A key component of RAG pipelines is the…

Computation and Language · Computer Science 2025-04-07 Yuwei An, Yihua Cheng, Seo Jin Park, Junchen Jiang

Retrieval or Representation? Reassessing Benchmark Gaps in Multilingual and Visually Rich RAG

Retrieval-augmented generation (RAG) is a common way to ground language models in external documents and up-to-date information. Classical retrieval systems relied on lexical methods such as BM25, which rank documents by term overlap with…

Computation and Language · Computer Science 2026-03-05 Martin Asenov, Kenza Benkirane, Dan Goldwater, Aneiss Ghodsi

RECON: Reasoning with Condensation for Efficient Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) systems trained using reinforcement learning (RL) with reasoning are hampered by inefficient context management, where long, noisy retrieved documents increase costs and degrade performance. We introduce…

Computation and Language · Computer Science 2025-10-14 Zhichao Xu, Minheng Wang, Yawei Wang, Wenqian Ye, Yuntao Du, Yunpu Ma, Yijun Tian

Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG

Ranking models play a crucial role in enhancing overall accuracy of text retrieval systems. These multi-stage systems typically utilize either dense embedding models or sparse lexical indices to retrieve relevant passages based on a given…

Information Retrieval · Computer Science 2024-09-13 Gabriel de Souza P. Moreira, Ronay Ak, Benedikt Schifferer, Mengyao Xu, Radek Osmulski, Even Oldridge

Gumbel Reranking: Differentiable End-to-End Reranker Optimization

RAG systems rely on rerankers to identify relevant documents. However, fine-tuning these models remains challenging due to the scarcity of annotated query-document pairs. Existing distillation-based approaches suffer from training-inference…

Computation and Language · Computer Science 2025-06-10 Siyuan Huang, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Jingwen Leng, Minyi Guo, Zhouhan Lin

Rank1: Test-Time Compute for Reranking in Information Retrieval

We introduce Rank1, the first reranking model trained to take advantage of test-time compute. Rank1 demonstrates the applicability within retrieval of using a reasoning language model (i.e. OpenAI's o1, Deepseek's R1, etc.) for distillation…

Information Retrieval · Computer Science 2025-08-11 Orion Weller, Kathryn Ricci, Eugene Yang, Andrew Yates, Dawn Lawrie, Benjamin Van Durme

Don't Forget to Connect! Improving RAG with Graph-based Reranking

Retrieval Augmented Generation (RAG) has greatly improved the performance of Large Language Model (LLM) responses by grounding generation with context from existing documents. These systems work well when documents are clearly relevant to a…

Computation and Language · Computer Science 2024-05-29 Jialin Dong, Bahare Fatemi, Bryan Perozzi, Lin F. Yang, Anton Tsitsulin

ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression

Large language model (LLM) based listwise reranking has emerged as the dominant paradigm for achieving state-of-the-art ranking effectiveness in information retrieval. However, its reliance on feeding full passage texts into the LLM…

Information Retrieval · Computer Science 2026-04-27 Xiaojie Ke, Shuai Zhang, Liansheng Sun, Yongjin Wang, Hengjun Jiang, Xiangkun Liu, Cunxin Gu, Jian Xu, Guanjun Jiang

JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking

Accurate document retrieval is crucial for the success of retrieval-augmented generation (RAG) applications, including open-domain question answering and code completion. While large language models (LLMs) have been employed as dense…

Computation and Language · Computer Science 2024-11-04 Tong Niu, Shafiq Joty, Ye Liu, Caiming Xiong, Yingbo Zhou, Semih Yavuz