Related papers: Efficient Document Ranking with Learnable Late Int…

200 papers

Cross-encoders deliver state-of-the-art ranking effectiveness in information retrieval, but have a high inference cost. This prevents them from being used as first-stage rankers, but also incurs a cost when re-ranking documents. Prior work…

Information Retrieval · Computer Science 2026-03-04 Mathias Vast, Victor Morand, Basile van Cooten, Laure Soulier, Josiane Mothe, Benjamin Piwowarski

Vector embeddings from pre-trained language models form a core component in Neural Information Retrieval systems across a multitude of knowledge extraction tasks. The paradigm of late interaction, introduced in ColBERT, demonstrates high…

Information Retrieval · Computer Science 2026-03-27 Raj Nath Patel, Sourav Dutta

Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for document ranking. While remarkably effective, the ranking…

Information Retrieval · Computer Science 2020-06-05 Omar Khattab, Matei Zaharia

Multi-vector dense models, such as ColBERT, have proven highly effective in information retrieval. ColBERT's late interaction scoring approximates the joint query-document attention seen in cross-encoders while maintaining inference…

Dense encoders and LLM-based rerankers struggle with long documents: single-vector representations dilute fine-grained relevance, while cross-encoders are often too expensive for practical reranking. We present an efficient long-document…

Information Retrieval · Computer Science 2026-02-06 Minghan Li, Eric Gaussier, Guodong Zhou

With the development of pre-trained language models, the dense retrieval models have become promising alternatives to the traditional retrieval models that rely on exact match and sparse bag-of-words representations. Different from most…

Information Retrieval · Computer Science 2024-03-21 Qi Liu, Gang Guo, Jiaxin Mao, Zhicheng Dou, Ji-Rong Wen, Hao Jiang, Xinyu Zhang, Zhao Cao

Search systems are increasingly used for reasoning-intensive queries, where what makes a document relevant requires understanding or reasoning over the query-document relation rather than relying on surface vocabulary or topical similarity.…

Information Retrieval · Computer Science 2026-05-27 Nilesh Gupta, Wei-Cheng Chang, Ngot Bui, Cho-Jui Hsieh, Inderjit S. Dhillon

Reliable biomedical and clinical retrieval requires more than strong ranking performance: it requires a practical way to find systematic model failures and curate the training evidence needed to correct them. Late-interaction models such as…

Information Retrieval · Computer Science 2026-04-22 François Remy

Transformer-based models such as BERT and E5 have significantly advanced text embedding by capturing rich contextual representations. However, many complex real-world queries require sophisticated reasoning to retrieve relevant documents…

Computation and Language · Computer Science 2025-09-03 Yuxiang Liu, Tian Wang, Gourab Kundu, Tianyu Cao, Guang Cheng, Zhen Ge, Jianshu Chen, Qingjun Cui, Trishul Chilimbi

The late interaction paradigm introduced with ColBERT stands out in the neural Information Retrieval space, offering a compelling effectiveness-efficiency trade-off across many benchmarks. Efficient late interaction retrieval is based on an…

Information Retrieval · Computer Science 2024-04-23 Thibault Formal, Stéphane Clinchant, Hervé Déjean, Carlos Lassance

This paper describes a compact and effective model for low-latency passage retrieval in conversational search based on learned dense representations. Prior to our work, the state-of-the-art approach uses a multi-stage pipeline comprising…

Information Retrieval · Computer Science 2021-11-30 Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin

State-of-the-art neural models typically encode document-query pairs using cross-attention for re-ranking. To this end, models generally utilize an encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach. These paradigms,…

Computation and Language · Computer Science 2022-04-26 Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler

Cross encoders (CEs) are trained with sentence pairs to detect relatedness. As CEs require sentence pairs at inference, the prevailing view is that they can only be used as re-rankers in information retrieval pipelines. Dual encoders (DEs)…

Computation and Language · Computer Science 2025-02-07 Haritha Ananthakrishnan, Julian Dolby, Harsha Kokel, Horst Samulowitz, Kavitha Srinivas

Recent advances in dense retrieval techniques have offered the promise of being able not just to re-rank documents using contextualised language models such as BERT, but also to use such models to identify documents from the collection in…

Information Retrieval · Computer Science 2021-08-25 Nicola Tonellotto, Craig Macdonald

Learning clients embeddings from sequences of their historic communications is central to financial applications. While large language models (LLMs) offer general world knowledge, their direct use on long event sequences is computationally…

Text embedding models serve as a fundamental component in real-world search applications. By mapping queries and documents into a shared embedding space, they deliver competitive retrieval performance with high efficiency. However, their…

Computation and Language · Computer Science 2025-11-03 Qi Liu, Yanzhao Zhang, Mingxin Li, Dingkun Long, Pengjun Xie, Jiaxin Mao

Late-interaction retrieval models like ColBERT achieve superior accuracy by enabling token-level interactions, but their computational cost hinders scalability and integration with Approximate Nearest Neighbor Search (ANNS). We introduce…

Information Retrieval · Computer Science 2026-01-15 Ramnath Kumar, Prateek Jain, Cho-Jui Hsieh

BERT based ranking models have achieved superior performance on various information retrieval tasks. However, the large number of parameters and complex self-attention operation come at a significant latency overhead. To remedy this, recent…

Information Retrieval · Computer Science 2021-10-06 Nachshon Cohen, Amit Portnoy, Besnik Fetahu, Amir Ingber

We present an approach to ranking with dense representations that applies knowledge distillation to improve the recently proposed late-interaction ColBERT model. Specifically, we distill the knowledge from ColBERT's expressive MaxSim…

Information Retrieval · Computer Science 2020-10-23 Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy Lin

In this paper, we consider the problem of improving the inference latency of language model-based dense retrieval systems by introducing structural compression and model size asymmetry between the context and query encoders. First, we…

Computation and Language · Computer Science 2023-06-05 Daniel Campos, Alessandro Magnani, ChengXiang Zhai