Related papers: LEMUR: Learned Multi-Vector Retrieval

200 papers

Late interaction retrieval methods, pioneered by ColBERT, have emerged as a powerful alternative to single-vector neural IR. By leveraging fine-grained, token-level representations, they have been demonstrated to deliver strong…

Information Retrieval · Computer Science 2025-11-04 Benjamin Clavié, Xianming Li, Antoine Chaffin, Omar Khattab, Tom Aarsen, Manuel Faysse, Jing Li

Multi-modal retrieval has seen tremendous progress with the development of vision-language models. However, further improving these models requires additional labelled data, which is a huge manual effort. In this paper, we propose a framework…

Computer Vision and Pattern Recognition · Computer Science 2023-09-26 Avinash Madasu, Estelle Aflalo, Gabriela Ben Melech Stan, Shachar Rosenman, Shao-Yen Tseng, Gedas Bertasius, Vasudev Lal

Dense retrieval models usually adopt vectors from the last hidden layer of the document encoder to represent a document, which is in contrast to the fact that representations in different layers of a pre-trained language model usually…

Information Retrieval · Computer Science 2025-09-30 Zhongbin Xie, Thomas Lukasiewicz

ColBERT introduced a late interaction mechanism that independently encodes queries and documents using BERT, and computes similarity via fine-grained interactions over token-level vector representations. This design enables expressive…

Information Retrieval · Computer Science 2025-11-21 Archish S, Ankit Garg, Kirankumar Shiragur, Neeraj Kayal

Neural embedding models have become a fundamental component of modern information retrieval (IR) pipelines. These models produce a single embedding $x \in \mathbb{R}^d$ per data-point, allowing for fast retrieval via highly optimized…

Data Structures and Algorithms · Computer Science 2024-05-31 Laxman Dhulipala, Majid Hadian, Rajesh Jayaram, Jason Lee, Vahab Mirrokni

Large language models (LLMs) are increasingly used to access legal information. Yet, their deployment in multilingual legal settings is constrained by unreliable retrieval and the lack of domain-adapted, open-embedding models. In…

Computation and Language · Computer Science 2026-02-11 Narges Baba Ahmadi, Jan Strich, Martin Semmann, Chris Biemann

Cross-modal retrieval is gaining increasing efficacy and interest from the research community, thanks to large-scale training, novel architectural and learning designs, and its application in LLMs and multimodal LLMs. In this paper, we move…

Computer Vision and Pattern Recognition · Computer Science 2025-03-05 Davide Caffagni, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara

This paper introduces Sparsified Late Interaction for Multi-vector (SLIM) retrieval with inverted indexes. Multi-vector retrieval methods have demonstrated their effectiveness on various retrieval datasets, and among them, ColBERT is the…

Information Retrieval · Computer Science 2023-05-10 Minghan Li, Sheng-Chieh Lin, Xueguang Ma, Jimmy Lin

Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for document ranking. While remarkably effective, the ranking…

Information Retrieval · Computer Science 2020-06-05 Omar Khattab, Matei Zaharia

Traditional ID-based recommender systems often struggle with cold-start and generalization challenges. Multimodal recommendation systems, which leverage textual and visual data, offer a promising solution to mitigate these issues. However,…

As data retrieval demands become increasingly complex, traditional search methods often fall short in addressing nuanced and conceptual queries. Vector similarity search has emerged as a promising technique for finding semantically similar…

Artificial Intelligence · Computer Science 2024-12-31 Md Riyadh, Muqi Li, Felix Haryanto Lie, Jia Long Loh, Haotian Mi, Sayam Bohra

Most text retrievers generate one query vector to retrieve relevant documents. Yet, the conditional distribution of relevant documents for the query may be multimodal, e.g., representing different interpretations of the query. We…

Computation and Language · Computer Science 2025-11-05 Hung-Ting Chen, Xiang Liu, Shauli Ravfogel, Eunsol Choi

In multi-vector retrieval, both queries and data are represented as sets of high-dimensional vectors, enabling finer-grained semantic matching and improving retrieval quality over single-vector approaches. However, its practical adoption is…

Information Retrieval · Computer Science 2026-03-24 Yao Tian, Zhoujin Tian, Xi Zhao, Ruiyuan Zhang, Xiaofang Zhou

Traditional retrieval methods have been essential for assessing document similarity but struggle with capturing semantic nuances. Despite advancements in latent semantic analysis (LSA) and deep learning, achieving comprehensive semantic…

Information Retrieval · Computer Science 2024-09-27 Solmaz Seyed Monir, Irene Lau, Shubing Yang, Dongfang Zhao

Neural networks are the backbone of modern artificial intelligence, but designing, evaluating, and comparing them remains labor-intensive. While numerous datasets exist for training, there are few standardized collections of the models…

Vector embeddings from pre-trained language models form a core component in Neural Information Retrieval systems across a multitude of knowledge extraction tasks. The paradigm of late interaction, introduced in ColBERT, demonstrates high…

Information Retrieval · Computer Science 2026-03-27 Raj Nath Patel, Sourav Dutta

Multi-vector retrieval methods, exemplified by the ColBERT architecture, have shown substantial promise for retrieval by providing strong trade-offs in terms of retrieval latency and effectiveness. However, they come at a high cost in terms…

Information Retrieval · Computer Science 2025-04-03 Sean MacAvaney, Antonio Mallia, Nicola Tonellotto

Visual document retrieval has become essential for accessing information in visually rich documents. Existing approaches fall into two camps. Late-interaction retrievers achieve strong quality through fine-grained token-level matching but…

Machine Learning · Computer Science 2026-05-08 Weien Li, Rui Song, Zeyu Li, Haochen Liu, Gonghao Zhang, Difan Jiao, Zhenwei Tang, Bowei He, Haolun Wu, Xue Liu, Ye Yuan

State-of-the-art retrieval models typically address a straightforward search scenario, in which retrieval tasks are fixed (e.g., finding a passage to answer a specific question) and only a single modality is supported for both queries and…

Computation and Language · Computer Science 2025-02-25 Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi, Jimmy Lin, Bryan Catanzaro, Wei Ping

While (large) language models have significantly improved over the last years, they still struggle to sensibly process long sequences found, e.g., in books, due to the quadratic scaling of the underlying attention mechanism. To address…

Computation and Language · Computer Science 2024-06-14 Tamara Czinczoll, Christoph Hönes, Maximilian Schall, Gerard de Melo