Related papers: ESPN: Memory-Efficient Multi-Vector Information Re…

LEANN: A Low-Storage Vector Index

Embedding-based vector search underpins many important applications, such as recommendation and retrieval-augmented generation (RAG). It relies on vector indices to enable efficient search. However, these indices require storing…

Databases · Computer Science 2025-11-26 Yichuan Wang, Zhifei Li, Shu Liu, Yongji Wu, Ziming Mao, Yilong Zhao, Xiao Yan, Zhiying Xu, Yang Zhou, Ion Stoica, Sewon Min, Matei Zaharia, Joseph E. Gonzalez

Your Embedding Model is SMARTer Than You Think

Multimodal retrieval relies heavily on single-vector retrievers, which compress rich, sequential token sequences into one single global representation. While efficient, they discard fine-grained, local evidence critical for dense retrieval…

Information Retrieval · Computer Science 2026-05-26 Jianrui Zhang, Hyun Jung Lee, Sukanta Ganguly, Tae-Eui Kam, Donghyun Kim, Yong Jae Lee

MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings

Neural embedding models have become a fundamental component of modern information retrieval (IR) pipelines. These models produce a single embedding $x \in \mathbb{R}^d$ per data-point, allowing for fast retrieval via highly optimized…

Data Structures and Algorithms · Computer Science 2024-05-31 Laxman Dhulipala, Majid Hadian, Rajesh Jayaram, Jason Lee, Vahab Mirrokni

Model-enhanced Vector Index

Embedding-based retrieval methods construct vector indices to search for document representations that are most similar to the query representations. They are widely used in document retrieval due to low latency and decent recall…

Information Retrieval · Computer Science 2023-11-10 Hailin Zhang, Yujing Wang, Qi Chen, Ruiheng Chang, Ting Zhang, Ziming Miao, Yingyan Hou, Yang Ding, Xupeng Miao, Haonan Wang, Bochen Pang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Qi Zhang, Fan Yang, Xing Xie, Mao Yang, Bin Cui

Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings

Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations. Experience replay improves sample-efficiency by reusing experiences from the past, and convolutional…

Machine Learning · Computer Science 2021-10-29 Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel

Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval

Image retrieval is crucial in robotics and computer vision, with downstream applications in robot place recognition and vision-based product recommendations. Modern retrieval systems face two key challenges: scalability and efficiency.…

Information Retrieval · Computer Science 2025-04-03 Mohammad Omama, Po-han Li, Sandeep P. Chinchali

Memory-Efficient Sequential Pattern Mining with Hybrid Tries

This paper develops a memory-efficient approach for Sequential Pattern Mining (SPM), a fundamental topic in knowledge discovery that faces a well-known memory bottleneck for large data sets. Our methodology involves a novel hybrid trie data…

Databases · Computer Science 2024-07-30 Amin Hosseininasab, Willem-Jan van Hoeve, Andre A. Cire

SOLAR: Sparse Orthogonal Learned and Random Embeddings

Dense embedding models are commonly deployed in commercial search engines, wherein all the document vectors are pre-computed, and near-neighbor search (NNS) is performed with the query vector to find relevant documents. However, the…

Machine Learning · Computer Science 2020-09-01 Tharun Medini, Beidi Chen, Anshumali Shrivastava

Neural Input Search for Large Scale Recommendation Models

Recommendation problems with large numbers of discrete items, such as products, webpages, or videos, are ubiquitous in the technology industry. Deep neural networks are being increasingly used for these recommendation problems. These models…

Machine Learning · Computer Science 2019-07-11 Manas R. Joglekar, Cong Li, Jay K. Adams, Pranav Khaitan, Quoc V. Le

MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction

Universal multimodal embedding models have achieved great success in capturing semantic relevance between queries and candidates. However, current methods either condense queries and candidates into a single vector, potentially limiting the…

Information Retrieval · Computer Science 2026-04-08 Zilin Xiao, Qi Ma, Mengting Gu, Chun-cheng Jason Chen, Xintao Chen, Vicente Ordonez, Vijai Mohan

Semantic Certainty Assessment in Vector Retrieval Systems: A Novel Framework for Embedding Quality Evaluation

Vector retrieval systems exhibit significant performance variance across queries due to heterogeneous embedding quality. We propose a lightweight framework for predicting retrieval performance at the query level by combining quantization…

Information Retrieval · Computer Science 2025-07-09 Y. Du

ReinPool: Reinforcement Learning Pooling Multi-Vector Embeddings for Retrieval System

Multi-vector embedding models have emerged as a powerful paradigm for document retrieval, preserving fine-grained visual and textual details through token-level representations. However, this expressiveness comes at a staggering cost:…

Information Retrieval · Computer Science 2026-01-13 Sungguk Cha, DongWook Kim, Mintae Kim, Youngsub Han, Byoung-Ki Jeon, Sangyeob Lee

Accurate and efficient protein embedding using multi-teacher distillation learning

Motivation: Protein embedding, which represents proteins as numerical vectors, is a crucial step in various learning-based protein annotation/classification problems, including gene ontology prediction, protein-protein interaction…

Genomics · Quantitative Biology 2024-05-21 Jiayu Shang, Cheng Peng, Yongxin Ji, Jiaojiao Guan, Dehan Cai, Xubo Tang, Yanni Sun

Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

Over the last few years, multi-vector retrieval methods, spearheaded by ColBERT, have become an increasingly popular approach to Neural IR. By storing representations at the token level rather than at the document level, these methods have…

Information Retrieval · Computer Science 2024-09-24 Benjamin Clavié, Antoine Chaffin, Griffin Adams

Efficient Multi-Vector Dense Retrieval Using Bit Vectors

Dense retrieval techniques employ pre-trained large language models to build a high-dimensional representation of queries and passages. These representations compute the relevance of a passage w.r.t. a query using efficient similarity…

Information Retrieval · Computer Science 2024-04-04 Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

VectorSearch: Enhancing Document Retrieval with Semantic Embeddings and Optimized Search

Traditional retrieval methods have been essential for assessing document similarity but struggle with capturing semantic nuances. Despite advancements in latent semantic analysis (LSA) and deep learning, achieving comprehensive semantic…

Information Retrieval · Computer Science 2024-09-27 Solmaz Seyed Monir, Irene Lau, Shubing Yang, Dongfang Zhao

Efficient Sparse Processing-in-Memory Architecture (ESPIM) for Machine Learning Inference

Emerging machine learning (ML) models (e.g., transformers) involve memory pin bandwidth-bound matrix-vector (MV) computation in inference. By avoiding pin crossings, processing in memory (PIM) can improve performance and energy for…

Hardware Architecture · Computer Science 2024-04-09 Mingxuan He, Mithuna Thottethodi, T. N. Vijaykumar

Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations

Learned sparse representations form an attractive class of contextual embeddings for text retrieval. That is so because they are effective models of relevance and are interpretable by design. Despite their apparent compatibility with…

Information Retrieval · Computer Science 2024-07-15 Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search

In-memory algorithms for approximate nearest neighbor search (ANNS) have achieved great success for fast high-recall search, but are extremely expensive when handling very large-scale databases. Thus, there is an increasing request for…

Databases · Computer Science 2021-11-17 Qi Chen, Bing Zhao, Haidong Wang, Mingqin Li, Chuanjie Liu, Zengzhong Li, Mao Yang, Jingdong Wang

Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance

Embedding models have become essential tools in both natural language processing and computer vision, enabling efficient semantic search, recommendation, clustering, and more. However, the high memory and computational demands of…

Computation and Language · Computer Science 2024-11-26 Jiayi Chen, Chen Wu, Shaoqun Zhang, Nan Li, Liangjie Zhang, Qi Zhang