Related papers: Approximate search with quantized sparse represent…
In this paper, we address the problem of searching for semantically similar images from a large database. We present a compact coding approach, supervised quantization. Our approach simultaneously learns feature selection that linearly…
This paper proposes a binarization scheme for vectors of high dimension based on the recent concept of anti-sparse coding, and shows its excellent performance for approximate nearest neighbor search. Unlike other binarization schemes, this…
Representing signals with sparse vectors has a wide range of applications that range from image and video coding to shape representation and health monitoring. In many applications with real-time requirements, or that deal with…
Fast Approximate Nearest Neighbor (ANN) search technique for high-dimensional feature indexing and retrieval is the crux of large-scale image retrieval. A recent promising technique is Product Quantization, which attempts to index…
We consider a similarity measure between two sets $A$ and $B$ of vectors, that balances the average and maximum cosine distance between pairs of vectors, one from set $A$ and one from set $B$. As a motivation for this measure, we present…
This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition where feature vectors in a database are encoded as compact codes in order to speed-up the similarity search in large-scale databases.…
This paper studies the compact coding approach to approximate nearest neighbor search. We introduce a composite quantization framework. It uses the composition of several ($M$) elements, each of which is selected from a different…
Sparse representation-based classifiers have shown outstanding accuracy and robustness in image classification tasks even with the presence of intense noise and occlusion. However, it has been discovered that the performance degrades…
Maximum Inner Product Search or top-k retrieval on sparse vectors is well-understood in information retrieval, with a number of mature algorithms that solve it exactly. However, all existing algorithms are tailored to text and…
Many modern search domains comprise high-dimensional vectors of floating point numbers derived from neural networks, in the form of embeddings. Typical embeddings range in size from hundreds to thousands of dimensions, making the size of…
We solve the analysis sparse coding problem considering a combination of convex and non-convex sparsity promoting penalties. The multi-penalty formulation results in an iterative algorithm involving proximal-averaging. We then unfold the…
Nearest neighbor search is a very active field in machine learning for it appears in many application cases, including classification and object retrieval. In its canonical version, the complexity of the search is linear with both the…
Sparse coding consists in representing signals as sparse linear combinations of atoms selected from a dictionary. We consider an extension of this framework where the atoms are further assumed to be embedded in a tree. This is achieved…
Similarity search retrieves the nearest neighbors of a query vector from a dataset of high-dimensional vectors. As the size of the dataset grows, the cost of performing the distance computations needed to implement a query can become…
The ultimate goal of any sparse coding method is to accurately recover from a few noisy linear measurements, an unknown sparse vector. Unfortunately, this estimation problem is NP-hard in general, and it is therefore always approached with…
This paper revisits cluster-based retrieval that partitions the inverted index into multiple groups and skips the index partially at cluster and document levels during online inference using a learned sparse representation. It proposes an…
Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, taking aside the memory requirements. In this paper, we revisit these approaches by considering, additionally, the memory…
Approximate nearest neighbor search for vectors relies on indexes that are most often accessed from RAM. Therefore, storage is the factor limiting the size of the database that can be served from a machine. Lossy vector compression, i.e.,…
Nowadays, data is represented by vectors. Retrieving those vectors, among millions and billions, that are similar to a given query is a ubiquitous problem, known as similarity search, of relevance for a wide range of applications.…
In Compressed Sensing, a real-valued sparse vector has to be estimated from an underdetermined system of linear equations. In many applications, however, the elements of the sparse vector are drawn from a finite set. For the estimation of…