Related papers: Bucketing Coding and Information Theory for the St…

Fast Search on Binary Codes by Weighted Hamming Distance

Weighted Hamming distance, as a similarity measure between binary codes and binary queries, provides higher accuracy in search tasks than Hamming distance. However, how to efficiently and accurately find $K$ binary codes that have the…

Computer Vision and Pattern Recognition · Computer Science 2021-08-11 Zhenyu Weng, Yuesheng Zhu, Ruixin Liu

A Scheme for Approximating Probabilistic Inference

This paper describes a class of probabilistic approximation algorithms based on bucket elimination which offer adjustable levels of accuracy and efficiency. We analyze the approximation for several tasks: finding the most probable…

Artificial Intelligence · Computer Science 2013-02-08 Rina Dechter, Irina Rish

Bounding sets of sequential quantum correlations and device-independent randomness certification

An important problem in quantum information theory is that of bounding sets of correlations that arise from making local measurements on entangled states of arbitrary dimension. Currently, the best-known method to tackle this problem is the…

Quantum Physics · Physics 2020-10-21 Joseph Bowles, Flavio Baccari, Alexia Salavrakos

Given a large dataset of binary codes and a binary query point, we address how to efficiently find $K$ codes in the dataset that yield the largest cosine similarities to the query. The straightforward answer to this problem is to compare…

Databases · Computer Science 2018-04-19 Sepehr Eghbali, Ladan Tahvildari

Comparison Based Nearest Neighbor Search

We consider machine learning in a comparison-based setting where we are given a set of points in a metric space, but we have no access to the actual distances between the points. Instead, we can only ask an oracle whether the distance…

Machine Learning · Statistics 2017-04-06 Siavash Haghiri, Debarghya Ghoshdastidar, Ulrike von Luxburg

Random construction of interpolating sets for high dimensional integration

Many high dimensional integrals can be reduced to the problem of finding the relative measures of two sets. Often one set will be exponentially larger than the other, making it difficult to compare the sizes. A standard method of dealing…

Probability · Mathematics 2011-12-19 Mark Huber, Sarah Schott

Noisy Sorting Capacity

Sorting is the task of ordering $n$ elements using pairwise comparisons. It is well known that $m=\Theta(n\log n)$ comparisons are both necessary and sufficient when the outcomes of the comparisons are observed with no noise. In this paper,…

Information Theory · Computer Science 2024-07-09 Ziao Wang, Nadim Ghaddar, Banghua Zhu, Lele Wang

Tradeoffs Between Information and Ordinal Approximation for Bipartite Matching

We study ordinal approximation algorithms for maximum-weight bipartite matchings. Such algorithms only know the ordinal preferences of the agents/nodes in the graph for their preferred matches, but must compete with fully omniscient…

Computer Science and Game Theory · Computer Science 2017-07-07 Elliot Anshelevich, Wennan Zhu

Bounds and Identification of Joint Probabilities of Potential Outcomes and Observed Variables under Monotonicity Assumptions

Evaluating joint probabilities of potential outcomes and observed variables, and their linear combinations, is a fundamental challenge in causal inference. This paper addresses the bounding and identification of these probabilities in…

Machine Learning · Statistics 2026-02-24 Naoya Hashimoto, Yuta Kawakami, Jin Tian

Approximate Neighbor Counting in Radio Networks

For many distributed algorithms, neighborhood size is an important parameter. In radio networks, however, obtaining this information can be difficult due to ad hoc deployments and communication that occurs on a collision-prone shared…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-09 Calvin Newport, Chaodong Zheng

Rates of Convergence for Large-scale Nearest Neighbor Classification

Nearest neighbor is a popular class of classification methods with many desirable properties. For a large data set which cannot be loaded into the memory of a single machine due to computation, communication, privacy, or ownership…

Machine Learning · Statistics 2019-11-01 Xingye Qiao, Jiexin Duan, Guang Cheng

Hashing with Mutual Information

Binary vector embeddings enable fast nearest neighbor retrieval in large databases of high-dimensional objects, and play an important role in many practical applications, such as image and video retrieval. We study the problem of learning…

Computer Vision and Pattern Recognition · Computer Science 2018-06-26 Fatih Cakir, Kun He, Sarah Adel Bargal, Stan Sclaroff

New Upper Bounds on A(n,d)

Upper bounds on the maximum number of codewords in a binary code of a given length and minimum Hamming distance are considered. New bounds are derived by a combination of linear programming and counting arguments. Some of these bounds…

Information Theory · Computer Science 2007-07-13 Beniamin Mounits, Tuvi Etzion, Simon Litsyn

In search of maximum non-overlapping codes

Non-overlapping codes are block codes that have arisen in diverse contexts of computer science and biology. Applications typically require finding non-overlapping codes with large cardinalities, but the maximum size of non-overlapping codes…

Information Theory · Computer Science 2024-01-11 Lidija Stanovnik, Miha Moškon, Miha Mraz

Entropy based Nearest Neighbor Search in High Dimensions

In this paper we study the problem of finding the approximate nearest neighbor of a query point in the high dimensional space, focusing on the Euclidean space. The earlier approaches use locality-preserving hash functions (that tend to map…

Data Structures and Algorithms · Computer Science 2007-05-23 Rina Panigrahy

Analytical calculation of neighborhood order probabilities for high dimensional Poissonic processes and mean field models

Consider that the coordinates of $N$ points are randomly generated along the edges of a $d$-dimensional hypercube (random point problem). The probability that an arbitrary point is the $m$th nearest neighbor to its own $n$th nearest…

Disordered Systems and Neural Networks · Physics 2007-05-23 Cesar Augusto Sangaletti Tercariol, Felipe de Mouta Kiipper, Alexandre Souto Martinez

Bucket Elimination: A Unifying Framework for Several Probabilistic Inference

Probabilistic inference algorithms for finding the most probable explanation, the maximum a posteriori hypothesis, and the maximum expected utility and for updating belief are reformulated as an elimination-type algorithm called bucket…

Artificial Intelligence · Computer Science 2013-02-18 Rina Dechter

(2,1)-separating systems beyond the probabilistic bound

Building on previous results of Xing, we give new lower bounds on the rate of intersecting codes over large alphabets. The proof is constructive, and uses algebraic geometry, although nothing beyond the basic theory of linear systems on…

Combinatorics · Mathematics 2012-01-11 Hugues Randriambololona

Approximate Nearest Neighbors in Limited Space

We consider the $(1+\epsilon)$-approximate nearest neighbor search problem: given a set $X$ of $n$ points in a $d$-dimensional space, build a data structure that, given any query point $y$, finds a point $x \in X$ whose distance to $y$ is…

Data Structures and Algorithms · Computer Science 2018-07-03 Piotr Indyk, Tal Wagner

Nearest Neighbor Classification based on Imbalanced Data: A Statistical Approach

When the competing classes in a classification problem are not of comparable size, many popular classifiers exhibit a bias towards larger classes, and the nearest neighbor classifier is no exception. To take care of this problem, we develop…

Methodology · Statistics 2023-11-02 Anvit Garg, Anil K. Ghosh, Soham Sarkar