Related papers: Combinatorial information distance
Information distance can be defined not only between two strings but also in a finite multiset of strings of cardinality greater than two. We give an elementary proof for expressing the information distance in terms of plain Kolmogorov…
First we consider pair-wise distances for literal objects consisting of finite binary files. These files are taken to contain all of their meaning, like genomes or books. The distances are based on compression of the objects concerned,…
The normalized information distance is a universal distance measure for objects of all kinds. It is based on Kolmogorov complexity and thus uncomputable, but there are ways to utilize it. First, compression algorithms can be used to…
Information distance is a parameter-free similarity measure based on compression, used in pattern recognition, data mining, phylogeny, clustering, and classification. The notion of information distance is extended from pairs to multiples…
Kolmogorov argued that the concept of information exists also in problems with no underlying stochastic model (as in Shannon's information representation), for instance, the information contained in an algorithm or in the genome. He introduced…
The concepts of similarity and distance are crucial in data mining. We consider the problem of defining the distance between two data sets by comparing summary statistics computed from the data sets. The initial definition of our distance…
We consider the notion of information distance between two objects $x$ and $y$ introduced by Bennett, Gács, Li, Vitányi, and Zurek in 1998 as the minimal length of a program that computes $x$ from $y$ as well as computing $y$ from $x$.…
We consider the notion of information distance between two objects x and y introduced by Bennett, Gács, Li, Vitányi, and Zurek [1] as the minimal length of a program that computes x from y as well as computing y from x, and study…
While Kolmogorov complexity is the accepted absolute measure of information content in an individual finite object, a similarly absolute notion is needed for the information distance between two individual objects, for example, two…
The information in an individual finite object (like a binary string) is commonly measured by its Kolmogorov complexity. One can divide that information into two parts: the information accounting for the useful regularity present in the…
After reviewing unnormalized and normalized information distances based on incomputable notions of Kolmogorov complexity, we discuss how Kolmogorov complexity can be approximated by data compression algorithms. We argue that optimal…
Normalized information distance (NID) uses the theoretical notion of Kolmogorov complexity, which for practical purposes is approximated by the length of the compressed version of the file involved, using a real-world compression program.…
We develop general methods to obtain fast (polynomial time) estimates of the cardinality of a combinatorially defined set via solving some randomly generated optimization problems on the set. Geometrically, we estimate the cardinality of a…
Kolmogorov complexity is a measure of the information contained in a binary string. We investigate here the notion of quantum Kolmogorov complexity, a measure of the information required to describe a quantum state. We show that for any…
In analogy with classical Kolmogorov complexity, we develop a theory of the algorithmic information in bits contained in any one of continuously many pure quantum states: quantum Kolmogorov complexity. Classical Kolmogorov complexity coincides…
The twenty-first century is a data-driven era where human activities and behavior, physical phenomena, scientific discoveries, technology advancements, and almost everything that happens in the world resulting in massive generation,…
The number-theoretic codes are a class of codes defined by single or multiple congruences and are mainly used for correcting insertion and deletion errors. Since the number-theoretic codes are generally non-linear, the analysis method for…
This paper is an extended abstract of the dissertation presented by the author for the doctoral degree in physics and mathematics (in Russia). The main characteristic studied in the dissertation is combinatorial complexity, which is a…
Two-party one-way quantum communication has been extensively studied in the recent literature. We target the size of minimal information that is necessary for a feasible party to finish a given combinatorial task, such as distinction of…