Related papers: Mutual Dimension
Advances in computational power and hardware efficiency have enabled tackling increasingly complex, high-dimensional problems. While artificial intelligence (AI) achieves remarkable results, the interpretability of high-dimensional…
A data processing inequality states that the quantity of shared information between two entities (e.g. signals, strings) cannot be significantly increased when one of the entities is processed by certain kinds of transformations. In this…
In 2004, Dai, Lathrop, Lutz, and Mayordomo defined and investigated the finite-state dimension (a finite-state version of algorithmic dimension) of a sequence $S \in \Sigma^\infty$ and, in 2018, Case and Lutz defined and investigated the…
High-dimensional Bayesian procedures often exhibit behavior that is effectively low dimensional, even when the ambient parameter space is large or infinite-dimensional. This phenomenon underlies the success of shrinkage priors,…
Let $f : X \to Y$ be a map of compact metric spaces. A classical theorem of Hurewicz asserts that $\dim X \leq \dim Y + \dim f$, where $\dim f = \sup \{\dim f^{-1}(y) : y \in Y \}$. The first author conjectured that {\em $\dim Y + \dim f$ in…
The mutual information between two jointly distributed random variables $X$ and $Y$ is a functional of the joint distribution $P_{XY},$ which is sometimes difficult to handle or estimate. A coarser description of the statistical behavior of…
Given a set of points in the Euclidean space $\mathbb{R}^\ell$ with $\ell>1$, the pairwise distances between the points are determined by their spatial location and the metric $d$ that we endow $\mathbb{R}^\ell$ with. Hence, the distance…
We introduce the Mutual Information Machine (MIM), a probabilistic auto-encoder for learning joint distributions over observations and latent variables. MIM reflects three design principles: 1) low divergence, to encourage the encoder and…
We investigate when the local Lipschitz property of the real-valued function $g(z) = d_Y (f(z),A)$ implies the global Lipschitz property of the mapping $f:X\to Y$ between the metric spaces $(X,d_X)$ and $(Y,d_Y)$. Here, $d_Y(y,A)$ denotes…
Whether the goal is to analyze voting behavior, locate facilities, or recommend products, the problem of translating between (ordinal) rankings and (numerical) utilities arises naturally in many contexts. This task is commonly approached by…
We prove that given a computable metric space and two computable measures, the set of points that have high universal uniform test scores with respect to the first measure will have a lower bound with respect to the second measure. This…
We study algorithmic problems on subsets of Euclidean space of low fractal dimension. These spaces are the subject of intensive study in various branches of mathematics, including geometry, topology, and measure theory. There are several…
We consider the question of which compact metric spaces can be obtained as a Lipschitz image of the middle third Cantor set, or more generally, as a Lipschitz image of a subset of a given compact metric space. In the general case we prove that…
Understanding distance metrics in high-dimensional spaces is crucial for various fields such as data analysis, machine learning, and optimization. The Manhattan distance, a fundamental metric in multi-dimensional settings, measures the…
One of the most fundamental questions one can ask about a pair of random variables X and Y is the value of their mutual information. Unfortunately, this task is often stymied by the extremely large dimension of the variables. We might hope…
We consider two disjoint sets of points. If at least one of the sets can be embedded into a Euclidean space, then we provide sufficient conditions for the two sets to be jointly embedded in one Euclidean space. In this joint Euclidean…
Metric embeddings are a widely used method in algorithm design, where generally a ``complex'' metric is embedded into a simpler, lower-dimensional one. Historically, the theoretical computer science community has focused on bi-Lipschitz…
Estimating mutual information (MI) between two continuous random variables $X$ and $Y$ makes it possible to capture non-linear dependencies between them non-parametrically. As such, MI estimation lies at the core of many data science applications.…
The intrinsic dimensionality refers to the ``true'' dimensionality of the data, as opposed to the dimensionality of the data representation. For example, when attributes are highly correlated, the intrinsic dimensionality can be much lower…
Geometry and topology have had an impact far beyond their origins in pure mathematics, providing a solid foundation for many applicable tools. Typically, real-world data are represented as vectors, forming a linear subspace for a given…