Related papers: An Algorithmic and a geometric characterization of…
We show that the class of conditional distributions satisfying the coarsening at random (CAR) property for discrete data has a simple and robust algorithmic description based on randomized uniform multicovers: combinatorial objects…
In recent years, a popular nonparametric model for coarsened data has been an assumption on the coarsening mechanism called coarsening at random (CAR). It has been conjectured in several papers that this assumption cannot be tested by the data,…
We develop estimation for potentially high-dimensional additive structural equation models. A key component of our approach is to decouple order search among the variables from feature or edge selection in a directed acyclic graph encoding…
Higher-dimensional orthogonal packing problems have a wide range of practical applications, including packing, cutting, and scheduling. Previous efforts for exact algorithms have been unable to avoid structural problems that appear for…
Conditional auto-regressive (CAR) distributions are widely used to induce spatial dependence in the geographic analysis of areal data. These distributions establish multivariate dependence networks by defining conditional relationships…
In many applications involving binary variables, only pairwise dependence measures, such as correlations, are available. However, for multi-way tables involving more than two variables, these quantities do not uniquely determine the joint…
We study the predictability of emergent phenomena in complex systems. Using nearest neighbor, one-dimensional Cellular Automata (CA) as an example, we show how to construct local coarse-grained descriptions of CA in all classes of Wolfram's…
Standard supervised learning optimizes for predictive accuracy but remains agnostic to the internal geometry of learned features, often yielding representations that are entangled and brittle. We propose Class-Conditional Activation…
Causal Abstraction (CA) theory provides a principled framework for relating causal models that describe the same system at different levels of granularity while ensuring interventional consistency between them. Recent methods for learning…
Covariate-adaptive randomization (CAR) procedures are frequently used in comparative studies to increase the covariate balance across treatment groups. However, because randomization inevitably uses the covariate information when forming…
A Hadamard-Hitchcock decomposition of a multidimensional array is a decomposition that expresses the latter as a Hadamard product of several tensor rank decompositions. Such decompositions can encode probability distributions that arise…
We clarify relationships between conditional (CAR) and simultaneous (SAR) autoregressive models. We review the literature on this topic and find that it is mostly incomplete. Our main result is that a SAR model can be written as a unique…
We study generalised additive models, with shape restrictions (e.g. monotonicity, convexity, concavity) imposed on each component of the additive prediction function. We show that this framework facilitates a nonparametric estimator of each…
We develop an algorithmic theory of convex optimization over discrete sets. Using a combination of algebraic and geometric tools we are able to provide polynomial time algorithms for solving broad classes of convex combinatorial…
In distributionally robust optimization the probability distribution of the uncertain problem parameters is itself uncertain, and a fictitious adversary, e.g., nature, chooses the worst distribution from within a known ambiguity set. A…
The Heard-Of model is a simple and relatively expressive model of distributed computation. Because of this, it has gained considerable attention from the verification community. We give a characterization of all algorithms solving consensus…
Multivariate information theory provides a general and principled framework for understanding how the components of a complex system are connected. Existing analyses are coarse in nature -- built up from characterizations of discrete…
We give a principled method for decomposing the predictive uncertainty of a model into aleatoric and epistemic components with explicit semantics relating them to the real-world data distribution. While many works in the literature have…
We develop a model to describe the properties of random assemblies of polydisperse hard spheres. We show that the key features to describe the system are (i) the dependence between the free volume of a sphere and the various coordination…
We formulate the statistics of the discrete multicomponent fragmentation event using a methodology borrowed from statistical mechanics. We generate the ensemble of all feasible distributions that can be formed when a single integer…