Related papers: Optimal Single Sample Tests for Structured versus …
The Gaussian graphical model is routinely employed to model the joint distribution of multiple random variables. The graph it induces is not only useful for describing the relationship between random variables but also critical for…
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing,…
We consider a two-sample hypothesis testing problem, where the distributions are defined on the space of undirected graphs, and one has access to only one observation from each model. A motivating example for this problem is comparing the…
Hypothesis testing in singular statistical models is often regarded as inherently problematic due to non-identifiability and degeneracy of the Fisher information. We show that the fundamental obstruction to testing in such models is not…
We propose a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. Given samples from $s$ distributions, $p_1, p_2, \ldots, p_s$, we design testers for the…
In this paper we propose a nonparametric graphical test based on optimal matching, for assessing the equality of multiple unknown multivariate probability distributions. Our procedure pools the data from the different classes to create a…
Random geometric graphs are widely used in modeling geometry and dependence structure in networks. In a random geometric graph, nodes are independently generated from some probability distribution $F$ over a metric space, and edges link…
Distribution testing deals with what information can be deduced about an unknown distribution over $\{1,\ldots,n\}$, where the algorithm is only allowed to obtain a relatively small number of independent samples from the distribution. In…
Structure discovery in graphical models is the determination of the topology of a graph that encodes conditional independence properties of the joint distribution of all variables in the model. For some class of probability distributions,…
We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing…
Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution? Similarly, is it possible to distinguish whether…
In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one \emph{single} sample from each distribution. We study mean estimation and linear…
Distribution testing can be described as follows: $q$ samples are being drawn from some unknown distribution $P$ over a known domain $[n]$. After the sampling process, a decision must be made about whether $P$ holds some property, or is far…
Dependency networks (Heckerman et al., 2000) provide a flexible framework for modeling complex systems with many variables by combining independently learned local conditional distributions through pseudo-Gibbs sampling. Despite their…
Learning to sample from complex unnormalized distributions is a fundamental challenge in computational physics and machine learning. While score-based and variational methods have achieved success in continuous domains, extending them to…
We investigate the unsupervised node classification problem on random hypergraphs under the non-uniform Hypergraph Stochastic Block Model (HSBM) with two equal-sized communities. In this model, edges appear independently with probabilities…
In this paper, we study the task of detecting the edge dependency between two weighted random graphs. We formulate this task as a simple hypothesis testing problem, where under the null hypothesis, the two observed graphs are statistically…
Two-sample hypothesis testing-determining whether two sets of data are drawn from the same distribution-is a fundamental problem in statistics and machine learning with broad scientific applications. In the context of nonparametric testing,…
There have been two separate lines of work on estimating Ising models: (1) estimating them from multiple independent samples under minimal assumptions about the model's interaction matrix; and (2) estimating them from one sample in…
We give a general unified method that can be used for $L_1$ {\em closeness testing} of a wide range of univariate structured distribution families. More specifically, we design a sample optimal and computationally efficient algorithm for…