Related papers: Statistical Estimation from Dependent Data
The standard linear and logistic regression models assume that the response variables are independent, but share the same linear relationship to their corresponding vectors of covariates. The assumption that the response variables are…
We investigate the problem of statistical inference for logistic regression with high-dimensional covariates in settings where dependence among individuals is induced by an underlying Markov random field. Going beyond the pairwise…
In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent. For example, in hypertext classification, the labels of linked pages are highly correlated. A…
Logistic regression is key method for modeling the probability of a binary outcome based on a collection of covariates. However, the classical formulation of logistic regression relies on the independent sampling assumption, which is often…
Gaussian mixture models are widely used to model data generated from multiple latent sources. Despite its popularity, most theoretical research assumes that the labels are either independent and identically distributed, or follows a Markov…
In this paper, we focus on the problem of statistical dependence estimation using characteristic functions. We propose a statistical dependence measure, based on the maximum-norm of the difference between joint and product-marginal…
Recently, graph (network) data is an emerging research area in artificial intelligence, machine learning and statistics. In this work, we are interested in whether node's labels (people's responses) are affected by their neighbor's features…
In a large social network whose members harbor binary sentiments towards an issue, we investigate the asymptotic accuracy of sentiment detection. We model the user sentiments by an Ising Markov random field model and allow the user…
The Ising model is a celebrated example of a Markov random field, introduced in statistical physics to model ferromagnetism. This is a discrete exponential family with binary outcomes, where the sufficient statistic involves a quadratic…
Measuring conditional dependencies among the variables of a network is of great interest to many disciplines. This paper studies some shortcomings of the existing dependency measures in detecting direct causal influences or their lack of…
We present a novel deep learning method for estimating time-dependent parameters in Markov processes through discrete sampling. Departing from conventional machine learning, our approach reframes parameter approximation as an optimization…
Dependency networks (Heckerman et al., 2000) provide a flexible framework for modeling complex systems with many variables by combining independently learned local conditional distributions through pseudo-Gibbs sampling. Despite their…
Existing methods for differentiable structure learning in discrete data typically assume that the data are generated from specific structural equation models. However, these assumptions may not align with the true data-generating process,…
The assumption that data samples are independent and identically distributed (iid) is standard in many areas of statistics and machine learning. Nevertheless, in some settings, such as social networks, infectious disease modeling, and…
This paper deals with variable selection in multivariate linear regression model when the data are observations on a spatial domain being a grid of sites in $\mathbb{Z}^d$ with $d\geqslant 2$. We use a criterion that allows to characterize…
Dependency networks (Heckerman et al., 2000) are potential probabilistic graphical models for systems comprising a large number of variables. Like Bayesian networks, the structure of a dependency network is represented by a directed graph,…
An inductive probabilistic classification rule must generally obey the principles of Bayesian predictive inference, such that all observed and unobserved stochastic quantities are jointly modeled and the parameter uncertainty is fully…
Statistical dependence measures like mutual information is ideal for analyzing autoencoders, but it can be ill-posed for deterministic, static, noise-free networks. We adopt the variational (Gaussian) formulation that makes dependence among…
We propose a simple and efficient method for ranking features in multi-label classification. The method produces a ranking of features showing their relevance in predicting labels, which in turn allows to choose a final subset of features.…
The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the "in-domain" test data is drawn from a distribution…