Related papers: Estimating Random Variables from Random Sparse Obs…
In this paper, we study the challenge of feature selection based on a relatively small collection of sample pairs $\{(x_i, y_i)\}_{1 \leq i \leq m}$. The observations $y_i \in \mathbb{R}$ are thereby supposed to follow a noisy single-index…
Many existing approaches for estimating parameters in settings with distributional shifts operate under an invariance assumption. For example, under covariate shift, it is assumed that $p(y|x)$ remains invariant. We refer to such…
Random models of evolution are instrumental in extracting rates of microscopic evolutionary mechanisms from empirical observations on genetic variation in genome sequences. In this context it is necessary to know the statistical properties…
We consider a situation where the distribution of a random variable is being estimated by the empirical distribution of noisy measurements of that variable. This is common practice in, for example, teacher value-added models and other…
We consider a collection of independent random variables that are identically distributed, except for a small subset which follows a different, anomalous distribution. We study the problem of detecting which random variables in the…
Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed…
We derive fundamental sample complexity bounds for recovering sparse and structured signals for linear and nonlinear observation models including sparse regression, group testing, multivariate regression and problems with missing features.…
We consider the asymptotic behavior of posterior distributions and Bayes estimators based on observations which are required to be neither independent nor identically distributed. We give general results on the rate of convergence of the…
In many contexts such as queuing theory, spatial statistics, geostatistics and meteorology, data are observed at irregular spatial positions. One model of this situation involves considering the observation points as generated by a Poisson…
This paper studies the problem of learning the probability distribution $P_X$ of a discrete random variable $X$ using indirect and sequential samples. At each time step, we choose one of the $K$ possible functions, $g_1, \ldots, g_K$…
We propose and analyze a generalized splitting method to sample approximately from a distribution conditional on the occurrence of a rare event. This has important applications in a variety of contexts in operations research, engineering,…
We consider univariate regression estimation from an individual (non-random) sequence $(x_1,y_1),(x_2,y_2), \ldots \in \mathbb{R} \times \mathbb{R}$, which is stable in the sense that for each interval $A \subseteq \mathbb{R}$, (i) the limiting relative…
The prior distribution on parameters of a sampling distribution is the usual starting point for Bayesian uncertainty quantification. In this paper, we present a different perspective which focuses on missing observations as the source of…
Discovering causal relations is fundamental to reasoning and intelligence. In particular, observational causal discovery algorithms estimate the cause-effect relation between two random entities $X$ and $Y$, given $n$ samples from $P(X,Y)$.…
Let $(X,Y)\in\mathcal{X}\times \mathcal{Y}$ be a random couple with unknown distribution $P$. Let $\mathcal{G}$ be a class of measurable functions and $\ell$ a loss function. The problem of statistical learning deals with the estimation of the…
Recently established directed dependence measures for pairs $(X,Y)$ of random variables build upon the natural idea of comparing the conditional distributions of $Y$ given $X=x$ with the marginal distribution of $Y$. They assign pairs…
We examine the linear regression problem in a challenging high-dimensional setting with correlated predictors where the vector of coefficients can vary from sparse to dense. In this setting, we propose a combination of probabilistic…
We consider a sparse high-dimensional varying coefficients model with random effects, a flexible linear model allowing covariates and coefficients to have a functional dependence with time. For each individual, we observe discretely sampled…
Extreme value statistics provides accurate estimates for the small occurrence probabilities of rare events. While theory and statistical tools for univariate extremes are well-developed, methods for high-dimensional and complex data sets…
The problem of statistical learning is to construct a predictor of a random variable $Y$ as a function of a related random variable $X$ on the basis of an i.i.d. training sample from the joint distribution of $(X,Y)$. Allowable predictors…