Related papers: Efficient Truncated Statistics with Unknown Trunca…
We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches. Here, we assume $m$ users, all of whom have samples from some underlying distribution $p$ over $1, \ldots, n$. Each user sends a batch of $k$ i.i.d.…
In the statistical inference for long range dependent time series the shape of the limit distribution typically depends on unknown parameters. Therefore, we propose to use subsampling. We show the validity of subsampling for general…
We constraint on computer the best linear unbiased generalized statistics of random field for the best linear unbiased generalized statistics of an unknown constant mean of random field and derive the numerical generalized least-squares…
The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to…
Hyperparameter tuning is a challenging problem especially when the system itself involves uncertainty. Due to noisy function evaluations, optimization under uncertainty can be computationally expensive. In this paper, we present a novel…
We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance…
We deal with the efficient parallelization of Bayesian global optimization algorithms, and more specifically of those based on the expected improvement criterion and its variants. A closed form formula relying on multivariate Gaussian…
We propose a simple method that combines neural networks and Gaussian processes. The proposed method can estimate the uncertainty of outputs and flexibly adjust target functions where training data exist, which are advantages of Gaussian…
It is often the case in Statistics that one needs to compute sums of infinite series, especially in marginalising over discrete latent variables. This has become more relevant with the popularization of gradient-based techniques (e.g.…
We consider the problem of predicting as well as the best linear combination of d given functions in least squares regression under L^\infty constraints on the linear combination. When the input distribution is known, there already exists…
Gaussian Bayesian networks (a.k.a. linear Gaussian structural equation models) are widely used to model causal interactions among continuous variables. In this work, we study the problem of learning a fixed-structure Gaussian Bayesian…
High-dimensional covariance estimation is notoriously sensitive to outliers. While statistically optimal estimators exist for general heavy-tailed distributions, they often rely on computationally expensive techniques like semidefinite…
We study stochastic graph optimization problems in a novel distributed setting. As in the standard centralized setting, a random subgraph $G^*$ of a known base graph $G$ is realized by including each edge $e$ independently with a known…
As in standard linear regression, in truncated linear regression, we are given access to observations $(A_i, y_i)_i$ whose dependent variable equals $y_i= A_i^{\rm T} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest…
We propose a new approach for estimating the parameters of a probability distribution. It consists on combining two new methods of estimation. The first is based on the definition of a new distance measuring the difference between…
Distributed systems have been widely used in practice to accomplish data analysis tasks of huge scales. In this work, we target on the estimation problem of generalized linear models on a distributed system with nonrandomly distributed…
Gaussian graphical regressions have emerged as a powerful approach for regressing the precision matrix of a Gaussian graphical model on covariates, which, unlike traditional Gaussian graphical models, can help determine how graphs are…
In this work, we consider the deterministic optimization using random projections as a statistical estimation problem, where the squared distance between the predictions from the estimator and the true solution is the error metric. In…
We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved…
For subspace estimation with an unknown colored noise, Factor Analysis (FA) is a good candidate for replacing the popular eigenvalue decomposition (EVD). Finding the unknowns in factor analysis can be done by solving a non-linear least…