Related papers: Efficient Truncated Statistics with Unknown Trunca…

200 papers

We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches. Here, we assume $m$ users, all of whom have samples from some underlying distribution $p$ over $1, \ldots, n$. Each user sends a batch of $k$ i.i.d.…

Data Structures and Algorithms · Computer Science 2019-11-07 Sitan Chen, Jerry Li, Ankur Moitra

In the statistical inference for long range dependent time series the shape of the limit distribution typically depends on unknown parameters. Therefore, we propose to use subsampling. We show the validity of subsampling for general…

Statistics Theory · Mathematics 2016-10-20 Annika Betken, Martin Wendler

We constrain on a computer the best linear unbiased generalized statistics of a random field for the best linear unbiased generalized statistics of an unknown constant mean of a random field and derive the numerical generalized least-squares…

Numerical Analysis · Computer Science 2011-11-18 Tomasz Suslo

The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to…

Hyperparameter tuning is a challenging problem especially when the system itself involves uncertainty. Due to noisy function evaluations, optimization under uncertainty can be computationally expensive. In this paper, we present a novel…

Machine Learning · Computer Science 2025-10-09 Akash Yadav, Ruda Zhang

We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance…

Data Structures and Algorithms · Computer Science 2024-07-08 Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

We deal with the efficient parallelization of Bayesian global optimization algorithms, and more specifically of those based on the expected improvement criterion and its variants. A closed form formula relying on multivariate Gaussian…

Machine Learning · Statistics 2016-09-12 Sébastien Marmin, Clément Chevalier, David Ginsbourger

We propose a simple method that combines neural networks and Gaussian processes. The proposed method can estimate the uncertainty of outputs and flexibly adjust target functions where training data exist, which are advantages of Gaussian…

Machine Learning · Statistics 2017-07-20 Tomoharu Iwata, Zoubin Ghahramani

It is often the case in Statistics that one needs to compute sums of infinite series, especially in marginalising over discrete latent variables. This has become more relevant with the popularization of gradient-based techniques (e.g.…

Methodology · Statistics 2025-03-11 Luiz Max Carvalho, Wellington J. Silva, Guido A. Moreira

We consider the problem of predicting as well as the best linear combination of d given functions in least squares regression under L^\infty constraints on the linear combination. When the input distribution is known, there already exists…

Statistics Theory · Mathematics 2011-09-14 Jean-Yves Audibert, Olivier Catoni

Gaussian Bayesian networks (a.k.a. linear Gaussian structural equation models) are widely used to model causal interactions among continuous variables. In this work, we study the problem of learning a fixed-structure Gaussian Bayesian…

Data Structures and Algorithms · Computer Science 2022-10-19 Arnab Bhattacharyya, Davin Choo, Rishikesh Gajjala, Sutanu Gayen, Yuhao Wang

High-dimensional covariance estimation is notoriously sensitive to outliers. While statistically optimal estimators exist for general heavy-tailed distributions, they often rely on computationally expensive techniques like semidefinite…

Machine Learning · Statistics 2026-01-06 Even He

We study stochastic graph optimization problems in a novel distributed setting. As in the standard centralized setting, a random subgraph $G^*$ of a known base graph $G$ is realized by including each edge $e$ independently with a known…

Data Structures and Algorithms · Computer Science 2026-05-21 Keren Censor-Hillel, Aditi Dudeja, George Giakkoupis

As in standard linear regression, in truncated linear regression, we are given access to observations $(A_i, y_i)_i$ whose dependent variable equals $y_i= A_i^{\rm T} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest…

Machine Learning · Computer Science 2020-07-30 Constantinos Daskalakis, Dhruv Rohatgi, Manolis Zampetakis

We propose a new approach for estimating the parameters of a probability distribution. It consists of combining two new methods of estimation. The first is based on the definition of a new distance measuring the difference between…

Methodology · Statistics 2008-12-30 Ahmed Guellil, Tewfik Kernane

Distributed systems have been widely used in practice to accomplish data analysis tasks of huge scales. In this work, we target the estimation problem of generalized linear models on a distributed system with nonrandomly distributed…

Methodology · Statistics 2020-04-07 Feifei Wang, Danyang Huang, Yingqiu Zhu, Hansheng Wang

Gaussian graphical regressions have emerged as a powerful approach for regressing the precision matrix of a Gaussian graphical model on covariates, which, unlike traditional Gaussian graphical models, can help determine how graphs are…

Methodology · Statistics 2025-01-17 Xuran Meng, Jingfei Zhang, Yi Li

In this work, we consider the deterministic optimization using random projections as a statistical estimation problem, where the squared distance between the predictions from the estimator and the true solution is the error metric. In…

Optimization and Control · Mathematics 2020-06-16 Srivatsan Sridhar, Mert Pilanci, Ayfer Özgür

We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved…

Data Structures and Algorithms · Computer Science 2017-11-21 Samuel B. Hopkins, Jerry Li

For subspace estimation with an unknown colored noise, Factor Analysis (FA) is a good candidate for replacing the popular eigenvalue decomposition (EVD). Finding the unknowns in factor analysis can be done by solving a non-linear least…

Computation · Statistics 2018-04-03 Ahmad Mouri Sardarabadi, Alle-Jan van der Veen, L. V. E. Koopmans