Related papers: Efficient Truncated Statistics with Unknown Trunca…

Efficiently Learning Structured Distributions from Untrusted Batches

We study the problem, introduced by Qiao and Valiant, of learning from untrusted batches. Here, we assume $m$ users, all of whom have samples from some underlying distribution $p$ over $1, \ldots, n$. Each user sends a batch of $k$ i.i.d.…

Data Structures and Algorithms · Computer Science 2019-11-07 Sitan Chen , Jerry Li , Ankur Moitra

Subsampling for General Statistics under Long Range Dependence with application to change point analysis

In the statistical inference for long range dependent time series the shape of the limit distribution typically depends on unknown parameters. Therefore, we propose to use subsampling. We show the validity of subsampling for general…

Statistics Theory · Mathematics 2016-10-20 Annika Betken , Martin Wendler

The Numerical Generalized Least-Squares Estimator of an Unknown Constant Mean of Random Field

We constraint on computer the best linear unbiased generalized statistics of random field for the best linear unbiased generalized statistics of an unknown constant mean of random field and derive the numerical generalized least-squares…

Numerical Analysis · Computer Science 2011-11-18 Tomasz Suslo

A Statistical Model for Predicting Generalization in Few-Shot Classification

The estimation of the generalization error of classifiers often relies on a validation set. Such a set is hardly available in few-shot learning scenarios, a highly disregarded shortcoming in the field. In these scenarios, it is common to…

Machine Learning · Computer Science 2023-03-29 Yassir Bendou , Vincent Gripon , Bastien Pasdeloup , Lukas Mauch , Stefan Uhlich , Fabien Cardinaux , Ghouthi Boukli Hacene , Javier Alonso Garcia

Bayesian Optimization under Uncertainty for Training a Scale Parameter in Stochastic Models

Hyperparameter tuning is a challenging problem especially when the system itself involves uncertainty. Due to noisy function evaluations, optimization under uncertainty can be computationally expensive. In this paper, we present a novel…

Machine Learning · Computer Science 2025-10-09 Akash Yadav , Ruda Zhang

Robust Sparse Mean Estimation via Sum of Squares

We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance…

Data Structures and Algorithms · Computer Science 2024-07-08 Ilias Diakonikolas , Daniel M. Kane , Sushrut Karmalkar , Ankit Pensia , Thanasis Pittas

Efficient batch-sequential Bayesian optimization with moments of truncated Gaussian vectors

We deal with the efficient parallelization of Bayesian global optimization algorithms, and more specifically of those based on the expected improvement criterion and its variants. A closed form formula relying on multivariate Gaussian…

Machine Learning · Statistics 2016-09-12 Sébastien Marmin , Clément Chevalier , David Ginsbourger

Improving Output Uncertainty Estimation and Generalization in Deep Learning via Neural Network Gaussian Processes

We propose a simple method that combines neural networks and Gaussian processes. The proposed method can estimate the uncertainty of outputs and flexibly adjust target functions where training data exist, which are advantages of Gaussian…

Machine Learning · Statistics 2017-07-20 Tomoharu Iwata , Zoubin Ghahramani

Adaptive truncation of infinite sums: applications to Statistics

It is often the case in Statistics that one needs to compute sums of infinite series, especially in marginalising over discrete latent variables. This has become more relevant with the popularization of gradient-based techniques (e.g.…

Methodology · Statistics 2025-03-11 Luiz Max Carvalho , Wellington J. Silva , Guido A. Moreira

Linear regression through PAC-Bayesian truncation

We consider the problem of predicting as well as the best linear combination of d given functions in least squares regression under L^\infty constraints on the linear combination. When the input distribution is known, there already exists…

Statistics Theory · Mathematics 2011-09-14 Jean-Yves Audibert , Olivier Catoni

Learning Sparse Fixed-Structure Gaussian Bayesian Networks

Gaussian Bayesian networks (a.k.a. linear Gaussian structural equation models) are widely used to model causal interactions among continuous variables. In this work, we study the problem of learning a fixed-structure Gaussian Bayesian…

Data Structures and Algorithms · Computer Science 2022-10-19 Arnab Bhattacharyya , Davin Choo , Rishikesh Gajjala , Sutanu Gayen , Yuhao Wang

Fast and Robust: Computationally Efficient Covariance Estimation for Sub-Weibull Vectors

High-dimensional covariance estimation is notoriously sensitive to outliers. While statistically optimal estimators exist for general heavy-tailed distributions, they often rely on computationally expensive techniques like semidefinite…

Machine Learning · Statistics 2026-01-06 Even He

Distributed Stochastic Graph Algorithms

We study stochastic graph optimization problems in a novel distributed setting. As in the standard centralized setting, a random subgraph $G^*$ of a known base graph $G$ is realized by including each edge $e$ independently with a known…

Data Structures and Algorithms · Computer Science 2026-05-21 Keren Censor-Hillel , Aditi Dudeja , George Giakkoupis

Truncated Linear Regression in High Dimensions

As in standard linear regression, in truncated linear regression, we are given access to observations $(A_i, y_i)_i$ whose dependent variable equals $y_i= A_i^{\rm T} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest…

Machine Learning · Computer Science 2020-07-30 Constantinos Daskalakis , Dhruv Rohatgi , Manolis Zampetakis

A New Approach of Point Estimation from Truncated or Grouped and Censored Data

We propose a new approach for estimating the parameters of a probability distribution. It consists on combining two new methods of estimation. The first is based on the definition of a new distance measuring the difference between…

Methodology · Statistics 2008-12-30 Ahmed Guellil , Tewfik Kernane

Efficient Estimation for Generalized Linear Models on a Distributed System with Nonrandomly Distributed Data

Distributed systems have been widely used in practice to accomplish data analysis tasks of huge scales. In this work, we target on the estimation problem of generalized linear models on a distributed system with nonrandomly distributed…

Methodology · Statistics 2020-04-07 Feifei Wang , Danyang Huang , Yingqiu Zhu , Hansheng Wang

Statistical Inference on High Dimensional Gaussian Graphical Regression Models

Gaussian graphical regressions have emerged as a powerful approach for regressing the precision matrix of a Gaussian graphical model on covariates, which, unlike traditional Gaussian graphical models, can help determine how graphs are…

Methodology · Statistics 2025-01-17 Xuran Meng , Jingfei Zhang , Yi Li

Lower Bounds and a Near-Optimal Shrinkage Estimator for Least Squares using Random Projections

In this work, we consider the deterministic optimization using random projections as a statistical estimation problem, where the squared distance between the predictions from the estimator and the true solution is the error metric. In…

Optimization and Control · Mathematics 2020-06-16 Srivatsan Sridhar , Mert Pilanci , Ayfer Özgür

Mixture Models, Robustness, and Sum of Squares Proofs

We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved…

Data Structures and Algorithms · Computer Science 2017-11-21 Samuel B. Hopkins , Jerry Li

Constrained Least Squares for Extended Complex Factor Analysis

For subspace estimation with an unknown colored noise, Factor Analysis (FA) is a good candidate for replacing the popular eigenvalue decomposition (EVD). Finding the unknowns in factor analysis can be done by solving a non-linear least…

Computation · Statistics 2018-04-03 Ahmad Mouri Sardarabadi , Alle-Jan van der Veen , L. V. E. Koopmans