Related papers: Efficient Truncated Statistics with Unknown Trunca…

200 papers

Accurate assessment of systematic uncertainties is an increasingly vital task in physics studies, where large, high-dimensional datasets, like those collected at the Large Hadron Collider, hold the key to new discoveries. Common approaches…

Methodology · Statistics 2025-10-02 Alexis Romero, Kyle Cranmer, Daniel Whiteson

Consider the problem of estimating the mean of a Gaussian random vector when the mean vector is assumed to be in a given convex set. The most natural solution is to take the Euclidean projection of the data vector on to this convex set; in…

Statistics Theory · Mathematics 2014-11-21 Sourav Chatterjee

We show a statistical version of Taylor's theorem and apply this result to non-parametric density estimation from truncated samples, which is a classical challenge in Statistics \cite{woodroofe1985estimating, stute1993almost}. The…

Statistics Theory · Mathematics 2021-07-01 Constantinos Daskalakis, Vasilis Kontonis, Christos Tzamos, Manolis Zampetakis

We study a distributed estimation problem in which two remotely located parties, Alice and Bob, observe an unlimited number of i.i.d. samples corresponding to two different parts of a random vector. Alice can send $k$ bits on average to…

Statistics Theory · Mathematics 2018-06-26 Uri Hadar, Ofer Shayevitz

We consider the estimation of a sparse parameter vector from measurements corrupted by white Gaussian noise. Our focus is on unbiased estimation as a setting under which the difficulty of the problem can be quantified analytically. We show…

Information Theory · Computer Science 2010-02-02 Alexander Jung, Zvika Ben-Haim, Franz Hlawatsch, Yonina C. Eldar

The majority of research on efficient and scalable algorithms in computational science and engineering has focused on the forward problem: given parameter inputs, solve the governing equations to determine output quantities of interest. In…

Optimization and Control · Mathematics 2015-11-05 Tobin Isaac, Noemi Petra, Georg Stadler, Omar Ghattas

We consider the problem of model selection in Gaussian Markov fields in the sample deficient scenario. The benchmark information-theoretic results in the case of d-regular graphs require the number of samples to be at least proportional to…

Machine Learning · Statistics 2018-03-30 Ilya Soloveychik, Vahid Tarokh

Parametric stochastic simulators are ubiquitous in science, often featuring high-dimensional input parameters and/or an intractable likelihood. Performing Bayesian parameter inference in this context can be challenging. We present a neural…

Machine Learning · Statistics 2021-10-27 Benjamin Kurt Miller, Alex Cole, Patrick Forré, Gilles Louppe, Christoph Weniger

Robust mean estimation is one of the most important problems in statistics: given a set of samples in $\mathbb{R}^d$ where an $\alpha$ fraction are drawn from some distribution $D$ and the rest are adversarially corrupted, we aim to…

Machine Learning · Computer Science 2022-12-07 Shiwei Zeng, Jie Shen

Randomized algorithms, such as randomized sketching or stochastic optimization, are a promising approach to ease the computational burden in analyzing large datasets. However, randomized algorithms also produce non-deterministic outputs,…

Methodology · Statistics 2025-05-13 Zhixiang Zhang, Sokbae Lee, Edgar Dobriban

Low-rank tensor models are widely used in statistics. However, most existing methods rely heavily on the assumption that data follows a sub-Gaussian distribution. To address the challenges associated with heavy-tailed distributions…

Methodology · Statistics 2025-09-16 Xiaoyu Zhang, Di Wang, Guodong Li, Defeng Sun

The article starts with new aliasing-truncation error upper bounds in the sampling theorem for non-bandlimited stochastic signals. Then, it investigates $L_p([0,T])$ approximations of sub-Gaussian random signals. Explicit truncation error…

Information Theory · Computer Science 2016-08-15 Yuriy Kozachenko, Andriy Olenko

In high-dimensional Bayesian statistics, various methods have been developed, including prior distributions that induce parameter sparsity to handle many parameters. Yet, these approaches often overlook the rich spectral structure of the…

Statistics Theory · Mathematics 2025-05-06 Tomoya Wakayama, Masaaki Imaizumi

We consider the problem of identifying the parameters of an unknown mixture of two arbitrary $d$-dimensional gaussians from a sequence of independent random samples. Our main results are upper and lower bounds giving a computationally…

Machine Learning · Computer Science 2015-05-19 Moritz Hardt, Eric Price

The truncated plurigaussian model is often used to simulate the spatial distribution of random categorical variables such as geological facies. The problems addressed in this paper are the estimation of parameters of the truncation map for…

Statistics Theory · Mathematics 2015-08-07 Alina Astrakova, Dean S. Oliver, Christian Lantuéjoul

Algorithmic Gaussianization is a phenomenon that can arise when using randomized sketching or sampling methods to produce smaller representations of large datasets: For certain tasks, these sketched representations have been observed to…

Machine Learning · Computer Science 2023-07-28 Michał Dereziński

We introduce the truncated Gaussian graphical model (TGGM) as a novel framework for designing statistical models for nonlinear learning. A TGGM is a Gaussian graphical model (GGM) with a subset of variables truncated to be nonnegative. The…

Machine Learning · Statistics 2016-11-22 Qinliang Su, Xuejun Liao, Changyou Chen, Lawrence Carin

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic…

Machine Learning · Computer Science 2026-02-27 Alkis Kalavasis, Anay Mehrotra, Manolis Zampetakis, Felix Zhou, Ziyu Zhu

Scalable Gaussian Process methods are computationally attractive, yet introduce modeling biases that require rigorous study. This paper analyzes two common techniques: early truncated conjugate gradients (CG) and random Fourier features…

Machine Learning · Computer Science 2021-06-30 Andres Potapczynski, Luhuan Wu, Dan Biderman, Geoff Pleiss, John P. Cunningham

We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved…

Data Structures and Algorithms · Computer Science 2017-11-21 Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart