Related papers: Making Progress Based on False Discoveries

Subsampling Suffices for Adaptive Data Analysis

Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down…

Machine Learning · Computer Science 2024-09-25 Guy Blanc

Preventing False Discovery in Interactive Data Analysis is Hard

We show that, under a standard hardness assumption, there is no computationally efficient algorithm that given $n$ samples from an unknown distribution can give valid answers to $n^{3+o(1)}$ adaptively chosen statistical queries. A…

Machine Learning · Computer Science 2014-08-08 Moritz Hardt, Jonathan Ullman

Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares

In modern data analysis, random sampling is an efficient and widely-used strategy to overcome the computational difficulties brought by large sample size. In previous studies, researchers conducted random sampling which is according to the…

Machine Learning · Statistics 2018-03-05 Rong Zhu

Algorithmic Stability for Adaptive Data Analysis

Adaptivity is an important feature of data analysis---the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model,…

Machine Learning · Computer Science 2015-11-10 Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman

More General Queries and Less Generalization Error in Adaptive Data Analysis

Adaptivity is an important feature of data analysis---typically the choice of questions asked about a dataset depends on previous interactions with the same dataset. However, generalization error is typically bounded in a non-adaptive…

Machine Learning · Computer Science 2015-11-11 Raef Bassily, Adam Smith, Thomas Steinke, Jonathan Ullman

Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)

We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on…

Machine Learning · Computer Science 2013-06-11 Francis Bach, Eric Moulines

Algorithmic Connections Between Active Learning and Stochastic Convex Optimization

Interesting theoretical associations have been established by recent papers between the fields of active learning and stochastic convex optimization due to the common role of feedback in sequential querying mechanisms. In this paper, we…

Machine Learning · Computer Science 2015-05-19 Aaditya Ramdas, Aarti Singh

Adaptive Sampling Strategies for Stochastic Optimization

In this paper, we propose a stochastic optimization method that adaptively controls the sample size used in the computation of gradient approximations. Unlike other variance reduction techniques that either require additional storage or the…

Optimization and Control · Mathematics 2017-11-01 Raghu Bollapragada, Richard Byrd, Jorge Nocedal

Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization

Online minimization of an unknown convex function over the interval $[0,1]$ is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point. Without knowing…

Machine Learning · Statistics 2020-02-21 Sattar Vakili, Sudeep Salgia, Qing Zhao

Adaptive Strategies in Non-convex Optimization

An algorithm is said to be adaptive to a certain parameter (of the problem) if it does not need a priori knowledge of such a parameter but performs competitively to those that know it. This dissertation presents our work on adaptive…

Machine Learning · Computer Science 2023-07-10 Zhenxun Zhuang

Adaptive Sampling for Convex Regression

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences. We present a function-specific measure of…

Machine Learning · Computer Science 2018-08-28 Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

Analysis of Noisy Evolutionary Optimization When Sampling Fails

In noisy evolutionary optimization, sampling is a common strategy to deal with noise. By the sampling strategy, the fitness of a solution is evaluated multiple times (called sample size) independently, and its true fitness is then…

Neural and Evolutionary Computing · Computer Science 2022-11-29 Chao Qian, Chao Bian, Yang Yu, Ke Tang, Xin Yao

Lp and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective

A usual problem in statistics consists in estimating the minimizer of a convex function. When we have to deal with large samples taking values in high dimensional spaces, stochastic gradient algorithms and their averaged versions are…

Statistics Theory · Mathematics 2022-01-12 Antoine Godichon-Baggioni

Efficient Adaptive Data Analysis over Dense Distributions

Modern data workflows are inherently adaptive, repeatedly querying the same dataset to refine and validate sequential decisions, but such adaptivity can lead to overfitting and invalid statistical inference. Adaptive Data Analysis (ADA)…

Machine Learning · Computer Science 2026-02-10 Joon Suk Huh

First Order Stochastic Optimization with Oblivious Noise

We initiate the study of stochastic optimization with oblivious noise, broadly generalizing the standard heavy-tailed noise setup. In our setting, in addition to random observation noise, the stochastic gradient may be subject to…

Data Structures and Algorithms · Computer Science 2024-08-06 Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos

The Sample Complexity Of ERMs In Stochastic Convex Optimization

Stochastic convex optimization is one of the most well-studied models for learning in modern machine learning. Nevertheless, a central fundamental question in this setup remained unresolved: "How many data points must be observed so that…

Machine Learning · Computer Science 2023-11-10 Daniel Carmon, Roi Livni, Amir Yehudayoff

Instance-optimal stochastic convex optimization: Can we improve upon sample-average and robust stochastic approximation?

We study the unconstrained minimization of a smooth and strongly convex population loss function under a stochastic oracle that introduces both additive and multiplicative noise; this is a canonical and widely-studied setting that arises…

Optimization and Control · Mathematics 2026-03-27 Liwei Jiang, Ashwin Pananjady

A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning

In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed…

Machine Learning · Statistics 2017-02-07 Aleksandr Aravkin, Damek Davis

Approximate Newton-based statistical inference using only stochastic gradients

We present a novel statistical inference framework for convex empirical risk minimization, using approximate stochastic Newton steps. The proposed algorithm is based on the notion of finite differences and allows the approximation of a…

Machine Learning · Computer Science 2019-02-06 Tianyang Li, Anastasios Kyrillidis, Liu Liu, Constantine Caramanis

Adaptive Data Analysis for Growing Data

Reuse of data in adaptive workflows poses challenges regarding overfitting and the statistical validity of results. Previous work has demonstrated that interacting with data via differentially private algorithms can mitigate overfitting,…

Machine Learning · Computer Science 2025-11-13 Neil G. Marchant, Benjamin I. P. Rubinstein