
Related papers: Adaptive Data Analysis for Growing Data

200 papers

Overfitting is the bane of data analysts, even when data are plentiful. Formal approaches to understanding this problem focus on statistical inference and generalization of individual analysis procedures. Yet the practice of data analysis…

Machine Learning · Computer Science 2015-09-28 Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, Aaron Roth

Adaptivity is an important feature of data analysis: the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model,…

Machine Learning · Computer Science 2015-11-10 Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman

Most work on adaptive data analysis assumes that samples in the dataset are independent. When correlations are allowed, even the non-adaptive setting can become intractable, unless some structural constraints are imposed. To address this,…

Data Structures and Algorithms · Computer Science 2025-11-13 Emma Rapoport, Edith Cohen, Uri Stemmer

Large organizations have seamlessly incorporated data-driven decision making in their operations. However, as data volumes increase, expensive big data infrastructures are called to rescue. In this setting, analytics tasks become very…

Databases · Computer Science 2020-03-17 Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou

Traditional statistical analysis requires that the analysis process and data are independent. By contrast, the new field of adaptive data analysis hopes to understand and provide algorithms and accuracy guarantees for research as it is…

Machine Learning · Computer Science 2017-03-22 Sam Elder

Adaptivity is an important feature of data analysis: typically the choice of questions asked about a dataset depends on previous interactions with the same dataset. However, generalization error is typically bounded in a non-adaptive…

Machine Learning · Computer Science 2015-11-11 Raef Bassily, Adam Smith, Thomas Steinke, Jonathan Ullman

Adaptive data analysis is frequently criticized for its pessimistic generalization guarantees. The source of these pessimistic bounds is a model that permits arbitrary, possibly adversarial analysts that optimally use information to bias…

Machine Learning · Computer Science 2019-05-14 Tijana Zrnic, Moritz Hardt

In adaptive data analysis, the user makes a sequence of queries on the data, where at each step the choice of query may depend on the results in previous steps. The releases are often randomized in order to reduce overfitting for such…

Machine Learning · Statistics 2016-02-16 Yu-Xiang Wang, Jing Lei, Stephen E. Fienberg

Adaptive data analysis has posed a challenge to science due to its ability to generate false hypotheses on moderately large data sets. In general, with non-adaptive data analyses (where queries to the data are generated without being…

Methodology · Statistics 2018-09-18 Preetum Nakkiran, Jarosław Błasiok

The vast majority of the work on adaptive data analysis focuses on the case where the samples in the dataset are independent. Several approaches and tools have been successfully applied in this context, such as differential privacy,…

Machine Learning · Computer Science 2022-01-24 Aryeh Kontorovich, Menachem Sadigurschi, Uri Stemmer

Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down…

Machine Learning · Computer Science 2024-09-25 Guy Blanc

Repeated use of a data sample via adaptively chosen queries can rapidly lead to overfitting, wherein the empirical evaluation of queries on the sample significantly deviates from their mean with respect to the underlying data distribution.…

Machine Learning · Computer Science 2024-04-26 Moshe Shenfeld, Katrina Ligett

Adaptive data analysis (ADA) involves a dynamic interaction between an analyst and a dataset owner, where the analyst submits queries sequentially, adapting them based on previous answers. This process can become adversarial, as the analyst…

Human-Computer Interaction · Computer Science 2025-01-22 Amir Hossein Hadavi, Mohammad M. Mojahedian, Mohammad Reza Aref

Modern data is messy and high-dimensional, and it is often not clear a priori what are the right questions to ask. Instead, the analyst typically needs to use the data to search for interesting analyses to perform and hypotheses to test.…

Machine Learning · Statistics 2019-10-09 Daniel Russo, James Zou

Datasets are often reused to perform multiple statistical analyses in an adaptive way, in which each analysis may depend on the outcomes of previous analyses on the same dataset. Standard statistical guarantees do not account for these…

Machine Learning · Computer Science 2017-06-19 Vitaly Feldman, Thomas Steinke

Adaptive random search approaches have been shown to be effective for global optimization problems, where under certain conditions, the expected performance time increases only linearly with dimension. However, previous analyses assume that…

Optimization and Control · Mathematics 2022-03-22 David D. Linz, Zelda B. Zabinsky

In this work, we study how to use sampling to speed up mechanisms for answering adaptive queries into datasets without reducing the accuracy of those mechanisms. This is important to do when both the datasets and the number of queries asked…

Machine Learning · Computer Science 2020-01-03 Benjamin Fish, Lev Reyzin, Benjamin I. P. Rubinstein

Data-driven and adaptive control approaches face the problem of introducing sudden distributional shifts beyond the distribution of data encountered during learning. Therefore, they are prone to invalidating the very assumptions used in…

Systems and Control · Electrical Eng. & Systems 2025-08-25 Mohammad Ramadan, Evan Toler, Mihai Anitescu

Modern data workflows are inherently adaptive, repeatedly querying the same dataset to refine and validate sequential decisions, but such adaptivity can lead to overfitting and invalid statistical inference. Adaptive Data Analysis (ADA)…

Machine Learning · Computer Science 2026-02-10 Joon Suk Huh

The phenomenon of data distribution evolving over time has been observed in a range of applications, calling the needs of adaptive learning algorithms. We thus study the problem of supervised gradual domain adaptation, where labeled data…

Machine Learning · Computer Science 2022-11-15 Jing Dong, Shiji Zhou, Baoxiang Wang, Han Zhao