English
Related papers

Related papers: Challenges in Bayesian Adaptive Data Analysis

200 papers

Most work on adaptive data analysis assumes that samples in the dataset are independent. When correlations are allowed, even the non-adaptive setting can become intractable, unless some structural constraints are imposed. To address this,…

Data Structures and Algorithms · Computer Science 2025-11-13 Emma Rapoport , Edith Cohen , Uri Stemmer

Reuse of data in adaptive workflows poses challenges regarding overfitting and the statistical validity of results. Previous work has demonstrated that interacting with data via differentially private algorithms can mitigate overfitting,…

Machine Learning · Computer Science 2025-11-13 Neil G. Marchant , Benjamin I. P. Rubinstein

Repeated use of a data sample via adaptively chosen queries can rapidly lead to overfitting, wherein the empirical evaluation of queries on the sample significantly deviates from their mean with respect to the underlying data distribution.…

Machine Learning · Computer Science 2024-04-26 Moshe Shenfeld , Katrina Ligett

The new field of adaptive data analysis seeks to provide algorithms and provable guarantees for models of machine learning that allow researchers to reuse their data, which normally falls outside of the usual statistical paradigm of static…

Machine Learning · Computer Science 2017-03-22 Sam Elder

Adaptivity is an important feature of data analysis---the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model,…

Machine Learning · Computer Science 2015-11-10 Raef Bassily , Kobbi Nissim , Adam Smith , Thomas Steinke , Uri Stemmer , Jonathan Ullman

In adaptive data analysis, a mechanism gets $n$ i.i.d. samples from an unknown distribution $D$, and is required to provide accurate estimations to a sequence of adaptively chosen statistical queries with respect to $D$. Hardt and Ullman…

Machine Learning · Computer Science 2023-11-07 Kobbi Nissim , Uri Stemmer , Eliad Tsfadia

Adaptivity is an important feature of data analysis---typically the choice of questions asked about a dataset depends on previous interactions with the same dataset. However, generalization error is typically bounded in a non-adaptive…

Machine Learning · Computer Science 2015-11-11 Raef Bassily , Adam Smith , Thomas Steinke , Jonathan Ullman

The advent of modern technology, permitting the measurement of thousands of characteristics simultaneously, has given rise to floods of data characterized by many large or even huge datasets. This new paradigm presents extraordinary…

Methodology · Statistics 2019-02-14 A. M. Pires , J. A. Branco

Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down…

Machine Learning · Computer Science 2024-09-25 Guy Blanc

Approximate Bayesian computation performs approximate inference for models where likelihood computations are expensive or impossible. Instead simulations from the model are performed for various parameter values and accepted if they are…

Computation · Statistics 2015-12-16 Dennis Prangle

It is not always clear how to adjust for control data in causal inference, balancing the goals of reducing bias and variance. We show how, in a setting with repeated experiments, Bayesian hierarchical modeling yields an adaptive procedure…

Methodology · Statistics 2025-01-23 Andrew Gelman , Matthijs Vákár

Adaptive data analysis has posed a challenge to science due to its ability to generate false hypotheses on moderately large data sets. In general, with non-adaptive data analyses (where queries to the data are generated without being…

Methodology · Statistics 2018-09-18 Preetum Nakkiran , Jarosław Błasiok

We propose a novel technique for analyzing adaptive sampling called the {\em Simulator}. Our approach differs from the existing methods by considering not how much information could be gathered by any fixed sampling strategy, but how…

Machine Learning · Computer Science 2023-04-25 Max Simchowitz , Kevin Jamieson , Benjamin Recht

Sampling is a fundamental problem in computer science and statistics. However, for a given task and stream, it is often not possible to choose good sampling probabilities in advance. We derive a general framework for adaptively changing the…

Machine Learning · Statistics 2022-06-16 Daniel Ting

Data Science and Machine learning have been growing strong for the past decade. We argue that to make the most of this exciting field we should resist the temptation of assuming that forecasting can be reduced to brute-force data analytics.…

Artificial Intelligence · Computer Science 2020-05-12 Hykel Hosni , Angelo Vulpiani

We show that, under a standard hardness assumption, there is no computationally efficient algorithm that given $n$ samples from an unknown distribution can give valid answers to $n^{3+o(1)}$ adaptively chosen statistical queries. A…

Machine Learning · Computer Science 2014-08-08 Moritz Hardt , Jonathan Ullman

Adaptive Bayesian quadrature (ABQ) is a powerful approach to numerical integration that empirically compares favorably with Monte Carlo integration on problems of medium dimensionality (where non-adaptive quadrature is not competitive). Its…

Machine Learning · Statistics 2019-10-29 Motonobu Kanagawa , Philipp Hennig

The process of calibrating computer models of natural phenomena is essential for applications in the physical sciences, where plenty of domain knowledge can be embedded into simulations and then calibrated against real observations. Current…

Machine Learning · Computer Science 2025-01-20 Rafael Oliveira , Dino Sejdinovic , David Howard , Edwin V. Bonilla

In adaptive data analysis, the user makes a sequence of queries on the data, where at each step the choice of query may depend on the results in previous steps. The releases are often randomized in order to reduce overfitting for such…

Machine Learning · Statistics 2016-02-16 Yu-Xiang Wang , Jing Lei , Stephen E. Fienberg

Data assimilation refers to the problem of finding trajectories of a prescribed dynamical model in such a way that the output of the model (usually some function of the model states) follows a given time series of observations. Typically…

Atmospheric and Oceanic Physics · Physics 2015-05-30 Jochen Bröcker , Ivan G. Szendro
‹ Prev 1 2 3 10 Next ›