Related papers: Efficient Adaptive Data Analysis over Dense Distri…

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

The composition of pretraining data is a key determinant of foundation models' performance, but there is no standard guideline for allocating a limited computational budget across different data sources. Most current approaches either rely…

Machine Learning · Computer Science 2024-10-16 Yiding Jiang, Allan Zhou, Zhili Feng, Sadhika Malladi, J. Zico Kolter

Over-the-Air Federated Adaptive Data Analysis: Preserving Accuracy via Opportunistic Differential Privacy

Adaptive data analysis (ADA) involves a dynamic interaction between an analyst and a dataset owner, where the analyst submits queries sequentially, adapting them based on previous answers. This process can become adversarial, as the analyst…

Human-Computer Interaction · Computer Science 2025-01-22 Amir Hossein Hadavi, Mohammad M. Mojahedian, Mohammad Reza Aref

Adaptive Data Analysis in a Balanced Adversarial Model

In adaptive data analysis, a mechanism gets $n$ i.i.d. samples from an unknown distribution $D$, and is required to provide accurate estimations to a sequence of adaptively chosen statistical queries with respect to $D$. Hardt and Ullman…

Machine Learning · Computer Science 2023-11-07 Kobbi Nissim, Uri Stemmer, Eliad Tsfadia

AAA: an Adaptive Mechanism for Locally Differential Private Mean Estimation

Local differential privacy (LDP) is a strong privacy standard that has been adopted by popular software systems. The main idea is that each individual perturbs their own data locally, and only submits the resulting noisy version to a data…

Cryptography and Security · Computer Science 2024-04-04 Fei Wei, Ergute Bao, Xiaokui Xiao, Yin Yang, Bolin Ding

Paradise of Forking Paths: Revisiting the Adaptive Data Analysis Problem

The Adaptive Data Analysis (ADA) problem, where an analyst interacts with a dataset through statistical queries, is often studied under the assumption of adversarial analyst behavior. To decrease this gap, we propose a revised model of ADA…

Methodology · Statistics 2025-01-22 Amir Hossein Hadavi, Mohammad M. Mojahedian, Mohammad Reza Aref

Optimal Algorithms for Augmented Testing of Discrete Distributions

We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing,…

Machine Learning · Computer Science 2024-12-03 Maryam Aliakbarpour, Piotr Indyk, Ronitt Rubinfeld, Sandeep Silwal

Data assimilation and online optimization with performance guarantees

This paper considers a class of real-time stochastic optimization problems dependent on an unknown probability distribution. In the considered scenario, data is streaming frequently while trying to reach a decision. Thus, we aim to devise a…

Optimization and Control · Mathematics 2020-09-08 Dan Li, Sonia Martinez

Open-World Test-Time Adaptation with Hierarchical Feature Aggregation and Attention Affine

Test-time adaptation (TTA) refers to adjusting the model during the testing phase to cope with changes in sample distribution and enhance the model's adaptability to new environments. In real-world scenarios, models often encounter samples…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Ziqiong Liu, Yushun Tang, Junyang Ji, Zhihai He

ATA: Adaptive Task Allocation for Efficient Resource Management in Distributed Machine Learning

Asynchronous methods are fundamental for parallelizing computations in distributed machine learning. They aim to accelerate training by fully utilizing all available resources. However, their greedy approach can lead to inefficiencies using…

Machine Learning · Computer Science 2025-05-23 Artavazd Maranjyan, El Mehdi Saad, Peter Richtárik, Francesco Orabona

Adaptive Learning of Aggregate Analytics under Dynamic Workloads

Large organizations have seamlessly incorporated data-driven decision making in their operations. However, as data volumes increase, expensive big data infrastructures are called to rescue. In this setting, analytics tasks become very…

Databases · Computer Science 2020-03-17 Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou

On Differential Privacy and Adaptive Data Analysis with Bounded Space

We study the space complexity of the two related fields of differential privacy and adaptive data analysis. Specifically, (1) Under standard cryptographic assumptions, we show that there exists a problem P that requires exponentially more…

Cryptography and Security · Computer Science 2023-02-14 Itai Dinur, Uri Stemmer, David P. Woodruff, Samson Zhou

Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy

We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and are differentially-private. The framework is based on active learning algorithms that are statistical in the…

Machine Learning · Computer Science 2014-11-06 Maria Florina Balcan, Vitaly Feldman

Preventing False Discovery in Interactive Data Analysis is Hard

We show that, under a standard hardness assumption, there is no computationally efficient algorithm that given $n$ samples from an unknown distribution can give valid answers to $n^{3+o(1)}$ adaptively chosen statistical queries. A…

Machine Learning · Computer Science 2014-08-08 Moritz Hardt, Jonathan Ullman

Adaptive Data Analysis for Growing Data

Reuse of data in adaptive workflows poses challenges regarding overfitting and the statistical validity of results. Previous work has demonstrated that interacting with data via differentially private algorithms can mitigate overfitting,…

Machine Learning · Computer Science 2025-11-13 Neil G. Marchant, Benjamin I. P. Rubinstein

Revisiting differentially private linear regression: optimal and adaptive prediction & estimation in unbounded domain

We revisit the problem of linear regression under a differential privacy constraint. By consolidating existing pieces in the literature, we clarify the correct dependence of the feature, label and coefficient domains in the optimization…

Machine Learning · Statistics 2018-07-10 Yu-Xiang Wang

Divide and Adapt: Active Domain Adaptation via Customized Learning

Active domain adaptation (ADA) aims to improve the model adaptation performance by incorporating active learning (AL) techniques to label a maximally-informative subset of target samples. Conventional AL methods do not consider the…

Computer Vision and Pattern Recognition · Computer Science 2023-07-24 Duojun Huang, Jichang Li, Weikai Chen, Junshi Huang, Zhenhua Chai, Guanbin Li

Local Differential Privacy for Distributed Stochastic Aggregative Optimization with Guaranteed Optimality

Distributed aggregative optimization underpins many cooperative optimization and multi-agent control systems, where each agent's objective function depends both on its local optimization variable and an aggregate of all agents' optimization…

Systems and Control · Electrical Eng. & Systems 2026-03-30 Ziqin Chen, Yongqiang Wang

An Improved Algorithm for Learning Drifting Discrete Distributions

We present a new adaptive algorithm for learning discrete distributions under distribution drift. In this setting, we observe a sequence of independent samples from a discrete distribution that is changing over time, and the goal is to…

Machine Learning · Computer Science 2024-03-11 Alessio Mazzetto

Distribution Alignment for Fully Test-Time Adaptation with Dynamic Online Data Streams

Given a model trained on source data, Test-Time Adaptation (TTA) enables adaptation and inference in test data streams with domain shifts from the source. Current methods predominantly optimize the model for each incoming test data batch…

Machine Learning · Computer Science 2024-07-18 Ziqiang Wang, Zhixiang Chi, Yanan Wu, Li Gu, Zhi Liu, Konstantinos Plataniotis, Yang Wang

Making Progress Based on False Discoveries

The study of adaptive data analysis examines how many statistical queries can be answered accurately using a fixed dataset while avoiding false discoveries (statistically inaccurate answers). In this paper, we tackle a question that…

Machine Learning · Computer Science 2023-02-09 Roi Livni