Related papers: Robust Kernel Hypothesis Testing under Data Corrup…
The problem of robust hypothesis testing is studied, where under the null and the alternative hypotheses, the data-generating distributions are assumed to be in some uncertainty sets, and the goal is to design a test that performs well…
Recent years have witnessed growing concerns about the privacy of sensitive data. In response to these concerns, differential privacy has emerged as a rigorous framework for privacy protection, gaining widespread recognition in both…
This paper provides a unifying view of optimal kernel hypothesis testing across the MMD two-sample, HSIC independence, and KSD goodness-of-fit frameworks. Minimax optimal separation rates in the kernel and $L^2$ metrics are presented, with…
Permutation tests are widely used in statistics, providing a finite-sample guarantee on the type I error rate whenever the distribution of the samples under the null hypothesis is invariant to some rearrangement. Despite its increasing…
We consider a general statistical learning problem where an unknown fraction of the training data is corrupted. We develop a robust learning method that only requires specifying an upper bound on the corrupted data fraction. The method…
Kernel two-sample tests have been widely used, and the development of efficient methods for high-dimensional, large-scale data is receiving increasing attention in the big data era. However, existing methods, such as the maximum mean…
We develop a novel computationally efficient and general framework for robust hypothesis testing. The new framework features a new way to construct uncertainty sets under the null and the alternative distributions, which are sets centered…
The problem of robust binary hypothesis testing is studied. Under both hypotheses, the data-generating distributions are assumed to belong to uncertainty sets constructed through moments; in particular, the sets contain distributions whose…
We study the problem of robust estimation under heterogeneous corruption rates, where each sample may be independently corrupted with a known but non-identical probability. This setting arises naturally in distributed and federated…
We consider a data-driven robust hypothesis test where the optimal test will minimize the worst-case performance regarding distributions that are close to the empirical distributions with respect to the Wasserstein distance. This leads to a…
Permutation tests are widely recognized as robust alternatives to tests based on normal theory. Random permutation tests have been frequently employed to assess the significance of variables in linear models. Despite their widespread use,…
Model misspecification can create significant challenges for the implementation of probabilistic models, and this has led to the development of a range of robust methods that directly account for this issue. However, whether these more…
In today's era of big data, robust least-squares regression becomes a more challenging problem when adversarial corruption is considered alongside the explosive growth of datasets. Traditional robust methods can handle the noise but suffer…
Invariance to a broad array of image corruptions, such as warping, noise, or color shifts, is an important aspect of building robust models in computer vision. Recently, several new data augmentations have been proposed that significantly…
Numerous modern optimization and machine learning algorithms rely on subgradient information being trustworthy and hence, they may fail to converge when such information is corrupted. In this paper, we consider the setting where subgradient…
Kernel-based tests provide a simple yet effective framework that uses the theory of reproducing kernel Hilbert spaces to design non-parametric testing procedures. In this paper, we propose new theoretical tools that can be used to study the…
In the problem of composite hypothesis testing, identifying the potential uniformly most powerful (UMP) unbiased test is of great interest. Beyond typical hypothesis settings with exponential family, it is usually challenging to prove the…
We study the problem of robust mean estimation and introduce a novel Hamming distance-based measure of distribution shift for coordinate-level corruptions. We show that this measure yields adversary models that capture more realistic…
P-hacking poses challenges to traditional hypothesis testing. In this paper, we propose a robust method for the one-sample significance test that can protect against p-hacking from sample manipulation. Precisely, assuming a sequential…
We present a new framework to address the non-convex robust hypothesis testing problem, wherein the goal is to seek the optimal detector that minimizes the maximum of worst-case type-I and type-II risk functions. The distributional…