Related papers: Robustness to missing data: breakdown point analys…
We formalize notions of robustness for composite estimators via the notion of a breakdown point. A composite estimator successively applies two (or more) estimators: on data decomposed into disjoint parts, it applies the first estimator on…
In a missing-data setting, we have a sample in which a vector of explanatory variables x_i is observed for every subject i, while scalar outcomes y_i are missing by happenstance on some individuals. In this work we propose robust estimates…
We study the problem of robustly estimating the mean of a $d$-dimensional distribution given $N$ examples, where most coordinates of every example may be missing and $\varepsilon N$ examples may be arbitrarily corrupted. Assuming each…
As the issue of robustness in AI systems becomes vital, statistical learning techniques that are reliable even in presence of partly contaminated data have to be developed. Preference data, in the form of (complete) rankings in the simplest…
This paper considers an empirical likelihood inference for parameters defined by general estimating equations, when data are missing at random. The efficiency of existing estimators depends critically on correctly specifying the conditional…
While the traditional viewpoint in machine learning and statistics assumes training and testing samples come from the same population, practice belies this fiction. One strategy -- coming from robust statistics and optimization -- is thus…
The problem of robust mean estimation in high dimensions is studied, in which a certain fraction (less than half) of the datapoints can be arbitrarily corrupted. Motivated by compressive sensing, the robust mean estimation problem is…
Neural networks are an indispensable model class for many complex learning tasks. Despite the popularity and importance of neural networks and many different established techniques from literature for stabilization and robustification of…
Instance ranking problems intend to recover the true ordering of the instances in a data set with a variety of applications in for example scientific, social and financial contexts. Robust statistics studies the behaviour of estimators in…
In this article, we investigate the robust optimal design problem for the prediction of response when the fitted regression models are only approximately specified, and observations might be missing completely at random. The intuitive idea…
Study samples often differ from the target populations of inference and policy decisions in non-random ways. Researchers typically believe that such departures from random sampling -- due to changes in the population over time and space, or…
We present a robust framework to perform linear regression with missing entries in the features. By considering an elliptical data distribution, and specifically a multivariate normal model, we are able to conditionally formulate a…
Nonresponse after probability sampling is a universal challenge in survey sampling, often necessitating adjustments to mitigate sampling and selection bias simultaneously. This study explored the removal of bias and effective utilization of…
We introduce a criterion, resilience, which allows properties of a dataset (such as its mean or best low rank approximation) to be robustly computed, even in the presence of a large fraction of arbitrary additional data. Resilience is a…
We introduce new estimators for robust machine learning based on median-of-means (MOM) estimators of the mean of real valued random variables. These estimators achieve optimal rates of convergence under minimal assumptions on the dataset.…
When outcomes are missing for reasons beyond an investigator's control, there are two different ways to adjust a parameter estimate for covariates that may be related both to the outcome and to missingness. One approach is to model the…
The paper algorithmizes the problem of regime change point identification for data measured in a system exhibiting impulsive behaviors. This is a fundamental challenge for annotation of measurement data relevant, e.g., for designing…
A literature search shows that robust regression techniques are rarely used in applied econometrics. We list several misconceptions about robustness which lead to this situation. We show that most data sets are not normal, least squares…
Economic models may exhibit incompleteness depending on whether or not they admit certain policy-relevant features such as strategic interaction, self-selection, or state dependence. We develop a novel test of model incompleteness and…
For many inference problems in statistics and econometrics, the unknown parameter is identified by a set of moment conditions. A generic method of solving moment conditions is the Generalized Method of Moments (GMM). However, classical GMM…