Related papers: The Threshold Breakdown Point
The breakdown point in its different variants is one of the central notions to quantify the global robustness of a procedure. We propose a simple supplementary variant which is useful in situations where we have no obvious or only partial…
We formalize notions of robustness for composite estimators via the notion of a breakdown point. A composite estimator successively applies two (or more) estimators: on data decomposed into disjoint parts, it applies the first estimator on…
Missing data is pervasive in econometric applications, and rarely is it plausible that the data are missing (completely) at random. This paper proposes a methodology for studying the robustness of results drawn from incomplete datasets.…
The concept of statistical depth extends the notions of the median and quantiles to other statistical models. These procedures aim to formalize the idea of identifying deeply embedded fits to a model that are less influenced by…
Robust inference based on the minimization of statistical divergences has proved to be a useful alternative to classical techniques based on maximum likelihood and related methods. Basu et al. (1998) introduced the density power divergence…
The minimum density power divergence estimator (MDPDE) has gained significant attention in the literature of robust inference due to its strong robustness properties and high asymptotic efficiency; it is relatively easy to compute and can…
ML-estimation based on mixtures of Normal distributions is a widely used tool for cluster analysis. However, a single outlier can make the parameter estimation of at least one of the mixture components break down. Among others, the…
We consider the fundamental problem of matching a template to a signal. We do so by M-estimation, which encompasses procedures that are robust to gross errors (i.e., outliers). Using standard results from empirical process theory, we derive…
Contamination can severely distort an estimator unless the estimation procedure is suitably robust. This is a well-known issue and has been addressed in Robust Statistics; however, the relation of contamination and distorted variable…
We study the basic task of mean estimation in the presence of mean-shift contamination. In the mean-shift contamination model, an adversary is allowed to replace a small constant fraction of the clean samples by samples drawn from…
We revisit the classical broken sample problem: Two samples of i.i.d. data points $\mathbf{X}=\{X_1,\cdots, X_n\}$ and $\mathbf{Y}=\{Y_1,\cdots,Y_m\}$ are observed without correspondence with $m\leq n$. Under the null hypothesis,…
The M-estimators of multivariate scatter are known to have breakdown points no greater than 1/(p+1), where p is the dimension of the data. In high dimension, the breakdown points are usually considered to be disappointingly low. This paper…
It is a common contention that it is an "impossible mission" to exactly determine the minimum sample size for the estimation of a binomial parameter with prescribed margin of error and confidence level. In this paper, we investigate such…
Good robust estimators can be tuned to combine a high breakdown point and a specified asymptotic efficiency at a central model. This happens in regression with MM- and tau-estimators among others. However, the finite-sample efficiency of…
Suppose that the normal model is used for data $Y_1,\ldots,Y_n$, but that the true distribution is a t-distribution with location and scale parameters $\xi$ and $\sigma$ and $m$ degrees of freedom. The normal model corresponds to…
Study samples often differ from the target populations of inference and policy decisions in non-random ways. Researchers typically believe that such departures from random sampling -- due to changes in the population over time and space, or…
We address the problem of detection and estimation of one or two change-points in the mean of a series of random variables. We use the formalism of set estimation in regression: To each point of a design is attached a binary label that…
For a sample of Exponentially distributed durations we aim at point estimation and a confidence interval for its parameter. A duration is only observed if it has ended within a certain time interval, determined by a Uniform distribution.…
A new framework is introduced for examining and evaluating the fundamental limits of lossless data compression, that emphasizes genuinely non-asymptotic results. The sample complexity of compressing a given source is defined as the…
Change point detection is becoming increasingly popular in many application areas. On one hand, most of the theoretically-justified methods are investigated in an ideal setting without model violations, or merely robust against identical…