Related papers: Computing Robust Leverage Diagnostics when the Des…
With rapid advances in information technology, massive datasets are collected in all fields of science, such as biology, chemistry, and social science. Useful or meaningful information is extracted from these data often through statistical…
The dependency structure of multivariate data can be analyzed using the covariance matrix $\Sigma$. In many fields the precision matrix $\Sigma^{-1}$ is even more informative. As the sample covariance estimator is singular in…
Penalized logistic regression is extremely useful for binary classification with large number of covariates (higher than the sample size), having several real life applications, including genomic disease classification. However, the…
Cellwise outliers are widespread in data and traditional robust methods may fail when applied to datasets under such contamination. We propose a variable selection procedure, that uses a pairwise robust estimator to obtain an initial…
A robust estimator is proposed for the parameters that characterize the linear regression problem. It is based on the notion of shrinkages, often used in Finance and previously studied for outlier detection in multivariate data. A thorough…
Datasets containing both categorical and continuous variables are frequently encountered in many areas, and with the rapid development of modern measurement technologies, the dimensions of these variables can be very high. Despite the…
The robustness of classifiers has become a question of paramount importance in the past few years. Indeed, it has been shown that state-of-the-art deep learning architectures can easily be fooled with imperceptible changes to their inputs.…
High-dimensional data subject to heavy-tailed phenomena and heterogeneity are commonly encountered in various scientific fields and bring new challenges to the classical statistical methods. In this paper, we combine the asymmetric square…
This paper tackles the problem of robust covariance matrix estimation when the data is incomplete. Classical statistical estimation methodologies are usually built upon the Gaussian assumption, whereas existing robust estimation ones assume…
Among semiparametric regression models, partially linear additive models provide a useful tool to include additive nonparametric components as well as a parametric component, when explaining the relationship between the response and a set…
Errors-in-variables is a long-standing, difficult issue in linear regression; and progress depends in part on new identifying assumptions. I characterize measurement error as bad-leverage points and assume that fewer than half the sample…
Estimating covariance matrices with high-dimensional complex data presents significant challenges, particularly concerning positive definiteness, sparsity, and numerical stability. Existing robust sparse estimators often fail to guarantee…
A robust estimator for a wide family of mixtures of linear regression is presented. Robustness is based on the joint adoption of the Cluster Weighted Model and of an estimator based on trimming and restrictions. The selected model provides…
The parameters of the log-logistic distribution are generally estimated based on classical methods such as maximum likelihood estimation, whereas these methods usually result in severe biased estimates when the data contain outliers. In…
This paper addresses the problem of providing robust estimators under a functional logistic regression model. Logistic regression is a popular tool in classification problems with two populations. As in functional linear regression,…
The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the…
We propose a robust variable selection procedure using a divergence based M-estimator combined with a penalty function. It produces robust estimates of the regression parameters and simultaneously selects the important explanatory…
We propose a residual randomization procedure designed for robust Lasso-based inference in the high-dimensional setting. Compared to earlier work that focuses on sub-Gaussian errors, the proposed procedure is designed to work robustly in…
Robust design is one of the main tools employed by engineers for the facilitation of the design of high-quality processes. However, most real-world processes invariably contend with external uncontrollable factors, often denoted as outliers…
Recent advances in deep learning have achieved impressive gains in classification accuracy on a variety of types of data, including images and text. Despite these gains, however, concerns have been raised about the calibration, robustness,…