Related papers: RIGID: Robust Linear Regression with Missing Data
This paper proposes a fast and accurate method for sparse regression in the presence of missing data. The underlying statistical model encapsulates the low-dimensional structure of the incomplete data matrix and the sparsity of the…
Reduced rank regression (RRR) is a fundamental tool for modeling multiple responses through low-dimensional latent structures, offering both interpretability and strong predictive performance in high-dimensional settings. Classical RRR…
We put forward a simple new randomized missing data (RMD) approach to robust filtering of state-space models, motivated by the idea that the inclusion of only a small fraction of available highly precise measurements can still extract most…
Statistical analysis on compositional data has gained a lot of attention due to their great potential of applications. A feature of these data is that they are multivariate vectors that lie in the simplex, that is, the components of each…
We consider the problem of linear fitting of noisy data in the case of broad (say $\alpha$-stable) distributions of random impacts ("noise"), which can lack even the first moment. This situation, common in statistical physics of small…
We introduce a user-friendly computational framework for implementing robust versions of a wide variety of structured regression methods with the L$_{2}$ criterion. In addition to introducing an algorithm for performing L$_{2}$E regression,…
We develop a novel approach to tackle the common but challenging problem of conformal inference for missing data in machine learning, focusing on Missing at Random (MAR) data. We propose a new procedure Conformal prediction for Missing data…
In this article, we investigate the robust optimal design problem for the prediction of response when the fitted regression models are only approximately specified, and observations might be missing completely at random. The intuitive idea…
The effectiveness of supervised learning techniques has made them ubiquitous in research and practice. In high-dimensional settings, supervised learning commonly relies on dimensionality reduction to improve performance and identify the…
In our paper, we focus on robust variable selection for missing data and measurement error. Missing data and measurement errors can lead to confusing data distribution. We propose an exponential loss function with a tuning parameter to…
We consider a variant of regression problem, where the correspondence between input and output data is not available. Such shuffled data is commonly observed in many real world problems. Taking flow cytometry as an example, the measuring…
In a missing-data setting, we have a sample in which a vector of explanatory variables x_i is observed for every subject i, while scalar outcomes y_i are missing by happenstance on some individuals. In this work we propose robust estimates…
Modern technologies are producing datasets with complex intrinsic structures, and they can be naturally represented as matrices instead of vectors. To preserve the latent data structures during processing, modern regression approaches…
When training predictive models on data with missing entries, the most widely used and versatile approach is a pipeline technique where we first impute missing entries and then compute predictions. In this paper, we view prediction with…
We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least squares regression problem. We show that this…
Functional data analysis is a fast evolving branch of modern statistics and the functional linear model has become popular in recent years. However, most estimation methods for this model rely on generalized least squares procedures and…
The ubiquity of missing values in real-world datasets poses a challenge for statistical inference and can prevent similar datasets from being analyzed in the same study, precluding many existing datasets from being used for new analyses.…
A novel framework is introduced to formalize identifiability in well-specified but ill-posed linear regression models. The framework is distribution-free and accommodates highly correlated features that may or may not relate to the…
While discriminative classifiers often yield strong predictive performance, missing feature values at prediction time can still be a challenge. Classifiers may not behave as expected under certain ways of substituting the missing values,…
We investigate robust linear regression where data may be contaminated by an oblivious adversary, i.e., an adversary than may know the data distribution but is otherwise oblivious to the realizations of the data samples. This model has been…