Related papers: Conformal Inference For Missing Data under Multipl…
Missing data are inevitable in longitudinal studies. Traditional methods, such as the full information maximum likelihood (FIML), are commonly used to handle ignorable missing data. However, they may lead to biased model estimation due to…
Missing data is an universal problem in statistics. We develop a unified framework for estimating parameters defined by general estimating equations under a missing-at-random (MAR) mechanism, based on generalized entropy calibration…
In the presence of a missing response, reweighting the complete case subsample by the inverse of nonmissing probability is both intuitive and easy to implement. When the population totals of some auxiliary variables are known and when the…
Pre-trained machine learning (ML) predictions have been increasingly used to complement incomplete data to enable downstream scientific inquiries, but their naive integration risks biased inferences. Recently, multiple methods have been…
Missing data can pose a challenge for machine learning (ML) modeling. To address this, current approaches are categorized into feature imputation and label prediction and are primarily focused on handling missing data to enhance ML…
Conformal unlearning aims to ensure that a trained conformal predictor miscovers data points with specific shared characteristics, such as those from a particular label class, associated with a specific user, or belonging to a defined…
In the missing data literature, the Maximum Likelihood Estimator (MLE) is celebrated for its ignorability property under missing at random (MAR) data. However, its sensitivity to misspecification of the (complete) data model, even under…
We present a robust framework to perform linear regression with missing entries in the features. By considering an elliptical data distribution, and specifically a multivariate normal model, we are able to conditionally formulate a…
The missing data problem has been broadly studied in the last few decades and has various applications in different areas such as statistics or bioinformatics. Even though many methods have been developed to tackle this challenge, most of…
Missing data in supervised learning is well-studied, but the specific issue of missing labels during model evaluation has been overlooked. Ignoring samples with missing values, a common solution, can introduce bias, especially when data is…
The use of flexible machine-learning (ML) models to generate imputations of missing data within the framework of Multiple Imputation (MI) has recently gained traction, particularly in observational settings. For randomised controlled trials…
We propose an l1-regularized likelihood method for estimating the inverse covariance matrix in the high-dimensional multivariate normal model in presence of missing data. Our method is based on the assumption that the data are missing at…
Efficient estimation methods for simultaneous autoregressive (SAR) models with missing data in the response variable have been well-explored in the literature. A common practice is to introduce measurement error into SAR models to separate…
Quantifying the data uncertainty in learning tasks is often done by learning a prediction interval or prediction set of the label given the input. Two commonly desired properties for learned prediction sets are \emph{valid coverage} and…
Multimodal models trained on complete modality data often exhibit a substantial decrease in performance when faced with imperfect data containing corruptions or missing modalities. To address this robustness challenge, prior methods have…
Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored…
Matrix completion aims to estimate missing entries in a data matrix, using the assumption of a low-complexity structure (e.g., low rank) so that imputation is possible. While many effective estimation algorithms exist in the literature,…
Conformal inference provides a rigorous statistical framework for uncertainty quantification in machine learning, enabling well-calibrated prediction sets with precise coverage guarantees for any classification model. However, its reliance…
In a missing-data setting, we have a sample in which a vector of explanatory variables x_i is observed for every subject i, while scalar outcomes y_i are missing by happenstance on some individuals. In this work we propose robust estimates…
Conformal prediction is a theoretically grounded framework for constructing predictive intervals. We study conformal prediction with missing values in the covariates -- a setting that brings new challenges to uncertainty quantification. We…