Related papers: Learning Individual Models for Imputation (Technic…

What's a good imputation to predict with missing values?

How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical…

Machine Learning · Statistics 2021-12-01 Marine Le Morvan , Julie Josse , Erwan Scornet , Gaël Varoquaux

Certain and Approximately Certain Models for Statistical Learning

Real-world data is often incomplete and contains missing values. To train accurate models over real-world datasets, users need to spend a substantial amount of time and resources imputing and finding proper values for missing data items. In…

Machine Learning · Statistics 2024-03-05 Cheng Zhen , Nischal Aryal , Arash Termehchy , Alireza Aghasi , Amandeep Singh Chabada

HMVI: Unifying Heterogeneous Attributes with Natural Neighbors for Missing Value Inference

Missing value imputation is a fundamental challenge in machine intelligence, heavily dependent on data completeness. Current imputation methods often handle numerical and categorical attributes independently, overlooking critical…

Machine Learning · Computer Science 2026-01-09 Xiaopeng Luo , Zexi Tan , Zhuowei Wang

Impugan: Learning Conditional Generative Models for Robust Data Imputation

Incomplete data are common in real-world applications. Sensors fail, records are inconsistent, and datasets collected from different sources often differ in scale, sampling rate, and quality. These differences create missing values that…

Machine Learning · Computer Science 2025-12-08 Zalish Mahmud , Anantaa Kotal , Aritran Piplai

Adaptive imputation of missing values for incomplete pattern classification

In classification of incomplete pattern, the missing values can either play a crucial role in the class determination, or have only little influence (or eventually none) on the classification results according to the context. We propose a…

Artificial Intelligence · Computer Science 2016-02-09 Zhun-Ga Liu , Quan Pan , Jean Dezert , Arnaud Martin

Variable Selection for Linear Regression Imputation in Surveys

Survey sampling is concerned with the estimation of finite population parameters. In practice, survey data suffer from item nonresponse, which is commonly handled through imputation, i.e., replacing missing values with predicted values. As…

Methodology · Statistics 2026-03-06 Ziming An , Mehdi Dagdoug , David Haziza

On the Relation between Prediction and Imputation Accuracy under Missing Covariates

Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research has realized an increasing trend towards the usage of modern Machine Learning algorithms for…

Machine Learning · Statistics 2022-03-23 Burim Ramosaj , Justus Tulowietzki , Markus Pauly

Imputation Uncertainty in Interpretable Machine Learning Methods

In real data, missing values occur frequently, which affects the interpretation with interpretable machine learning (IML) methods. Recent work considers bias and shows that model explanations may differ between imputation methods, while…

Machine Learning · Statistics 2025-12-22 Pegah Golchian , Marvin N. Wright

Optimized Linear Imputation

Often in real-world datasets, especially in high dimensional data, some feature values are missing. Since most data analysis and statistical methods do not handle gracefully missing values, the first step in the analysis requires the…

Machine Learning · Statistics 2016-12-08 Yehezkel S. Resheff , Daphna Weinshall

Imputations for High Missing Rate Data in Covariates via Semi-supervised Learning Approach

Advancements in data collection techniques and the heterogeneity of data resources can yield high percentages of missing observations on variables, such as block-wise missing data. Under missing-data scenarios, traditional methods such as…

Methodology · Statistics 2022-05-17 Wei Lan , Xuerong Chen , Tao Zou , Chih-Ling Tsai

Multi-environment Invariance Learning with Missing Data

Learning models that can handle distribution shifts is a key challenge in domain generalization. Invariance learning, an approach that focuses on identifying features invariant across environments, improves model generalization by capturing…

Machine Learning · Statistics 2026-05-11 Yiran Jia , Jelena Bradic

Variational Bayesian Multiple Imputation in High-Dimensional Regression Models With Missing Responses

Multiple imputation has become one of the standard methods in drawing inferences in many incomplete data applications. Applications of multiple imputation in relatively more complex settings, such as high-dimensional clustered data, require…

Methodology · Statistics 2025-04-08 Qiushuang Li , Recai Yucel

HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

Consider the problem of imputing missing values in a dataset. One the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer…

Machine Learning · Statistics 2022-06-17 Daniel Jarrett , Bogdan Cebere , Tennison Liu , Alicia Curth , Mihaela van der Schaar

On the consistency of supervised learning with missing values

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics 2024-03-22 Julie Josse , Jacob M. Chen , Nicolas Prost , Erwan Scornet , Gaël Varoquaux

Missing Features Reconstruction and Its Impact on Classification Accuracy

In real-world applications, we can encounter situations when a well-trained model has to be used to predict from a damaged dataset. The damage caused by missing or corrupted values can be either on the level of individual instances or on…

Machine Learning · Computer Science 2019-11-12 Magda Friedjungová , Daniel Vašata , Marcel Jiřina

Nearest Neighbor Imputation for Categorical Data by Weighting of Attributes

Missing values are a common phenomenon in all areas of applied research. While various imputation methods are available for metrically scaled variables, methods for categorical data are scarce. An imputation method that has been shown to…

Methodology · Statistics 2017-10-04 Shahla Faisal , Gerhard Tutz

Selective Sampling and Imitation Learning via Online Regression

We consider the problem of Imitation Learning (IL) by actively querying noisy expert for feedback. While imitation learning has been empirically successful, much of prior work assumes access to noiseless expert feedback which is not…

Machine Learning · Computer Science 2023-07-12 Ayush Sekhari , Karthik Sridharan , Wen Sun , Runzhe Wu

Learning Planar Ising Models

Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models,…

Machine Learning · Statistics 2015-02-04 Jason K. Johnson , Diane Oyen , Michael Chertkov , Praneeth Netrapalli

Classification of datasets with imputed missing values: does imputation quality matter?

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

Machine Learning · Computer Science 2023-12-20 Tolou Shadbahr , Michael Roberts , Jan Stanczuk , Julian Gilbey , Philip Teare , Sören Dittmer , Matthew Thorpe , Ramon Vinas Torne , Evis Sala , Pietro Lio , Mishal Patel , AIX-COVNET Collaboration , James H. F. Rudd , Tuomas Mirtti , Antti Rannikko , John A. D. Aston , Jing Tang , Carola-Bibiane Schönlieb

Imputation for prediction: beware of diminishing returns

Missing values are prevalent across various fields, posing challenges for training and deploying predictive models. In this context, imputation is a common practice, driven by the hope that accurate imputations will enhance predictions.…

Artificial Intelligence · Computer Science 2025-02-21 Marine Le Morvan , Gaël Varoquaux