Related papers: NeuMiss networks: differentiable programming for s…

200 papers

How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical…

Machine Learning · Statistics 2021-12-01 Marine Le Morvan, Julie Josse, Erwan Scornet, Gaël Varoquaux

Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...). In fact, the very nature of missing values usually…

Machine Learning · Statistics 2022-02-04 Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet

Conditions ensuring optimal parameter estimation in the presence of missing data are well established in inference, typically relying on the Missing-at-Random (MAR) assumption. In prediction, similar principles are often assumed to apply.…

Methodology · Statistics 2026-03-19 Pierre Catoire, Robin Genuer, Cecile Proust-Lima

Missing data are a common problem for both the construction and implementation of a prediction algorithm. Pattern mixture kernel submodels (PMKS) - a series of submodels for every missing data pattern that are fit using only data from that…

Methodology · Statistics 2017-04-27 Sarah Fletcher Mercaldo, Jeffrey D. Blume

To analyze unstructured data (text, images, audio, video), economists typically first extract low-dimensional structured features with a neural network. Neural networks do not make generically unbiased predictions, and biases will propagate…

Econometrics · Economics 2026-02-20 Jacob Carlson, Melissa Dell

Missing data is a ubiquitous challenge in data analysis, often leading to biased and inaccurate results. Traditional imputation methods usually assume that the missingness mechanism is missing-at-random (MAR), where the missingness is…

Methodology · Statistics 2026-03-30 Huiming Xie, Fei Xue, Xiao Wang

We investigate methods for parameter learning from incomplete data that is not missing at random. Likelihood-based methods then require the optimization of a profile likelihood that takes all possible missingness mechanisms into account.…

Methodology · Statistics 2012-07-02 Manfred Jaeger

We consider building predictors when the data have missing values. We study the seemingly-simple case where the target to predict is a linear function of the fully-observed data and we show that, in the presence of missing values, the…

Machine Learning · Computer Science 2020-07-02 Marine Le Morvan, Nicolas Prost, Julie Josse, Erwan Scornet, Gaël Varoquaux

Conducting valid statistical analyses is challenging in the presence of missing-not-at-random (MNAR) data, where the missingness mechanism is dependent on the missing values themselves even conditioned on the observed data. Here, we…

Methodology · Statistics 2023-06-13 Anna Guo, Jiwei Zhao, Razieh Nabi

Missing data frequently arises across diverse domains, including time-series and image domains. In the real world, missing occurrences often depend on the unobservable values themselves, which are referred to as Missing Not at Random…

Machine Learning · Computer Science 2026-05-26 Gyuwon Sim, Sumin Lee, Heesun Bae, Byeonghu Na, Doyun Kwon, Ju-Hee Hwang, Jae-Young Lim, Il-Chul Moon

Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels has been proposed as…

Machine Learning · Computer Science 2023-11-27 Lena Stempfle, Ashkan Panahi, Fredrik D. Johansson

Matrix completion is often applied to data with entries missing not at random (MNAR). For example, consider a recommendation system where users tend to only reveal ratings for items they like. In this case, a matrix completion method that…

Machine Learning · Statistics 2019-10-30 Wei Ma, George H. Chen

Handling missing node features is a key challenge for deploying Graph Neural Networks (GNNs) in real-world domains such as healthcare and sensor networks. Existing studies mostly address relatively benign scenarios, namely benchmark…

Data with missing values is ubiquitous in many applications. Recent years have witnessed increasing attention on prediction with only incomplete data consisting of observed features and a mask that indicates the missing pattern. Existing…

Machine Learning · Computer Science 2023-05-22 Yichen Zhu, Jian Yuan, Bo Jiang, Tao Lin, Haiming Jin, Xinbing Wang, Chenghu Zhou

In the missing data literature, the Maximum Likelihood Estimator (MLE) is celebrated for its ignorability property under missing at random (MAR) data. However, its sensitivity to misspecification of the (complete) data model, even under…

Methodology · Statistics 2025-09-23 Badr-Eddine Chérief-Abdellatif, Jeffrey Näf

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics 2024-03-22 Julie Josse, Jacob M. Chen, Nicolas Prost, Erwan Scornet, Gaël Varoquaux

Advances in artificial intelligence (AI) and deep learning have led to neural networks being used to generate lightning-speed answers to complex science questions, paintings in the style of Monet, or stories like those of Twain. Leveraging…

Methodology · Statistics 2026-02-25 Matthew Sainsbury-Dale, Andrew Zammit-Mangion, Noel Cressie, Raphaël Huser

The estimation of missing input vector elements in real time processing applications requires a system that possesses the knowledge of certain characteristics such as correlations between variables, which are inherent in the input space.…

Applications · Statistics 2007-05-23 Fulufhelo V. Nelwamondo, Shakir Mohamed, Tshilidzi Marwala

While data are the primary fuel for machine learning models, they often suffer from missing values, especially when collected in real-world scenarios. However, many off-the-shelf machine learning models, including artificial neural network…

We propose an efficient family of algorithms to learn the parameters of a Bayesian network from incomplete data. In contrast to textbook approaches such as EM and the gradient method, our approach is non-iterative, yields closed form…

Machine Learning · Computer Science 2014-11-26 Guy Van den Broeck, Karthika Mohan, Arthur Choi, Judea Pearl