Related papers: An Imputation-Consistency Algorithm for High-Dimen…

High Dimensional Expectation-Maximization Algorithm: Statistical Optimization and Asymptotic Normality

We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models. In particular, we make two contributions: (i) For parameter estimation, we propose a novel high dimensional EM…

Machine Learning · Statistics 2015-01-28 Zhaoran Wang, Quanquan Gu, Yang Ning, Han Liu

Missing Value Imputation for Mixed Data via Gaussian Copula

Missing data imputation forms the first critical step of many data analysis pipelines. The challenge is greatest for mixed data sets, including real, Boolean, and ordinal data, where standard techniques for imputation fail basic sanity…

Methodology · Statistics 2020-06-17 Yuxuan Zhao, Madeleine Udell

High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity

Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these…

Statistics Theory · Mathematics 2015-03-19 Po-Ling Loh, Martin J. Wainwright

A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data

This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when…

Machine Learning · Statistics 2023-05-23 Florian Mouret, Alexandre Hippert-Ferrer, Frédéric Pascal, Jean-Yves Tourneret

Pattern Alternating Maximization Algorithm for Missing Data in Large P, Small N Problems

We propose a new and computationally efficient algorithm for maximizing the observed log-likelihood for a multivariate normal data matrix with missing values. We show that our procedure based on iteratively regressing the missing on the…

Methodology · Statistics 2012-11-21 Nicolas Städler, Daniel J. Stekhoven, Peter Bühlmann

Estimation and imputation of missing data in longitudinal models with Zero-Inflated Poisson response variable

This research deals with the estimation and imputation of missing data in longitudinal models with a Poisson response variable inflated with zeros. A methodology is proposed that is based on the use of maximum likelihood, assuming that data…

Methodology · Statistics 2024-09-18 D. S. Martinez-Lobo, O. O. Melo, N. A. Cruz

DPER: Efficient Parameter Estimation for Randomly Missing Data

The missing data problem has been broadly studied in the last few decades and has various applications in different areas such as statistics or bioinformatics. Even though many methods have been developed to tackle this challenge, most of…

Machine Learning · Statistics 2021-06-10 Thu Nguyen, Khoi Minh Nguyen-Duy, Duy Ho Minh Nguyen, Binh T. Nguyen, Bruce Alan Wade

Multiple Imputation with Neural Network Gaussian Process for High-dimensional Incomplete Data

Missing data are ubiquitous in real-world applications and, if not adequately handled, may lead to the loss of information and biased findings in downstream analysis. Particularly, high-dimensional incomplete data with a moderate sample…

Machine Learning · Computer Science 2022-12-23 Zongyu Dai, Zhiqi Bu, Qi Long

Imputation for High-Dimensional Linear Regression

We study high-dimensional regression with missing entries in the covariates. A common strategy in practice is to impute the missing entries with an appropriate substitute and then implement a standard statistical procedure acting as…

Statistics Theory · Mathematics 2020-01-28 Kabir Aladin Chandrasekher, Ahmed El Alaoui, Andrea Montanari

EMFlow: Data Imputation in Latent Space via EM and Deep Flow Models

The presence of missing values within high-dimensional data is a ubiquitous problem for many applied sciences. A serious limitation of many available data mining and machine learning methods is their inability to handle partially missing…

Machine Learning · Computer Science 2022-08-02 Qi Ma, Sujit K. Ghosh

Inference of stochastic time series with missing data

Inferring dynamics from time series is an important objective in data analysis. In particular, it is challenging to infer stochastic dynamics given incomplete data. We propose an expectation maximization (EM) algorithm that iterates between…

Data Analysis, Statistics and Probability · Physics 2021-08-25 Sangwon Lee, Vipul Periwal, Junghyo Jo

Multiple Imputation Method for High-Dimensional Neuroimaging Data

Missingness is a common issue for neuroimaging data, and neglecting it in downstream statistical analysis can introduce bias and lead to misguided inferential conclusions. It is therefore crucial to conduct appropriate statistical methods…

Methodology · Statistics 2025-03-25 Tong Lu, Chixiang Chen, Hsin-Hsiung Huang, Peter Kochunov, Elliot Hong, Shuo Chen

Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

Missing data theory deals with the statistical methods in the occurrence of missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Methodology · Statistics 2021-10-26 Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

Calibrated imputation of numerical data under linear edit restrictions

A common problem faced by statistical institutes is that data may be missing from collected data sets. The typical way to overcome this problem is to impute the missing data. The problem of imputing missing data is complicated by the fact…

Applications · Statistics 2014-01-09 Jeroen Pannekoek, Natalie Shlomo, Ton De Waal

Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

In this paper, we examine the problem of missing data in high-dimensional datasets by taking into consideration the Missing Completely at Random and Missing at Random mechanisms, as well as the Arbitrary missing pattern. Additionally, this…

Artificial Intelligence · Computer Science 2016-07-04 Collins Leke, Tshilidzi Marwala

Explainable Data Imputation using Constraints

Data values in a dataset can be missing or anomalous due to mishandling or human error. Analysing data with missing values can create bias and affect the inferences. Several analysis methods, such as principal components analysis or…

Artificial Intelligence · Computer Science 2022-05-11 Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal

Geometry of EM and related iterative algorithms

The Expectation-Maximization (EM) algorithm is a simple meta-algorithm that has been used for many years as a methodology for statistical inference when there are missing measurements in the observed data or when the data is composed of…

Machine Learning · Statistics 2022-11-15 Hideitsu Hino, Shotaro Akaho, Noboru Murata

High-dimensional estimation with missing data: Statistical and computational limits

We consider computationally-efficient estimation of population parameters when observations are subject to missing data. In particular, we consider estimation under the realizable contamination model of missing data in which an $\epsilon$…

Statistics Theory · Mathematics 2026-03-18 Kabir Aladin Verchand, Ankit Pensia, Saminul Haque, Rohith Kuditipudi

Missing Data: A Comparison of Neural Network and Expectation Maximisation Techniques

The estimation of missing input vector elements in real-time processing applications requires a system that possesses the knowledge of certain characteristics such as correlations between variables, which are inherent in the input space.…

Applications · Statistics 2007-05-23 Fulufhelo V. Nelwamondo, Shakir Mohamed, Tshilidzi Marwala

EPEM: Efficient Parameter Estimation for Multiple Class Monotone Missing Data

The problem of monotone missing data has been broadly studied during the last two decades and has many applications in different fields such as bioinformatics or statistics. Commonly used imputation techniques require multiple iterations…

Machine Learning · Computer Science 2020-09-25 Thu Nguyen, Duy H. M. Nguyen, Huy Nguyen, Binh T. Nguyen, Bruce A. Wade