Related papers: In-Database Data Imputation

200 papers

Multiple imputation (MI) is a popular approach for dealing with missing data arising from non-response in sample surveys. Multiple imputation by chained equations (MICE) is one of the most widely used MI algorithms for multivariate data,…

Machine Learning · Computer Science 2022-03-22 Zhenhua Wang, Olanrewaju Akande, Jason Poulos, Fan Li

Missing data is a prevalent issue in many applications, including large medical registries such as the Swedish Healthcare Quality Registries, potentially leading to biased or inefficient analyses if not handled properly. Multiple Imputation…

Missing data remains a very common problem in large datasets, including survey and census data containing many ordinal responses, such as political polls and opinion surveys. Multiple imputation (MI) is usually the go-to approach for…

Methodology · Statistics 2024-12-25 Chayut Wongkamthong, Olanrewaju Akande

Missing data are ubiquitous in empirical databases, yet statistical analyses typically require complete data matrices. Multiple imputation offers a principled solution for filling these gaps. This study evaluates the performance of several…

Computation · Statistics 2026-02-05 Enzo Porto Brasil

Missing values in tabular data restrict the use and performance of machine learning, requiring the imputation of missing values. The most popular imputation algorithm is arguably multiple imputations using chains of equations (MICE), which…

Machine Learning · Computer Science 2022-03-01 Manar D Samad, Sakib Abrar, Norou Diawara

Missing data theory deals with statistical methods for handling missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most of the statistical theory assumes that data…

Multiple imputation by chained equations (MICE) has emerged as a popular approach for handling missing data. A central challenge for applying MICE is determining how to incorporate outcome information into covariate imputation models,…

Methodology · Statistics 2019-10-11 Lauren Beesley, Jeremy M G Taylor

Healthcare time series data is vital for monitoring patient activity but often contains noise and missing values due to various reasons such as sensor errors or data interruptions. Imputation, i.e., filling in the missing values, is a…

Machine Learning · Computer Science 2024-12-17 Lien P. Le, Xuan-Hien Nguyen Thi, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

Healthcare data, particularly in critical care settings, presents three key challenges for analysis. First, physiological measurements come from different sources but are inherently related. Yet, traditional methods often treat each…

Applications · Statistics 2025-12-01 Ali Akbar Septiandri, Deyu Ming, F. Alejandro DiazDelaO, Takoua Jendoubi, Samiran Ray

A common approach for handling missing values in data analysis pipelines is multiple imputation via software packages such as MICE (Van Buuren and Groothuis-Oudshoorn, 2011) and Amelia (Honaker et al., 2011). These packages typically assume…

Methodology · Statistics 2025-07-23 Trung Phung, Kyle Reese, Ilya Shpitser, Rohit Bhattacharya

Time-series analysis is often affected by missing data, a common problem across several fields, including healthcare and environmental monitoring. Multiple Imputation by Chained Equations (MICE) has been prominent for imputing missing…

Machine Learning · Statistics 2026-04-10 Amuche Ibenegbu, Pierre Lafaye de Micheaux, Rohitash Chandra

Missing data arises when certain values are not recorded or observed for variables of interest. However, most statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps…

Missing values or data is one popular characteristic of real-world datasets, especially healthcare data. This could be frustrating when using machine learning algorithms on such datasets, simply because most machine learning models perform…

Machine Learning · Computer Science 2024-03-25 Luke Oluwaseye Joel, Wesley Doorsamy, Babu Sena Paul

Missing data represents a fundamental challenge in machine learning applications, often reducing model performance and reliability. This problem is particularly acute in fields like bioinformatics and clinical machine learning, where…

Machine Learning · Computer Science 2025-09-04 Fatemeh Azad, Zoran Bosnić, Matjaž Kukar

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring.…

Machine Learning · Statistics 2026-05-12 Jicong Fan

In the last few years, clinical decision support using data mining techniques has offered a more intelligent way to reduce decision error. However, clinical datasets often suffer from high missingness, which adversely impacts the quality of…

Machine Learning · Computer Science 2020-11-20 Xuetong Wu, Hadi Akbarzadeh Khorshidi, Uwe Aickelin, Zobaida Edib, Michelle Peate

Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. While many studies compare imputation approaches, they…

Computation · Statistics 2025-11-10 Krystyna Grzesiak, Christophe Muller, Julie Josse, Jeffrey Näf

Missing data present challenges in data analysis. Naive analyses such as complete-case and available-case analysis may introduce bias and loss of efficiency, and produce unreliable results. Multiple imputation (MI) is one of the most widely…

Methodology · Statistics 2019-05-15 Domonique W. Hodge, Sandra E. Safo, Qi Long

When data have a hierarchical structure, such as students nested within classrooms, ignoring dependencies between observations can compromise the validity of imputation procedures. Standard tree-based imputation methods implicitly assume…

Applications · Statistics 2025-03-21 Nico Föge, Jakob Schwerter, Ketevan Gurtskaia, Markus Pauly, Philipp Doebler

Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the…

Methodology · Statistics 2022-06-27 Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin