English
Related papers

Related papers: A computational study on imputation methods for mi…

200 papers

Missing values or data is one popular characteristic of real-world datasets, especially healthcare data. This could be frustrating when using machine learning algorithms on such datasets, simply because most machine learning models perform…

Machine Learning · Computer Science 2024-03-25 Luke Oluwaseye Joel , Wesley Doorsamy , Babu Sena Paul

Modern data acquisition based on high-throughput technology is often facing the problem of missing data. Algorithms commonly used in the analysis of such large-scale data often depend on a complete set. Missing value imputation offers a…

Applications · Statistics 2014-06-03 Daniel J. Stekhoven , Peter Bühlmann

Random forest (RF) missing data algorithms are an attractive approach for dealing with missing data. They have the desirable properties of being able to handle mixed types of missing data, they are adaptive to interactions and nonlinearity,…

Machine Learning · Statistics 2017-01-23 Fei Tang , Hemant Ishwaran

Missing data is a common problem which has consistently plagued statisticians and applied analytical researchers. While replacement methods like mean-based or hot deck imputation have been well researched, emerging imputation techniques…

Methodology · Statistics 2022-12-27 Seema Sangari , Herman E. Ray

Dealing with missing data is an important problem in statistical analysis that is often addressed with imputation procedures. The performance and validity of such methods are of great importance for their application in empirical studies.…

Applications · Statistics 2024-01-19 Jakob Schwerter , Ketevan Gurtskaia , Andrés Romero , Birgit Zeyer-Gliozzo , Markus Pauly

Multiple imputation (MI) is a popular approach for dealing with missing data arising from non-response in sample surveys. Multiple imputation by chained equations (MICE) is one of the most widely used MI algorithms for multivariate data,…

Machine Learning · Computer Science 2022-03-22 Zhenhua Wang , Olanrewaju Akande , Jason Poulos , Fan Li

Imputation of missing data is a common application in various classification problems where the feature training matrix has missingness. A widely used solution to this imputation problem is based on the lazy learning technique, $k$-nearest…

Machine Learning · Statistics 2020-02-26 Arkopal Choudhury , Michael R. Kosorok

Missing data remains a very common problem in large datasets, including survey and census data containing many ordinal responses, such as political polls and opinion surveys. Multiple imputation (MI) is usually the go-to approach for…

Methodology · Statistics 2024-12-25 Chayut Wongkamthong , Olanrewaju Akande

Missing data imputation is an important research topic in data mining. Large-scale Molecular descriptor data may contains missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a…

Computational Engineering, Finance, and Science · Computer Science 2013-12-13 Doreswamy , Chanabasayya . M. Vastrad

Healthcare time series data is vital for monitoring patient activity but often contains noise and missing values due to various reasons such as sensor errors or data interruptions. Imputation, i.e., filling in the missing values, is a…

Machine Learning · Computer Science 2024-12-17 Lien P. Le , Xuan-Hien Nguyen Thi , Thu Nguyen , Michael A. Riegler , Pål Halvorsen , Binh T. Nguyen

Missing data is a widespread problem in many domains, creating challenges in data analysis and decision making. Traditional techniques for dealing with missing data, such as excluding incomplete records or imputing simple estimates (e.g.,…

Databases · Computer Science 2024-01-09 Massimo Perini , Milos Nikolic

Missing data is an expected issue when large amounts of data is collected, and several imputation techniques have been proposed to tackle this problem. Beneath classical approaches such as MICE, the application of Machine Learning…

Machine Learning · Statistics 2017-12-01 Burim Ramosaj , Markus Pauly

Missing data are ubiquitous in empirical databases, yet statistical analyses typically require complete data matrices. Multiple imputation offers a principled solution for filling these gaps. This study evaluates the performance of several…

Computation · Statistics 2026-02-05 Enzo Porto Brasil

Prediction models are used to predict an outcome based on input variables. Missing data in input variables often occurs at model development and at prediction time. The missForestPredict R package proposes an adaptation of the missForest…

Methodology · Statistics 2024-07-08 Elena Albu , Shan Gao , Laure Wynants , Ben Van Calster

Missing data are inevitable in longitudinal studies. Traditional methods, such as the full information maximum likelihood (FIML), are commonly used to handle ignorable missing data. However, they may lead to biased model estimation due to…

Applications · Statistics 2024-01-01 Dandan Tang , Xin Tong

Machine learning iterative imputation methods have been well accepted by researchers for imputing missing data, but they can be time-consuming when handling large datasets. To overcome this drawback, parallel computing strategies have been…

Applications · Statistics 2020-04-24 Shangzhi Hong , Yuqi Sun , Hanying Li , Henry S. Lynn

This paper presents an impact assessment for the imputation of missing data. The data set used is HIV Seroprevalence data from an antenatal clinic study survey performed in 2001. Data imputation is performed through five methods: Random…

Methodology · Statistics 2020-11-25 Adam Pantanowitz , Tshilidzi Marwala

Missing values in tabular data restrict the use and performance of machine learning, requiring the imputation of missing values. The most popular imputation algorithm is arguably multiple imputations using chains of equations (MICE), which…

Machine Learning · Computer Science 2022-03-01 Manar D Samad , Sakib Abrar , Norou Diawara

Gaussian Mixture models (GMMs) are a powerful tool for clustering, classification and density estimation when clustering structures are embedded in the data. The presence of missing values can largely impact the GMMs estimation process,…

Machine Learning · Statistics 2020-06-05 Alessio Serafini , Thomas Brendan Murphy , Luca Scrucca

Missing data is a common concern in health datasets, and its impact on good decision-making processes is well documented. Our study's contribution is a methodology for tackling missing data problems using a combination of synthetic dataset…

Machine Learning · Computer Science 2022-11-08 Gift Khangamwa , Terence L. van Zyl , Clint J. van Alten
‹ Prev 1 2 3 10 Next ›