English
Related papers

Related papers: Does imputation matter? Benchmark for predictive m…

200 papers

Missing values are prevalent across various fields, posing challenges for training and deploying predictive models. In this context, imputation is a common practice, driven by the hope that accurate imputations will enhance predictions.…

Artificial Intelligence · Computer Science 2025-02-21 Marine Le Morvan , Gaël Varoquaux

Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods,…

Decision making from data involves identifying a set of attributes that contribute to effective decision making through computational intelligence. The presence of missing values greatly influences the selection of right set of attributes…

Machine Learning · Computer Science 2013-07-23 M. Naresh Kumar

Missing values pose a persistent challenge in modern data science. Consequently, there is an ever-growing number of publications introducing new imputation methods in various fields. While many studies compare imputation approaches, they…

Computation · Statistics 2025-11-10 Krystyna Grzesiak , Christophe Muller , Julie Josse , Jeffrey Näf

Recent advances in probabilistic modelling have led to a large number of simulation-based inference algorithms which do not require numerical evaluation of likelihoods. However, a public benchmark with appropriate performance metrics for…

Machine Learning · Statistics 2021-04-12 Jan-Matthis Lueckmann , Jan Boelts , David S. Greenberg , Pedro J. Gonçalves , Jakob H. Macke

Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse,…

Methodology · Statistics 2022-08-23 Mehdi Dagdoug , Camelia Goga , David Haziza

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos , Rafael Valle

By filling in missing values in datasets, imputation allows these datasets to be used with algorithms that cannot handle missing values by themselves. However, missing values may in principle contribute useful information that is lost…

Machine Learning · Computer Science 2024-10-31 Oliver Urs Lenz , Daniel Peralta , Chris Cornelis

Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research has realized an increasing trend towards the usage of modern Machine Learning algorithms for…

Machine Learning · Statistics 2022-03-23 Burim Ramosaj , Justus Tulowietzki , Markus Pauly

Imputation is an attractive tool for dealing with the widespread issue of missing values. Consequently, studying and developing imputation methods has been an active field of research over the last decade. Faced with an imputation task and…

Methodology · Statistics 2025-07-16 Jeffrey Näf , Krystyna Grzesiak , Erwan Scornet

Clinical researchers often select among and evaluate risk prediction models using standard machine learning metrics based on confusion matrices. However, if these models are used to allocate interventions to patients, standard metrics…

Machine Learning · Statistics 2020-06-03 Alejandro Schuler , Aashish Bhardwaj , Vincent Liu

Many datasets suffer from missing values due to various reasons,which not only increases the processing difficulty of related tasks but also reduces the accuracy of classification. To address this problem, the mainstream approach is to use…

Machine Learning · Computer Science 2024-08-14 Cong Guo , Chun Liu , Wei Yang

BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values: incomplete observations. These large databases are well suited to train machine-learning models, for…

Machine Learning · Computer Science 2022-02-23 Alexandre Perez-Lebel , Gaël Varoquaux , Marine Le Morvan , Julie Josse , Jean-Baptiste Poline

Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows…

Machine Learning · Computer Science 2025-09-30 Ruikai Yang , Fan He , Mingzhen He , Kaijie Wang , Xiaolin Huang

Real-world data is often incomplete and contains missing values. To train accurate models over real-world datasets, users need to spend a substantial amount of time and resources imputing and finding proper values for missing data items. In…

Machine Learning · Statistics 2024-03-05 Cheng Zhen , Nischal Aryal , Arash Termehchy , Alireza Aghasi , Amandeep Singh Chabada

Predictive benchmarking, the evaluation of machine learning models based on predictive performance and competitive ranking, is a central epistemic practice in machine learning research and an increasingly prominent method for scientific…

Machine Learning · Computer Science 2025-10-28 Timo Freiesleben , Sebastian Zezulka

The efficacy of machine learning (ML) models depends on both algorithms and data. Training data defines what we want our models to learn, and testing data provides the means by which their empirical progress is measured. Benchmark datasets…

Machine Learning · Computer Science 2021-11-23 Lora Aroyo , Matthew Lease , Praveen Paritosh , Mike Schaekermann

Large language model (LLM) evaluation is increasingly costly, prompting interest in methods that speed up evaluation by shrinking benchmark datasets. Benchmark prediction (also called efficient LLM evaluation) aims to select a small subset…

Machine Learning · Computer Science 2025-06-10 Guanhua Zhang , Florian E. Dorner , Moritz Hardt

Machine learning models are often used to inform real world risk assessment tasks: predicting consumer default risk, predicting whether a person suffers from a serious illness, or predicting a person's risk to appear in court. Given…

Machine Learning · Computer Science 2023-06-27 Jamelle Watson-Daniels , David C. Parkes , Berk Ustun

Causal inference analysis is the estimation of the effects of actions on outcomes. In the context of healthcare data this means estimating the outcome of counter-factual treatments (i.e. including treatments that were not observed) on a…

Methodology · Statistics 2018-03-21 Yishai Shimoni , Chen Yanover , Ehud Karavani , Yaara Goldschmnidt
‹ Prev 1 2 3 10 Next ›