Related papers: Prediction Models That Learn to Avoid Missing Valu…

200 papers

In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here,…

Machine Learning · Statistics 2024-03-22 Julie Josse, Jacob M. Chen, Nicolas Prost, Erwan Scornet, Gaël Varoquaux

We investigate the fairness concerns of training a machine learning model using data with missing values. Even though there are a number of fairness intervention methods in the literature, most of them require a complete training set as…

Machine Learning · Computer Science 2022-04-15 Haewon Jeong, Hao Wang, Flavio P. Calmon

Machine learning (ML) has become a ubiquitous tool across various domains of data mining and big data analysis. The efficacy of ML models depends heavily on high-quality datasets, which are often complicated by the presence of missing…

Machine Learning · Computer Science 2024-10-14 Abu Fuad Ahmad, Md Shohel Sayeed, Khaznah Alshammari, Istiaque Ahmed

Decision trees are a popular family of models due to their attractive properties such as interpretability and ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders performance of machine…

Machine Learning · Computer Science 2020-07-01 Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang, Guy Van den Broeck

Decision trees are widely used for interpretable machine learning due to their clearly structured reasoning process. However, this structure belies a challenge we refer to as predictive equivalence: a given tree's decision boundary can be…

Machine Learning · Computer Science 2025-10-15 Hayden McTavish, Zachery Boner, Jon Donnelly, Margo Seltzer, Cynthia Rudin

While data are the primary fuel for machine learning models, they often suffer from missing values, especially when collected in real-world scenarios. However, many off-the-shelf machine learning models, including artificial neural network…

Conditions ensuring optimal parameter estimation in the presence of missing data are well established in inference, typically relying on the Missing-at-Random (MAR) assumption. In prediction, similar principles are often assumed to apply.…

Methodology · Statistics 2026-03-19 Pierre Catoire, Robin Genuer, Cecile Proust-Lima

Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time. When variables are missing in recurring patterns, fitting separate pattern submodels has been proposed as…

Machine Learning · Computer Science 2023-11-27 Lena Stempfle, Ashkan Panahi, Fredrik D. Johansson

BACKGROUND: As databases grow larger, it becomes harder to fully control their collection, and they frequently come with missing values: incomplete observations. These large databases are well suited to train machine-learning models, for…

Machine Learning · Computer Science 2022-02-23 Alexandre Perez-Lebel, Gaël Varoquaux, Marine Le Morvan, Julie Josse, Jean-Baptiste Poline

We present a method for incorporating missing data in non-parametric statistical learning without the need for imputation. We focus on a tree-based method, Bayesian Additive Regression Trees (BART), enhanced with "Missingness Incorporated…

Machine Learning · Statistics 2014-02-14 Adam Kapelner, Justin Bleich

We characterize the structure and origins of missingness for 159 cross-sectional return predictors and study missing value handling for portfolios constructed using machine learning. Simply imputing with cross-sectional means performs well…

Methodology · Statistics 2024-01-15 Andrew Y. Chen, Jack McCoy

Training datasets for machine learning often have some form of missingness. For example, to learn a model for deciding whom to give a loan, the available training data includes individuals who were given a loan in the past, but not those…

Machine Learning · Computer Science 2020-12-22 Naman Goel, Alfonso Amayuelas, Amit Deshpande, Amit Sharma

Missing data are an unavoidable complication in many machine learning tasks. When data are 'missing at random' there exist a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious,…

Multivariate time series (MTS) prediction is ubiquitous in real-world fields, but MTS data often contains missing values. In recent years, there has been an increasing interest in using end-to-end models to handle MTS with missing values.…

Machine Learning · Computer Science 2023-05-11 Zhao-Yu Zhang, Shao-Qun Zhang, Yuan Jiang, Zhi-Hua Zhou

The main contribution of this paper is the development of a new decision tree algorithm. The proposed approach allows users to guide the algorithm through the data partitioning process. We believe this feature has many applications but in…

Machine Learning · Statistics 2020-10-27 Cédric Beaulac, Jeffrey S. Rosenthal

Forming a reliable judgement of a machine learning (ML) model's appropriateness for an application ecosystem is critical for its responsible use, and requires considering a broad range of factors including harms, benefits, and…

Machine Learning · Computer Science 2022-05-12 Ben Hutchinson, Negar Rostamzadeh, Christina Greer, Katherine Heller, Vinodkumar Prabhakaran

The ubiquity of missing data in urban intelligence systems, attributable to adverse environmental conditions and equipment failures, poses a significant challenge to the efficacy of downstream applications, notably in the realms of traffic…

Machine Learning · Computer Science 2026-05-25 Songyu Ke, Chenyu Wu, Yuxuan Liang, Huiling Qin, Junbo Zhang, Yu Zheng

Machine learning methods adapt the parameters of a model, constrained to lie in a given model class, by using a fixed learning procedure based on data or active observations. Adaptation is done on a per-task basis, and retraining is needed…

Machine Learning · Computer Science 2021-10-22 Osvaldo Simeone, Sangwoo Park, Joonhyuk Kang

Learning models that can handle distribution shifts is a key challenge in domain generalization. Invariance learning, an approach that focuses on identifying features invariant across environments, improves model generalization by capturing…

Machine Learning · Statistics 2026-05-11 Yiran Jia, Jelena Bradic

Rule models are often preferred in prediction tasks with tabular inputs as they can be easily interpreted using natural language and provide predictive performance on par with more complex models. However, most rule models' predictions are…

Machine Learning · Computer Science 2023-11-27 Lena Stempfle, Fredrik D. Johansson