Related papers: BEST : A decision tree algorithm that handles miss…

Handling Missing Data in Decision Trees: A Probabilistic Approach

Decision trees are a popular family of models due to their attractive properties such as interpretability and ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders performance of machine…

Machine Learning · Computer Science 2020-07-01 Pasha Khosravi , Antonio Vergari , YooJung Choi , Yitao Liang , Guy Van den Broeck

Performance comparison of State-of-the-art Missing Value Imputation Algorithms on Some Bench mark Datasets

Decision making from data involves identifying a set of attributes that contribute to effective decision making through computational intelligence. The presence of missing values greatly influences the selection of right set of attributes…

Machine Learning · Computer Science 2013-07-23 M. Naresh Kumar

Fairness without Imputation: A Decision Tree Approach for Fair Prediction with Missing Values

We investigate the fairness concerns of training a machine learning model using data with missing values. Even though there are a number of fairness intervention methods in the literature, most of them require a complete training set as…

Machine Learning · Computer Science 2022-04-15 Haewon Jeong , Hao Wang , Flavio P. Calmon

Leveraging Predictive Equivalence in Decision Trees

Decision trees are widely used for interpretable machine learning due to their clearly structured reasoning process. However, this structure belies a challenge we refer to as predictive equivalence: a given tree's decision boundary can be…

Machine Learning · Computer Science 2025-10-15 Hayden McTavish , Zachery Boner , Jon Donnelly , Margo Seltzer , Cynthia Rudin

Regression with Missing Data, a Comparison Study of TechniquesBased on Random Forests

In this paper we present the practical benefits of a new random forest algorithm to deal withmissing values in the sample. The purpose of this work is to compare the different solutionsto deal with missing values with random forests and…

Statistics Theory · Mathematics 2021-10-19 Irving Gómez-Méndez , Emilien Joly

The influence of missing data mechanisms and simple missing data handling techniques on fairness

Machine learning algorithms permeate the day-to-day aspects of our lives and therefore studying the fairness of these algorithms before implementation is crucial. One way in which bias can manifest in a dataset is through missing values.…

Machine Learning · Statistics 2026-02-23 Aeysha Bhatti , Trudie Sandrock , Johane Nienkemper-Swanepoel

Algorithm for Missing Values Imputation in Categorical Data with Use of Association Rules

This paper presents algorithm for missing values imputation in categorical data. The algorithm is based on using association rules and is presented in three variants. Experimental shows better accuracy of missing values imputation using the…

Machine Learning · Computer Science 2012-11-09 Jiří Kaiser

Big Data Classification Using Augmented Decision Trees

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…

Machine Learning · Statistics 2017-10-27 Rajiv Sambasivan , Sourish Das

Trinary Decision Trees for handling missing data

This paper introduces the Trinary decision tree, an algorithm designed to improve the handling of missing data in decision tree regressors and classifiers. Unlike other approaches, the Trinary decision tree does not assume that missing…

Machine Learning · Statistics 2024-01-12 Henning Zakrisson

A New Method for Classification of Datasets for Data Mining

Decision tree is an important method for both induction research and data mining, which is mainly used for model classification and prediction. ID3 algorithm is the most widely used algorithm in the decision tree so far. In this paper, the…

Machine Learning · Computer Science 2016-12-02 Singh Vijendra , Hemjyotsana Parashar , Nisha Vasudeva

Batched Lazy Decision Trees

We introduce a batched lazy algorithm for supervised classification using decision trees. It avoids unnecessary visits to irrelevant nodes when it is used to make predictions with either eagerly or lazily trained decision trees. A set of…

Machine Learning · Computer Science 2016-03-09 Mathieu Guillame-Bert , Artur Dubrawski

No imputation without representation

By filling in missing values in datasets, imputation allows these datasets to be used with algorithms that cannot handle missing values by themselves. However, missing values may in principle contribute useful information that is lost…

Machine Learning · Computer Science 2024-10-31 Oliver Urs Lenz , Daniel Peralta , Chris Cornelis

Evaluating Nonlinear Decision Trees for Binary Classification Tasks with Other Existing Methods

Classification of datasets into two or more distinct classes is an important machine learning task. Many methods are able to classify binary classification tasks with a very high accuracy on test data, but cannot provide any easily…

Machine Learning · Computer Science 2020-08-26 Yashesh Dhebar , Sparsh Gupta , Kalyanmoy Deb

An Experimental Comparison of Old and New Decision Tree Algorithms

This paper presents a detailed comparison of a recently proposed algorithm for optimizing decision trees, tree alternating optimization (TAO), with other popular, established algorithms. We compare their performance on a number of…

Machine Learning · Computer Science 2020-03-23 Arman Zharmagambetov , Suryabhan Singh Hada , Miguel Á. Carreira-Perpiñán , Magzhan Gabidolla

A novel gradient-based method for decision trees optimizing arbitrary differential loss functions

There are many approaches for training decision trees. This work introduces a novel gradient-based method for constructing decision trees that optimize arbitrary differentiable loss functions, overcoming the limitations of heuristic…

Machine Learning · Computer Science 2025-03-25 Andrei V. Konstantinov , Lev V. Utkin

MurTree: Optimal Classification Trees via Dynamic Programming and Search

Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy.…

Machine Learning · Computer Science 2022-06-30 Emir Demirović , Anna Lukina , Emmanuel Hebrard , Jeffrey Chan , James Bailey , Christopher Leckie , Kotagiri Ramamohanarao , Peter J. Stuckey

Improving Missing Data Imputation with Deep Generative Models

Datasets with missing values are very common on industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative…

Machine Learning · Computer Science 2019-02-28 Ramiro D. Camino , Christian A. Hammerschmidt , Radu State

Decision Trees for Function Evaluation - Simultaneous Optimization of Worst and Expected Cost

In several applications of automatic diagnosis and active learning a central problem is the evaluation of a discrete function by adaptively querying the values of its variables until the values read uniquely determine the value of the…

Data Structures and Algorithms · Computer Science 2014-07-29 Ferdinando Cicalese , Eduardo Laber , Aline Medeiros Saettler

Decision Stream: Cultivating Deep Decision Trees

Various modifications of decision trees have been extensively used during the past years due to their high efficiency and interpretability. Tree node splitting based on relevant feature selection is a key step of decision tree learning, at…

Machine Learning · Computer Science 2017-09-05 Dmitry Ignatov , Andrey Ignatov

Uncertainty-Aware Fairness-Adaptive Classification Trees

In an era where artificial intelligence and machine learning algorithms increasingly impact human life, it is crucial to develop models that account for potential discrimination in their predictions. This paper tackles this problem by…

Machine Learning · Statistics 2024-10-10 Anna Gottard , Vanessa Verrina , Sabrina Giordano