Related papers: The Offset Tree for Learning with Partial Labels

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…

Machine Learning · Statistics 2021-10-25 Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Analyzing decision tree bias towards the minority class

There is a widespread and longstanding belief that machine learning models are biased towards the majority class when learning from imbalanced binary response data, leading them to neglect or ignore the minority class. Motivated by a recent…

Machine Learning · Statistics 2026-01-29 Nathan Phelps, Daniel J. Lizotte, Douglas G. Woolford

Offline Multi-Action Policy Learning: Generalization and Optimization

In many settings, a decision-maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action. Examples include selecting offers, prices, advertisements, or emails to send to consumers, as…

Machine Learning · Statistics 2018-11-20 Zhengyuan Zhou, Susan Athey, Stefan Wager

Evaluating Nonlinear Decision Trees for Binary Classification Tasks with Other Existing Methods

Classification of datasets into two or more distinct classes is an important machine learning task. Many methods are able to classify binary classification tasks with a very high accuracy on test data, but cannot provide any easily…

Machine Learning · Computer Science 2020-08-26 Yashesh Dhebar, Sparsh Gupta, Kalyanmoy Deb

Dive into Decision Trees and Forests: A Theoretical Demonstration

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…

Machine Learning · Computer Science 2021-01-22 Jinxiong Zhang

Sub-Setting Algorithm for Training Data Selection in Pattern Recognition

Modern pattern recognition tasks use complex algorithms that take advantage of large datasets to make more accurate predictions than traditional algorithms such as decision trees or k-nearest-neighbor better suited to describe simple…

Machine Learning · Statistics 2021-10-14 AGaurav Arwade, Sigurdur Olafsson

Mixed-Integer Linear Optimization for Cardinality-Constrained Random Forests

Random forests are among the most famous algorithms for solving classification problems, in particular for large-scale data sets. Considering a set of labeled points and several decision trees, the method takes the majority vote to classify…

Optimization and Control · Mathematics 2025-01-24 Jan Pablo Burgard, Maria Eduarda Pinheiro, Martin Schmidt

Mixed-Integer Linear Optimization for Semi-Supervised Optimal Classification Trees

Decision trees are one of the most popular methods for solving classification problems, mainly because of their good interpretability properties. Moreover, due to advances in recent years in mixed-integer optimization, several models have…

Optimization and Control · Mathematics 2026-05-29 Jan Pablo Burgard, Maria Eduarda Pinheiro, Martin Schmidt

Learning to Abstain from Binary Prediction

A binary classifier capable of abstaining from making a label prediction has two goals in tension: minimizing errors, and avoiding abstaining unnecessarily often. In this work, we exactly characterize the best achievable tradeoff between…

Machine Learning · Computer Science 2016-11-30 Akshay Balsubramani

Learning Decision Trees and Forests with Algorithmic Recourse

This paper proposes a new algorithm for learning accurate tree-based models while ensuring the existence of recourse actions. Algorithmic Recourse (AR) aims to provide a recourse action for altering the undesired prediction result given by…

Machine Learning · Computer Science 2024-06-04 Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike

On Conditional Branches in Optimal Decision Trees

The decision tree is one of the most fundamental programming abstractions. A commonly used type of decision tree is the alphabetic binary tree, which uses (without loss of generality) "less than" versus "greater than or equal to" tests…

Performance · Computer Science 2007-07-13 Michael B. Baer

Partial-Label Learning with a Reject Option

In real-world applications, one often encounters ambiguously labeled data, where different annotators assign conflicting class labels. Partial-label learning allows training classifiers in this weakly supervised setting, where…

Machine Learning · Computer Science 2025-10-27 Tobias Fuchs, Florian Kalinke, Klemens Böhm

Indecision Trees: Learning Argument-Based Reasoning under Quantified Uncertainty

Using Machine Learning systems in the real world can often be problematic, with inexplicable black-box models, the assumed certainty of imperfect measurements, or providing a single classification instead of a probability distribution. This…

Machine Learning · Computer Science 2023-07-11 Jonathan S. Kent, David H. Menager

Learning Optimal and Fair Decision Trees for Non-Discriminative Decision-Making

In recent years, automated data-driven decision-making systems have enjoyed tremendous success in a variety of fields (e.g., to make product recommendations, or to guide the production of entertainment). More recently, these algorithms…

Machine Learning · Computer Science 2019-03-27 Sina Aghaei, Mohammad Javad Azizi, Phebe Vayanos

Best-scored Random Forest Classification

We propose an algorithm named best-scored random forest for binary classification problems. The terminology "best-scored" means to select the one with the best empirical performance out of a certain number of purely random tree candidates…

Machine Learning · Statistics 2019-05-28 Hanyuan Hang, Xiaoyu Liu, Ingo Steinwart

Challenges learning from imbalanced data using tree-based models: Prevalence estimates systematically depend on hyperparameters and can be upwardly biased

When using machine learning for imbalanced binary classification problems, it is common to subsample the majority class to create a (more) balanced training dataset. This biases the model's predictions because the model learns from data…

Machine Learning · Computer Science 2025-11-03 Nathan Phelps, Daniel J. Lizotte, Douglas G. Woolford

Learning the Truth From Only One Side of the Story

Learning under one-sided feedback (i.e., where we only observe the labels for examples we predicted positively on) is a fundamental problem in machine learning -- applications include lending and recommendation systems. Despite this, there…

Machine Learning · Computer Science 2020-10-14 Heinrich Jiang, Qijia Jiang, Aldo Pacchiano

The Conditioning Bias in Binary Decision Trees and Random Forests and Its Elimination

Decision tree and random forest classification and regression are some of the most widely used machine learning approaches. Binary decision tree implementations commonly use conditioning in the form 'feature $\leq$ (or $<$) threshold',…

Machine Learning · Computer Science 2023-12-19 Gábor Timár, György Kovács

Optimal Sparse Decision Trees

Decision tree algorithms have been among the most popular algorithms for interpretable (transparent) machine learning since the early 1980's. The problem that has plagued decision tree algorithms since their inception is their lack of…

Machine Learning · Computer Science 2023-09-28 Xiyang Hu, Cynthia Rudin, Margo Seltzer

HHCART: An Oblique Decision Tree

Decision trees are a popular technique in statistical data classification. They recursively partition the feature space into disjoint sub-regions until each sub-region becomes homogeneous with respect to a particular class. The basic…

Machine Learning · Statistics 2015-04-15 D. C. Wickramarachchi, B. L. Robertson, M. Reale, C. J. Price, J. Brown