Related papers: Decision trees are PAC-learnable from most product…

Lifting uniform learners via distributional decomposition

We show how any PAC learning algorithm that works under the uniform distribution can be transformed, in a blackbox fashion, into one that works under an arbitrary and unknown distribution $\mathcal{D}$. The efficiency of our transformation…

Machine Learning · Statistics 2023-03-31 Guy Blanc, Jane Lange, Ali Malik, Li-Yang Tan

Properly Learning Decision Trees with Queries Is NP-Hard

We prove that it is NP-hard to properly PAC learn decision trees with queries, resolving a longstanding open problem in learning theory (Bshouty 1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While there has been a…

Computational Complexity · Computer Science 2023-07-11 Caleb Koch, Carmen Strassle, Li-Yang Tan

PAC-learning bounded tree-width Graphical Models

We show that the class of strongly connected graphical models with treewidth at most k can be properly efficiently PAC-learnt with respect to the Kullback-Leibler Divergence. Previous approaches to this problem, such as those of Chow ([1]),…

Machine Learning · Computer Science 2012-07-19 Mukund Narasimhan, Jeff A. Bilmes

Decision Concept Lattice vs. Decision Trees and Random Forests

Decision trees and their ensembles are very popular models of supervised machine learning. In this paper we merge the ideas underlying decision trees, their ensembles and FCA by proposing a new supervised machine learning model which can be…

Machine Learning · Computer Science 2021-06-02 Egor Dudyrev, Sergei O. Kuznetsov

Decision trees as partitioning machines to characterize their generalization properties

Decision trees are popular machine learning models that are simple to build and easy to interpret. Even though algorithms to learn decision trees date back almost 50 years, key properties affecting their generalization error are still…

Machine Learning · Computer Science 2020-10-16 Jean-Samuel Leboeuf, Frédéric LeBlanc, Mario Marchand

Challenges learning from imbalanced data using tree-based models: Prevalence estimates systematically depend on hyperparameters and can be upwardly biased

When using machine learning for imbalanced binary classification problems, it is common to subsample the majority class to create a (more) balanced training dataset. This biases the model's predictions because the model learns from data…

Machine Learning · Computer Science 2025-11-03 Nathan Phelps, Daniel J. Lizotte, Douglas G. Woolford

On PAC-Bayesian Bounds for Random Forests

Existing guarantees in terms of rigorous upper bounds on the generalization error for the original random forest algorithm, one of the most frequently used machine learning methods, are unsatisfying. We discuss and evaluate various…

Machine Learning · Computer Science 2019-03-07 Stephan Sloth Lorenzen, Christian Igel, Yevgeny Seldin

Learning-Augmented Query Policies

We study how to utilize (possibly machine-learned) predictions in a model for computing under uncertainty in which an algorithm can query unknown data. The goal is to minimize the number of queries needed to solve the problem. We consider…

Data Structures and Algorithms · Computer Science 2021-11-09 Thomas Erlebach, Murilo S. de Lima, Nicole Megow, Jens Schlöter

Dive into Decision Trees and Forests: A Theoretical Demonstration

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…

Machine Learning · Computer Science 2021-01-22 Jinxiong Zhang

Preference Analysis Using Random Spanning Trees: A Stochastic Sampling Approach to Inconsistent Pairwise Comparisons

Eliciting preferences from human judgements is inherently imprecise, yet most decision analysis methods force a single priority vector from pairwise comparisons, discarding the information embedded in inconsistencies. We instead leverage…

General Economics · Economics 2026-02-27 Salvatore Greco, Sajid Siraj, Michele Lundy

Learning accurate and interpretable tree-based models

Decision trees and their ensembles are popular in machine learning as easy-to-understand models. Several techniques have been proposed in the literature for learning tree-based classifiers, with different techniques working well for data…

Machine Learning · Computer Science 2025-05-20 Maria-Florina Balcan, Dravyansh Sharma

A Theory of Universal Learning

How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its "learning curve", that is, the decay of the error rate as a function of…

Machine Learning · Computer Science 2020-11-10 Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, Amir Yehudayoff

Privately Learning Decision Lists and a Differentially Private Winnow

We give new differentially private algorithms for the classic problems of learning decision lists and large-margin halfspaces in the PAC and online models. In the PAC model, we give a computationally efficient algorithm for learning…

Machine Learning · Computer Science 2026-02-10 Mark Bun, William Fang

Extracting PAC Decision Trees from Black Box Binary Classifiers: The Gender Bias Case Study on BERT-based Language Models

Decision trees are a popular machine learning method, known for their inherent explainability. In Explainable AI, decision trees can be used as surrogate models for complex black box AI models or as approximations of parts of such models. A…

Artificial Intelligence · Computer Science 2025-10-08 Ana Ozaki, Roberto Confalonieri, Ricardo Guimarães, Anders Imenes

Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates

The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both…

Information Theory · Computer Science 2011-02-15 Vincent Y. F. Tan, Animashree Anandkumar, Alan S. Willsky

Indecision Trees: Learning Argument-Based Reasoning under Quantified Uncertainty

Using Machine Learning systems in the real world can often be problematic, with inexplicable black-box models, the assumed certainty of imperfect measurements, or providing a single classification instead of a probability distribution. This…

Machine Learning · Computer Science 2023-07-11 Jonathan S. Kent, David H. Menager

Efficient and Noise-Tolerant PAC Learning of Multiclass Linear Classifiers

Noise-tolerant PAC learning of linear models has been of central interest in the machine learning community since the last century. In recent years, many computationally efficient algorithms have been proposed for the problem of learning…

Machine Learning · Computer Science 2026-05-19 Rita Adhikari, Shiwei Zeng

Beyond the Low-Degree Algorithm: Mixtures of Subcubes and Their Applications

We introduce the problem of learning mixtures of $k$ subcubes over $\{0,1\}^n$, which contains many classic learning theory problems as a special case (and is itself a special case of others). We give a surprising $n^{O(\log k)}$-time…

Machine Learning · Computer Science 2019-02-20 Sitan Chen, Ankur Moitra

Smoothed Analysis of Learning from Positive Samples

Binary classification from positive-only samples is a variant of PAC learning where the learner receives i.i.d. positive samples and aims to learn a classifier with low error. Previous work by Natarajan, Gereb-Graus, and Shvaytser…

Machine Learning · Statistics 2026-05-13 Jane H. Lee, Anay Mehrotra, Manolis Zampetakis

Differentially- and non-differentially-private random decision trees

We consider supervised learning with random decision trees, where the tree construction is completely random. The method is popularly used and works well in practice despite the simplicity of the setting, but its statistical mechanism is…

Machine Learning · Computer Science 2015-02-06 Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann LeCun