Related papers: Decision trees are PAC-learnable from most product…
We show how any PAC learning algorithm that works under the uniform distribution can be transformed, in a blackbox fashion, into one that works under an arbitrary and unknown distribution $\mathcal{D}$. The efficiency of our transformation…
We prove that it is NP-hard to properly PAC learn decision trees with queries, resolving a longstanding open problem in learning theory (Bshouty 1993; Guijarro-Lavin-Raghavan 1999; Mehta-Raghavan 2002; Feldman 2016). While there has been a…
We show that the class of strongly connected graphical models with treewidth at most k can be properly efficiently PAC-learnt with respect to the Kullback-Leibler Divergence. Previous approaches to this problem, such as those of Chow ([1]),…
Decision trees and their ensembles are very popular models of supervised machine learning. In this paper we merge the ideas underlying decision trees, their ensembles and FCA by proposing a new supervised machine learning model which can be…
Decision trees are popular machine learning models that are simple to build and easy to interpret. Even though algorithms to learn decision trees date back to almost 50 years, key properties affecting their generalization error are still…
When using machine learning for imbalanced binary classification problems, it is common to subsample the majority class to create a (more) balanced training dataset. This biases the model's predictions because the model learns from data…
Existing guarantees in terms of rigorous upper bounds on the generalization error for the original random forest algorithm, one of the most frequently used machine learning methods, are unsatisfying. We discuss and evaluate various…
We study how to utilize (possibly machine-learned) predictions in a model for computing under uncertainty in which an algorithm can query unknown data. The goal is to minimize the number of queries needed to solve the problem. We consider…
Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…
Eliciting preferences from human judgements is inherently imprecise, yet most decision analysis methods force a single priority vector from pairwise comparisons, discarding the information embedded in inconsistencies. We instead leverage…
Decision trees and their ensembles are popular in machine learning as easy-to-understand models. Several techniques have been proposed in the literature for learning tree-based classifiers, with different techniques working well for data…
How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its "learning curve", that is, the decay of the error rate as a function of…
We give new differentially private algorithms for the classic problems of learning decision lists and large-margin halfspaces in the PAC and online models. In the PAC model, we give a computationally efficient algorithm for learning…
Decision trees are a popular machine learning method, known for their inherent explainability. In Explainable AI, decision trees can be used as surrogate models for complex black box AI models or as approximations of parts of such models. A…
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both…
Using Machine Learning systems in the real world can often be problematic, with inexplicable black-box models, the assumed certainty of imperfect measurements, or providing a single classification instead of a probability distribution. This…
Noise-tolerant PAC learning of linear models has been of central interests in machine learning community since the last century. In recent years, many computationally-efficient algorithms have been proposed for the problem of learning…
We introduce the problem of learning mixtures of $k$ subcubes over $\{0,1\}^n$, which contains many classic learning theory problems as a special case (and is itself a special case of others). We give a surprising $n^{O(\log k)}$-time…
Binary classification from positive-only samples is a variant of PAC learning where the learner receives i.i.d. positive samples and aims to learn a classifier with low error. Previous work by Natarajan, Gereb-Graus, and Shvaytser…
We consider supervised learning with random decision trees, where the tree construction is completely random. The method is popularly used and works well in practice despite the simplicity of the setting, but its statistical mechanism is…