Related papers: Classification Trees for Imbalanced and Sparse Dat…

Precision-Recall Curve (PRC) Classification Trees

The classification of imbalanced data has presented a significant challenge for most well-known classification algorithms that were often designed for data with relatively balanced class distributions. Nevertheless skewed class distribution…

Machine Learning · Statistics 2023-04-21 Jiaju Miao, Wei Zhu

Soft regression trees: a model variant and a decomposition training algorithm

Decision trees are widely used for classification and regression tasks in a variety of application fields due to their interpretability and good accuracy. During the past decade, growing attention has been devoted to globally optimized…

Machine Learning · Computer Science 2025-01-28 Antonio Consolo, Edoardo Amaldi, Andrea Manno

TREE: Tree Regularization for Efficient Execution

The rise of machine learning methods on heavily resource constrained devices requires not only the choice of a suitable model architecture for the target platform, but also the optimization of the chosen model with regard to execution time…

Machine Learning · Computer Science 2024-06-19 Lena Schmid, Daniel Biebert, Christian Hakert, Kuan-Hsun Chen, Michel Lang, Markus Pauly, Jian-Jia Chen

Optimal Classification Trees for Continuous Feature Data Using Dynamic Programming with Branch-and-Bound

Computing an optimal classification tree that provably maximizes training performance within a given size limit is NP-hard, and in practice, most state-of-the-art methods do not scale beyond computing optimal trees of depth three.…

Machine Learning · Computer Science 2025-01-15 Catalin E. Brita, Jacobus G. M. van der Linden, Emir Demirović

Efficient Decision Trees for Multi-class Support Vector Machines Using Entropy and Generalization Error Estimation

We propose new methods for Support Vector Machines (SVMs) using tree architecture for multi-class classification. In each node of the tree, we select an appropriate binary classifier using entropy and generalization error estimation, then…

Machine Learning · Computer Science 2017-08-29 Pittipol Kantavat, Boonserm Kijsirikul, Patoomsiri Songsiri, Ken-ichi Fukui, Masayuki Numao

Learning to Branch

Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial and nonconvex problems. For example, they are the foremost method for solving (mixed) integer programs and constraint satisfaction…

Artificial Intelligence · Computer Science 2018-05-18 Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, Ellen Vitercik

Trust-Region Stochastic Optimization with Variance Reduction Technique

We propose a novel algorithm, TR-SVR, for solving unconstrained stochastic optimization problems. This method builds on the trust-region framework, which effectively balances local and global exploration in optimization tasks. TR-SVR…

Optimization and Control · Mathematics 2024-12-03 Xinshou Zheng

Generalized and Scalable Optimal Sparse Decision Trees

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been…

Machine Learning · Computer Science 2022-11-24 Jimmy Lin, Chudi Zhong, Diane Hu, Cynthia Rudin, Margo Seltzer

Spectral Algorithms for Computing Fair Support Vector Machines

Classifiers and rating scores are prone to implicitly codifying biases, which may be present in the training data, against protected classes (i.e., age, gender, or race). So it is important to understand how to design classifiers and scores…

Machine Learning · Computer Science 2017-10-17 Matt Olfat, Anil Aswani

Improving the precision of classification trees

Besides serving as prediction models, classification trees are useful for finding important predictor variables and identifying interesting subgroups in the data. These functions can be compromised by weak split selection algorithms that…

Applications · Statistics 2010-11-03 Wei-Yin Loh

Beyond Trees: Classification with Sparse Pairwise Dependencies

Several classification methods assume that the underlying distributions follow tree-structured graphical models. Indeed, trees capture statistical dependencies between pairs of variables, which may be crucial to attain low classification…

Machine Learning · Statistics 2021-05-31 Yaniv Tenzer, Amit Moscovich, Mary Frances Dorn, Boaz Nadler, Clifford Spiegelman

A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds

Decision trees are important both as interpretable models amenable to high-stakes decision-making, and as building blocks of ensemble methods such as random forests and gradient boosting. Their statistical properties, however, are not well…

Machine Learning · Statistics 2021-10-20 Yan Shuo Tan, Abhineet Agarwal, Bin Yu

On multivariate randomized classification trees: $l_0$-based sparsity, VC dimension and decomposition methods

Decision trees are widely used classification and regression models because of their interpretability and good accuracy. Classical methods such as CART are based on greedy approaches, but growing attention has recently been devoted to…

Machine Learning · Computer Science 2021-12-16 Edoardo Amaldi, Antonio Consolo, Andrea Manno

Multiclass Optimal Classification Trees with SVM-splits

In this paper we present a novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances. Our approach consists of building Classification Trees in which, except for the leaf nodes,…

Optimization and Control · Mathematics 2021-11-17 Víctor Blanco, Alberto Japón, Justo Puerto

Sparsity in Optimal Randomized Classification Trees

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity…

Optimization and Control · Mathematics 2020-02-24 Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design

Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the…

Machine Learning · Computer Science 2022-10-27 Ali Behrouz, Mathias Lecuyer, Cynthia Rudin, Margo Seltzer

Sparse residual tree and forest

Sparse residual tree (SRT) is an adaptive exploration method for multivariate scattered data approximation. It leads to sparse and stable approximations in areas where the data is sufficient or redundant, and points out the possible local…

Numerical Analysis · Mathematics 2019-05-15 Xin Xu, Xiaopeng Luo

Random Forest Variable Importance-based Selection Algorithm in Class Imbalance Problem

Random Forest is a machine learning method that offers many advantages, including the ability to easily measure variable importance. Class balancing is a well-known technique for dealing with the class imbalance problem. However, it has…

Machine Learning · Statistics 2023-12-19 Yunbi Nam, Sunwoo Han

Sequential Targeting: an incremental learning approach for data imbalance in text classification

Classification tasks require a balanced distribution of data to ensure that the learner is trained to generalize over all classes. In real-world datasets, however, the number of instances varies substantially among classes. This typically leads…

Machine Learning · Computer Science 2020-11-24 Joel Jang, Yoonjeon Kim, Kyoungho Choi, Sungho Suh

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and…

Machine Learning · Statistics 2021-10-25 Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales