Related papers: Mixed-Integer Linear Optimization for Semi-Supervi…

Mixed-Integer Linear Optimization for Cardinality-Constrained Random Forests

Random forests are among the most famous algorithms for solving classification problems, in particular for large-scale data sets. Considering a set of labeled points and several decision trees, the method takes the majority vote to classify…

Optimization and Control · Mathematics 2025-01-24 Jan Pablo Burgard, Maria Eduarda Pinheiro, Martin Schmidt

Generalized Optimal Classification Trees: A Mixed-Integer Programming Approach

Global optimization of decision trees is a long-standing challenge in combinatorial optimization, yet such models play an important role in interpretable machine learning. Although the problem has been investigated for several decades, only…

Machine Learning · Computer Science 2026-02-03 Jiancheng Tu, Wenqi Fan, Zhibin Wu

Optimal Mixed Integer Linear Optimization Trained Multivariate Classification Trees

Multivariate decision trees are powerful machine learning tools for classification and regression that attract many researchers and industry professionals. An optimal binary tree has two types of vertices, (i) branching vertices which have…

Machine Learning · Computer Science 2024-08-05 Brandon Alston, Illya V. Hicks

Multiclass Optimal Classification Trees with SVM-splits

In this paper we present a novel mathematical optimization-based methodology to construct tree-shaped classification rules for multiclass instances. Our approach consists of building Classification Trees in which, except for the leaf nodes,…

Optimization and Control · Mathematics 2021-11-17 Víctor Blanco, Alberto Japón, Justo Puerto

Mixed integer linear optimization formulations for learning optimal binary classification trees

Decision trees are powerful tools for classification and regression that attract many researchers working in the burgeoning area of machine learning. One advantage of decision trees over other methods is their interpretability, which is…

Machine Learning · Computer Science 2023-07-11 Brandon Alston, Hamidreza Validi, Illya V. Hicks

Strong Optimal Classification Trees

Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning optimal binary…

Machine Learning · Computer Science 2023-07-20 Sina Aghaei, Andrés Gómez, Phebe Vayanos

Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded

Decision trees usefully represent sparse, high dimensional and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical…

Optimization and Control · Mathematics 2019-09-26 Miten Mistry, Dimitrios Letsios, Gerhard Krennrich, Robert M. Lee, Ruth Misener

Optimal Generalized Decision Trees via Integer Programming

Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if…

Machine Learning · Computer Science 2019-08-14 Oktay Gunluk, Jayant Kalagnanam, Minhan Li, Matt Menickelly, Katya Scheinberg

Robust Optimal Classification Trees under Noisy Labels

In this paper we propose a novel methodology to construct Optimal Classification Trees that takes into account that noisy labels may occur in the training sample. Our approach rests on two main elements: (1) the splitting rules for the…

Machine Learning · Computer Science 2020-12-17 Víctor Blanco, Alberto Japón, Justo Puerto

Tight Mixed-Integer Optimization Formulations for Prescriptive Trees

We focus on modeling the relationship between an input feature vector and the predicted outcome of a trained decision tree using mixed-integer optimization. This can be used in many practical applications where a decision tree or tree…

Optimization and Control · Mathematics 2025-05-20 Max Biggs, Georgia Perakis

Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification

Semi-supervised learning (SSL) is a common approach to learning predictive models using not only labeled examples, but also unlabeled examples. While SSL for the simple tasks of classification and regression has received a lot of attention…

Machine Learning · Computer Science 2024-04-02 Jurica Levatić, Michelangelo Ceci, Dragi Kocev, Sašo Džeroski

Learning Optimal and Fair Decision Trees for Non-Discriminative Decision-Making

In recent years, automated data-driven decision-making systems have enjoyed a tremendous success in a variety of fields (e.g., to make product recommendations, or to guide the production of entertainment). More recently, these algorithms…

Machine Learning · Computer Science 2019-03-27 Sina Aghaei, Mohammad Javad Azizi, Phebe Vayanos

Semi-supervised Deep Learning for Image Classification with Distribution Mismatch: A Survey

Deep learning methodologies have been employed in several different fields, with an outstanding success in image recognition applications, such as material quality control, medical imaging, autonomous driving, etc. Deep learning models rely…

Computer Vision and Pattern Recognition · Computer Science 2022-03-11 Saul Calderon-Ramirez, Shengxiang Yang, David Elizondo

Optimal Decision Trees for Nonlinear Metrics

Nonlinear metrics, such as the F1-score, Matthews correlation coefficient, and Fowlkes-Mallows index, are often used to evaluate the performance of machine learning models, in particular, when facing imbalanced datasets that contain more…

Machine Learning · Computer Science 2022-06-30 Emir Demirović, Peter J. Stuckey

Mixed-Integer Quadratic Optimization and Iterative Clustering Techniques for Semi-Supervised Support Vector Machines

Among the most famous algorithms for solving classification problems are support vector machines (SVMs), which find a separating hyperplane for a set of labeled data points. In some applications, however, labels are only available for a…

Optimization and Control · Mathematics 2023-10-17 Jan Pablo Burgard, Maria Eduarda Pinheiro, Martin Schmidt

Experiments with Optimal Model Trees

Model trees provide an appealing way to perform interpretable machine learning for both classification and regression problems. In contrast to "classic" decision trees with constant values in their leaves, model trees can use linear…

Machine Learning · Computer Science 2026-03-11 Sabino Francesco Roselli, Eibe Frank

Optimally Combining Classifiers for Semi-Supervised Learning

This paper considers semi-supervised learning for tabular data. It is widely known that XGBoost, which is based on tree models, works well on heterogeneous features, while the transductive support vector machine can exploit the low-density separation…

Machine Learning · Computer Science 2020-06-09 Zhiguo Wang, Liusha Yang, Feng Yin, Ke Lin, Qingjiang Shi, Zhi-Quan Luo

Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination

A growing specter in the rise of machine learning is whether the decisions made by machine learning models are fair. While research is already underway to formalize a machine-learning concept of fairness and to design frameworks for…

Machine Learning · Computer Science 2020-09-28 Tao Zhang, Tianqing Zhu, Jing Li, Mengde Han, Wanlei Zhou, Philip S. Yu

Classification Tree-based Active Learning: A Wrapper Approach

Supervised machine learning often requires large training sets to train accurate models, yet obtaining large amounts of labeled data is not always feasible. Hence, it becomes crucial to explore active learning methods for reducing the size…

Machine Learning · Computer Science 2024-04-16 Ashna Jose, Emilie Devijver, Massih-Reza Amini, Noel Jakse, Roberta Poloni

Leveraging Structure for Improved Classification of Grouped Biased Data

We consider semi-supervised binary classification for applications in which data points are naturally grouped (e.g., survey responses grouped by state) and the labeled data is biased (e.g., survey respondents are not representative of the…

Machine Learning · Statistics 2022-12-08 Daniel Zeiberg, Shantanu Jain, Predrag Radivojac