Related papers: Permutation Decision Trees
Decision trees are popular classification models, providing high accuracy and intuitive explanations. However, as the tree size grows the model interpretability deteriorates. Traditional tree-induction algorithms, such as C4.5 and CART,…
We introduce the Integrated Tsallis Combination (ITC), a hybrid impurity measure for decision tree learning that combines normalized Tsallis entropy with an exponential polarization component. While many existing measures sacrifice…
Decision trees and their ensembles are popular in machine learning as easy-to-understand models. Several techniques have been proposed in the literature for learning tree-based classifiers, with different techniques working well for data…
Decision trees and random forest remain highly competitive for classification on medium-sized, standard datasets due to their robustness, minimal preprocessing requirements, and interpretability. However, a single tree suffers from high…
Customer retention campaigns increasingly rely on predictive models to detect potential churners in a vast customer base. From the perspective of machine learning, the task of predicting customer churn can be presented as a binary…
Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For binary classification and regression models, this approach recursively divides the data into two near-homogenous…
Several studies have shown that combining machine learning models in an appropriate way will introduce improvements in the individual predictions made by the base models. The key to make well-performing ensemble model is in the diversity of…
Decision trees are a commonly used class of machine learning models valued for their interpretability and versatility, capable of both classification and regression. We propose ZTree, a novel decision tree learning framework that replaces…
Permutation entropy quantifies the diversity of possible orderings of the values a random or deterministic system can take, as Shannon entropy quantifies the diversity of values. We show that the metric and permutation entropy…
The prevailing mindset is that a single decision tree underperforms classic random forests in testing accuracy, despite its advantages in interpretability and lightweight structure. This study challenges such a mindset by significantly…
Measuring the complexity of tree structures can be beneficial in areas that use tree data structures for storage, communication, and processing purposes. This complexity can then be used to compress tree data structures to their…
How to obtain a model with good interpretability and performance has always been an important research topic. In this paper, we propose rectified decision trees (ReDT), a knowledge distillation based decision trees rectification with high…
Computers are deterministic dynamical systems (CHAOS 19:033124, 2009). Among other things, that implies that one should be able to use deterministic forecast rules to predict their behavior. That statement is sometimes-but not always-true.…
Decision trees are widely used for non-linear modeling, as they capture interactions between predictors while producing inherently interpretable models. Despite their popularity, performing inference on the non-linear fit remains largely…
We address the problem of summarizing embedded tree patterns extracted from large data trees. We do so by defining and mining closed and maximal embedded unordered tree patterns from a single large data tree. We design an embedded frequent…
Decision trees have been widely used as classifiers in many machine learning applications thanks to their lightweight and interpretable decision process. This paper introduces Tree in Tree decision graph (TnT), a framework that extends the…
Multi-party training frameworks for decision trees based on secure multi-party computation enable multiple parties to train high-performance models on distributed private data with privacy preservation. The training process essentially…
Fine-tuning pretrained contextual word embedding models to supervised downstream tasks has become commonplace in natural language processing. This process, however, is often brittle: even with the same hyperparameter values, distinct random…
The rise of machine learning methods on heavily resource constrained devices requires not only the choice of a suitable model architecture for the target platform, but also the optimization of the chosen model with regard to execution time…
Context: Mutation Testing (MT) is an important tool in traditional Software Engineering (SE) white-box testing. It aims to artificially inject faults in a system to evaluate a test suite's capability to detect them, assuming that the test…