Related papers: Finding Good Itemsets by Packing Data

Decomposable Families of Itemsets

The problem of selecting a small yet high-quality subset of patterns from a larger collection of itemsets has recently attracted a lot of research. Here we discuss an approach to this problem using the notion of decomposable families of…

Machine Learning · Computer Science 2020-06-18 Nikolaj Tatti, Hannes Heikinheimo

Optimal Generalized Decision Trees via Integer Programming

Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if…

Machine Learning · Computer Science 2019-08-14 Oktay Gunluk, Jayant Kalagnanam, Minhan Li, Matt Menickelly, Katya Scheinberg

An Improved UP-Growth High Utility Itemset Mining

Efficient discovery of frequent itemsets in large datasets is a crucial task of data mining. In recent years, several approaches have been proposed for generating high utility patterns, but they give rise to the problem of producing a large number of…

Databases · Computer Science 2012-12-04 B. Adinarayana Reddy, O. Srinivasa Rao, M. H. M. Krishna Prasad

Big Data Classification Using Augmented Decision Trees

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble…

Machine Learning · Statistics 2017-10-27 Rajiv Sambasivan, Sourish Das

In Search of Trees: Decision-Tree Policy Synthesis for Black-Box Systems via Search

Decision trees, owing to their interpretability, are attractive as control policies for (dynamical) systems. Unfortunately, constructing, or synthesising, such policies is a challenging task. Previous approaches do so by imitating a…

Artificial Intelligence · Computer Science 2025-04-23 Emir Demirović, Christian Schilling, Anna Lukina

Discovery of Maximal Frequent Item Sets using Subset Creation

Data mining is the practice of searching large amounts of data to discover patterns. It uses mathematical algorithms to group the data and evaluate future events. Association rule mining is a research area in the field of knowledge…

Databases · Computer Science 2013-02-08 Jnanamurthy H. K.

Explainable Models via Compression of Tree Ensembles

Ensemble models (bagging and gradient-boosting) of relational decision trees have proved to be one of the most effective learning methods in the area of probabilistic logic models (PLMs). While effective, they lose one of the most important…

Machine Learning · Computer Science 2022-06-17 Siwen Yan, Sriraam Natarajan, Saket Joshi, Roni Khardon, Prasad Tadepalli

On Computing Compression Trees for Data Collection in Sensor Networks

We address the problem of efficiently gathering correlated data from a wired or a wireless sensor network, with the aim of designing algorithms with provable optimality guarantees, and understanding how close we can get to the known…

Networking and Internet Architecture · Computer Science 2009-08-03 Jian Li, Amol Deshpande, Samir Khuller

Handling Missing Data in Decision Trees: A Probabilistic Approach

Decision trees are a popular family of models due to their attractive properties such as interpretability and ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders performance of machine…

Machine Learning · Computer Science 2020-07-01 Pasha Khosravi, Antonio Vergari, YooJung Choi, Yitao Liang, Guy Van den Broeck

Dive into Decision Trees and Forests: A Theoretical Demonstration

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and…

Machine Learning · Computer Science 2021-01-22 Jinxiong Zhang

Minimally Infrequent Itemset Mining using Pattern-Growth Paradigm and Residual Trees

Itemset mining has been an active area of research due to its successful application in various data mining scenarios including finding association rules. Though most of the past work has been on finding frequent itemsets, infrequent…

Databases · Computer Science 2012-07-23 Ashish Gupta, Akshay Mittal, Arnab Bhattacharya

Model family selection for classification using Neural Decision Trees

Model selection consists in comparing several candidate models according to a metric to be optimized. The process often involves a grid search, or such, and cross-validation, which can be time consuming, as well as not providing much…

Machine Learning · Computer Science 2020-06-23 Anthea Mérida Montes de Oca, Argyris Kalogeratos, Mathilde Mougeot

Interactive Set Discovery

We study the problem of set discovery where given a few example tuples of a desired set, we want to find the set in a collection of sets. A challenge is that the example tuples may not uniquely identify a set, and a large number of…

Databases · Computer Science 2022-10-05 Arif Hasnat, Davood Rafiei

A New Method for Classification of Datasets for Data Mining

The decision tree is an important method for both induction research and data mining, mainly used for model classification and prediction. ID3 is the most widely used decision tree algorithm so far. In this paper, the…

Machine Learning · Computer Science 2016-12-02 Singh Vijendra, Hemjyotsana Parashar, Nisha Vasudeva

On Computing Optimal Tree Ensembles

Random forests and, more generally, (decision-)tree ensembles are widely used methods for classification and regression. Recent algorithmic advances allow computing decision trees that are optimal for various measures such as…

Machine Learning · Computer Science 2024-09-25 Christian Komusiewicz, Pascal Kunz, Frank Sommer, Manuel Sorge

Reinforced Decision Trees

In order to speed up classification models when facing a large number of categories, one usual approach consists in organizing the categories in a particular structure, which is then used as a way to speed up the prediction…

Machine Learning · Computer Science 2015-11-26 Aurélia Léon, Ludovic Denoyer

BEST : A decision tree algorithm that handles missing values

The main contribution of this paper is the development of a new decision tree algorithm. The proposed approach allows users to guide the algorithm through the data partitioning process. We believe this feature has many applications but in…

Machine Learning · Statistics 2020-10-27 Cédric Beaulac, Jeffrey S. Rosenthal

Active Learning Meets Optimized Item Selection

Designing recommendation systems with limited or no available training data remains a challenge. To that end, a new combinatorial optimization problem is formulated to generate optimized item selection for experimentation with the goal to…

Information Retrieval · Computer Science 2021-12-07 Bernard Kleynhans, Xin Wang, Serdar Kadıoğlu

Learning a Decision Tree Algorithm with Transformers

Decision trees are renowned for their ability to achieve high predictive performance while remaining interpretable, especially on tabular data. Traditionally, they are constructed through recursive algorithms, where they partition the data…

Machine Learning · Computer Science 2024-08-27 Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao

Maximal frequent itemset generation using segmentation approach

Finding frequent itemsets in a data source is a fundamental operation behind Association Rule Mining. Generally, many algorithms use either the bottom-up or top-down approaches for finding these frequent itemsets. When the length of…

Databases · Computer Science 2011-09-13 M. Rajalakshmi, Dr. T. Purusothaman, Dr. R. Nedunchezhian