Related papers: A Fast Minimal Infrequent Itemset Mining Algorithm

Fast Algorithms for Mining Interesting Frequent Itemsets without Minimum Support

Real-world datasets are sparse, dirty and contain hundreds of items. In such situations, discovering interesting rules (results) using the traditional frequent itemset mining approach of specifying a user-defined input support threshold is not…

Databases · Computer Science 2009-04-22 Shariq Bashir, Zahoor Jan, Abdul Rauf Baig

Fast Randomized Subspace System Identification for Large I/O Data

In this article, a novel fast randomized subspace system identification method for estimating combined deterministic-stochastic LTI state-space models is proposed. The algorithm is especially well-suited to identify high-order and…

Systems and Control · Electrical Eng. & Systems 2023-12-12 Vatsal Kedia, Debraj Chakraborty

DiffNodesets: An Efficient Structure for Fast Mining Frequent Itemsets

Mining frequent itemsets is an essential problem in data mining and plays an important role in many data mining applications. In recent years, some itemset representations based on node sets have been proposed, which have been shown to be very…

Data Structures and Algorithms · Computer Science 2018-01-12 Zhi-Hong Deng

An Efficient Rigorous Approach for Identifying Statistically Significant Frequent Itemsets

As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In…

Databases · Computer Science 2010-02-08 Adam Kirsch, Michael Mitzenmacher, Andrea Pietracaprina, Geppino Pucci, Eli Upfal, Fabio Vandin

Mining Frequent Itemsets from Secondary Memory

Mining frequent itemsets is at the core of mining association rules, and is by now quite well understood algorithmically. However, most algorithms for mining frequent itemsets assume that the main memory is large enough for the data…

Databases · Computer Science 2016-08-16 Gösta Grahne, Jianfei Zhu

Parallel algorithms for mining of frequent itemsets

In the last decade, companies started collecting large amounts of data. Without proper analysis, the data are usually useless. The field of analysing the data is called data mining. Unfortunately, the amount of data is quite large: the…

Databases · Computer Science 2021-08-12 Robert Kessl

A Fast Greedy Algorithm for Outlier Mining

The task of outlier detection is to find small groups of data objects that are exceptional when compared with the remaining large amount of data. In [38], the problem of outlier detection in categorical data is defined as an optimization problem and…

Databases · Computer Science 2007-05-23 Zengyou He, Xiaofei Xu, Shengchun Deng

Finding the True Frequent Itemsets

Frequent Itemsets (FIs) mining is a fundamental primitive in data mining. It requires identifying all itemsets appearing in at least a fraction $\theta$ of a transactional dataset $\mathcal{D}$. Often though, the ultimate goal of mining…

Machine Learning · Computer Science 2014-01-23 Matteo Riondato, Fabio Vandin

Discovery of Maximal Frequent Item Sets using Subset Creation

Data mining is the practice of searching large amounts of data to discover data patterns. Data mining uses mathematical algorithms to group the data and evaluate future events. Association rules are a research area in the field of knowledge…

Databases · Computer Science 2013-02-08 Jnanamurthy H. K.

An Efficient Genetic Algorithm for Discovering Diverse-Frequent Patterns

Working with exhaustive search on large datasets is infeasible for several reasons. Recently developed techniques have made pattern set mining feasible by a general solver with long execution time that supports heuristic search and are…

Artificial Intelligence · Computer Science 2015-07-21 Shanjida Khatun, Hasib Ul Alam, Swakkhar Shatabda

HybridMiner: Mining Maximal Frequent Itemsets Using Hybrid Database Representation Approach

In this paper we present a novel hybrid (array-based layout and vertical bitmap layout) database representation approach for mining the complete Maximal Frequent Itemsets (MFI) on sparse and large datasets. Our work is novel in terms of…

Databases · Computer Science 2016-11-17 Shariq Bashir, Abdul Rauf Baig

Efficient indexing and searching of high dimensional data has been an area of active research due to the growing exploitation of high dimensional data and the vulnerability of traditional search methods to the curse of dimensionality. This…

Information Retrieval · Computer Science 2015-05-13 Yu Zhong

Maximal frequent itemset generation using segmentation approach

Finding frequent itemsets in a data source is a fundamental operation behind Association Rule Mining. Generally, many algorithms use either bottom-up or top-down approaches for finding these frequent itemsets. When the length of…

Databases · Computer Science 2011-09-13 M. Rajalakshmi, Dr. T. Purusothaman, Dr. R. Nedunchezhian

Pattern Detection with Rare Item-set Mining

The discovery of new and interesting patterns in large datasets, known as data mining, draws more and more interest as the quantities of available data are exploding. Data mining techniques may be applied to different domains and fields…

Software Engineering · Computer Science 2012-09-17 Mehdi Adda, Lei Wu, Sharon White, Yi Feng

An efficient mining scheme for high utility itemsets

Knowledge discovery in databases aims at finding useful information, which can be deployed for decision making. The problem of high utility itemset mining has specifically garnered huge research focus in the past decade, as it aims to find…

Databases · Computer Science 2023-08-30 Pushp, Satish Chand

Frequent Itemset Mining with Multiple Minimum Supports: a Constraint-based Approach

The problem of discovering frequent itemsets including rare ones has received a great deal of attention. The mining process needs to be flexible enough to extract frequent and rare regularities at once. On the other hand, it has recently…

Artificial Intelligence · Computer Science 2021-09-17 Mohamed-Bachir Belaid, Nadjib Lazaar

Indexing Schemes for Similarity Search In Datasets of Short Protein Fragments

We propose a family of very efficient hierarchical indexing schemes for ungapped, score matrix-based similarity search in large datasets of short (4-12 amino acid) protein fragments. This type of similarity search has importance in both…

Data Structures and Algorithms · Computer Science 2007-09-04 Aleksandar Stojmirovic, Vladimir Pestov

Incremental Mining of Frequent Serial Episodes Considering Multiple Occurrences

The need to analyze information from streams arises in a variety of applications. One of its fundamental research directions is to mine sequential patterns over data streams. Current studies mine series of items based on the presence of the…

Databases · Computer Science 2022-04-12 Thomas Guyet, Wenbin Zhang, Albert Bifet

Frequent Itemset Mining using QUBO

In this paper we propose an R-step approximation to solve frequent itemset mining on quantum hardware like quantum annealing or QAOA. The idea is to search for the set of items where the minimal 2-item frequency is maximal. This can be…

Databases · Computer Science 2022-11-29 Jonas Nüßlein

Fast Combinatorial Algorithms for Min Max Correlation Clustering

We introduce fast algorithms for correlation clustering with respect to the Min Max objective that provide constant factor approximations on complete graphs. Our algorithms are the first purely combinatorial approximation algorithms for…

Data Structures and Algorithms · Computer Science 2023-01-31 Sami Davies, Benjamin Moseley, Heather Newman