Related papers: HybridMiner: Mining Maximal Frequent Itemsets Usin…

Ramp: Fast Frequent Itemset Mining with Efficient Bit-Vector Projection Technique

Mining frequent itemset using bit-vector representation approach is very efficient for dense type datasets, but highly inefficient for sparse datasets due to lack of any efficient bit-vector projection technique. In this paper we present a…

Databases · Computer Science 2009-04-22 Shariq Bashir, Abdul Rauf Baig

DiffNodesets: An Efficient Structure for Fast Mining Frequent Itemsets

Mining frequent itemsets is an essential problem in data mining and plays an important role in many data mining applications. In recent years, some itemset representations based on node sets have been proposed, which have shown to be very…

Data Structures and Algorithms · Computer Science 2018-01-12 Zhi-Hong Deng

FastLMFI: An Efficient Approach for Local Maximal Patterns Propagation and Maximal Patterns Superset Checking

Maximal frequent patterns superset checking plays an important role in the efficient mining of complete Maximal Frequent Itemsets (MFI) and maximal search space pruning. In this paper we present a new indexing approach, FastLMFI for local…

Databases · Computer Science 2016-11-17 Shariq Bashir, Abdul Rauf Baig

An efficient mining scheme for high utility itemsets

Knowledge discovery in databases aims at finding useful information, which can be deployed for decision making. The problem of high utility itemset mining has specifically garnered huge research focus in the past decade, as it aims to find…

Databases · Computer Science 2023-08-30 Pushp, Satish Chand

HI-Series Algorithms A Hybrid of Substance Diffusion Algorithm and Collaborative Filtering

Recommendation systems face the challenge of balancing accuracy and diversity, as traditional collaborative filtering (CF) and network-based diffusion algorithms exhibit complementary limitations. While item-based CF (ItemCF) enhances…

Information Retrieval · Computer Science 2025-03-04 Yu Peng, Ya-Hui An

Efficient Inner Product Approximation in Hybrid Spaces

Many emerging use cases of data mining and machine learning operate on large datasets with data from heterogeneous sources, specifically with both sparse and dense components. For example, dense deep neural network embedding vectors are…

Machine Learning · Computer Science 2019-03-22 Xiang Wu, Ruiqi Guo, David Simcha, Dave Dopson, Sanjiv Kumar

A Fast Minimal Infrequent Itemset Mining Algorithm

A novel fast algorithm for finding quasi identifiers in large datasets is presented. Performance measurements on a broad range of datasets demonstrate substantial reductions in run-time relative to the state of the art and the scalability…

Databases · Computer Science 2014-10-17 Kostyantyn Demchuk, Douglas J. Leith

Fast Algorithms for Mining Interesting Frequent Itemsets without Minimum Support

Real world datasets are sparse, dirty and contain hundreds of items. In such situations, discovering interesting rules (results) using traditional frequent itemset mining approach by specifying a user defined input support threshold is not…

Databases · Computer Science 2009-04-22 Shariq Bashir, Zahoor Jan, Abdul Rauf Baig

Higher Order Mutual Information Approximation for Feature Selection

Feature selection is a process of choosing a subset of relevant features so that the quality of prediction models can be improved. An extensive body of work exists on information-theoretic feature selection, based on maximizing Mutual…

Machine Learning · Computer Science 2016-12-05 Jilin Wu, Soumyajit Gupta, Chandrajit Bajaj

Max-Min Diversification with Fairness Constraints: Exact and Approximation Algorithms

Diversity maximization aims to select a diverse and representative subset of items from a large dataset. It is a fundamental optimization task that finds applications in data summarization, feature selection, web search, recommender…

Data Structures and Algorithms · Computer Science 2023-04-27 Yanhao Wang, Michael Mathioudakis, Jia Li, Francesco Fabbri

Discovery of Maximal Frequent Item Sets using Subset Creation

Data mining is the practice to search large amount of data to discover data patterns. Data mining uses mathematical algorithms to group the data and evaluate the future events. Association rule is a research area in the field of knowledge…

Databases · Computer Science 2013-02-08 Jnanamurthy H. K.

Multi-Sorted Inverse Frequent Itemsets Mining

The development of novel platforms and techniques for emerging "Big Data" applications requires the availability of real-life datasets for data-driven experiments, which are however out of reach for academic research in most cases as they…

Databases · Computer Science 2013-10-16 Domenico Sacca', Edoardo Serra, Pietro Dicosta, Antonio Piccolo

Incremental high average-utility itemset mining: survey and challenges

The High Average Utility Itemset Mining (HAUIM) technique, a variation of High Utility Itemset Mining (HUIM), uses the average utility of the itemsets. Historically, most HAUIM algorithms were designed for static databases. However,…

Databases · Computer Science 2024-07-17 Jing Chen, Shengyi Yang, Weiping Ding, Peng Li, Aijun Liu, Hongjun Zhang, Tian Li

The Hybrid Multimodal Graph Index (HMGI): A Comprehensive Framework for Integrated Relational and Vector Search

The proliferation of complex, multimodal datasets has exposed a critical gap between the capabilities of specialized vector databases and traditional graph databases. While vector databases excel at semantic similarity search, they lack the…

Databases · Computer Science 2025-10-14 Joydeep Chandra, Satyam Kumar Navneet, Yong Zhang

Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion

Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels. Despite its advancements, the field grapples with challenges, notably the reliance…

Computer Vision and Pattern Recognition · Computer Science 2023-09-06 Guanyao Wu, Hongming Fu, Jinyuan Liu, Long Ma, Xin Fan, Risheng Liu

A one-phase tree-based algorithm for mining high-utility itemsets from a transaction database

High-utility itemset mining finds itemsets from a transaction database with utility no less than a fixed user-defined threshold. The utility of an itemset is defined as the sum of the utilities of its item. Several algorithms were proposed…

Data Structures and Algorithms · Computer Science 2019-11-19 Siddharth Dawar, Vikram Goyal, Debajyoti Bera

A novel approach for fast mining frequent itemsets use N-list structure based on MapReduce

Frequent Pattern Mining is one of the most significant topics in data mining. In recent years, many algorithms have been proposed for mining frequent itemsets. A new algorithm has been presented for mining frequent itemsets based on…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-23 Arkan A. G. Al-Hamodi, Songfeng Lu

Unsupervised Deep Hashing for Large-scale Visual Search

Learning based hashing plays a pivotal role in large-scale visual search. However, most existing hashing algorithms tend to learn shallow models that do not seek representative binary codes. In this paper, we propose a novel hashing…

Computer Vision and Pattern Recognition · Computer Science 2018-04-26 Zhaoqiang Xia, Xiaoyi Feng, Jinye Peng, Abdenour Hadid

Improved Search in Hamming Space using Deep Multi-Index Hashing

Similarity-preserving hashing is a widely-used method for nearest neighbour search in large-scale image retrieval tasks. There has been considerable research on generating efficient image representation via the deep-network-based hashing…

Computer Vision and Pattern Recognition · Computer Science 2017-10-20 Hanjiang Lai, Yan Pan

Mining Frequent Itemsets from Secondary Memory

Mining frequent itemsets is at the core of mining association rules, and is by now quite well understood algorithmically. However, most algorithms for mining frequent itemsets assume that the main memory is large enough for the data…

Databases · Computer Science 2016-08-16 Gösta Grahne, Jianfei Zhu