Related papers: Leveraging Data Mining Algorithms to Recommend Sou…

Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms

Frequent itemset mining is a popular data mining technique. Apriori, Eclat, and FP-Growth are among the most common algorithms for frequent itemset mining. Considerable research has been performed to compare the relative performance between…

Databases · Computer Science 2017-02-01 Jeff Heaton

Evaluation of Frequent Itemset Mining Platforms using Apriori and FP-Growth Algorithm

With the overwhelming amount of complex and heterogeneous data pouring from any-where, any-time, and any-device, there is undeniably an era of Big Data. The emergence of the Big Data as a disruptive technology for next generation of…

Databases · Computer Science 2019-03-01 Ravi Ranjan, Aditi Sharma

Learning and Suggesting Source Code Changes from Version History: A Systematic Review

Context: Software systems are in continuous evolution through source code changes that fix bugs, add new functionality, and improve the internal architecture. All these practices are recorded in the version history, which can be…

Software Engineering · Computer Science 2020-01-17 Leandro Ungari Cayres, Bruno Santos de Lima, Rogério Eduardo Garcia

Foundation for Frequent Pattern Mining Algorithms Implementation

As with the development of the IT technologies, the amount of accumulated data is also increasing. Thus the role of data mining comes into picture. Association rule mining becomes one of the significant responsibilities of descriptive…

Databases · Computer Science 2014-02-11 Prof. Paresh Tanna, Dr. Yogesh Ghodasara

Data mining has been widely recognized as a powerful tool to explore added value from large-scale databases. Finding frequent item sets in databases is a crucial in data mining process of extracting association rules. Many algorithms were…

Databases · Computer Science 2010-03-23 M. S. Danessh, C. Balasubramanian, K. Duraiswamy

RDD-Eclat: Approaches to Parallelize Eclat Algorithm on Spark RDD Framework

Initially, a number of frequent itemset mining (FIM) algorithms have been designed on the Hadoop MapReduce, a distributed big data processing framework. But, due to heavy disk I/O, MapReduce is found to be inefficient for such highly…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-12-16 Pankaj Singh, Sudhakar Singh, P. K. Mishra, Rakhi Garg

DiffSearch: A Scalable and Precise Search Engine for Code Changes

The source code of successful projects is evolving all the time, resulting in hundreds of thousands of code changes stored in source code repositories. This wealth of data can be useful, e.g., to find changes similar to a planned code…

Software Engineering · Computer Science 2022-11-01 Luca Di Grazia, Paul Bredl, Michael Pradel

A Systematic Review of Automated Query Reformulations in Source Code Search

Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query.…

Software Engineering · Computer Science 2023-06-12 Mohammad Masudur Rahman, Chanchal K. Roy

How Do Code Changes Evolve in Different Platforms? A Mining-based Investigation

Code changes are performed differently in the mobile and non-mobile platforms. Prior work has investigated the differences in specific platforms. However, we still lack a deeper understanding of how code changes evolve across different…

Software Engineering · Computer Science 2019-10-28 Markos Viggiato, Johnatan Oliveira, Eduardo Figueiredo, Pooyan Jamshidi, Christian Kästner

Modified Apriori Graph Algorithm for Frequent Pattern Mining

Web Usage Mining is an application of Data Mining Techniques to discover interesting usage patterns from web data in order to understand and better serve the needs of web-based applications. The paper proposes an algorithm for finding these…

Artificial Intelligence · Computer Science 2018-05-01 Pritish Yuvraj, Suneetha K. R

An Enhanced Apriori Algorithm for Discovering Frequent Patterns with Optimal Number of Scans

Data mining is wide spreading its applications in several areas. There are different tasks in mining which provides solutions for wide variety of problems in order to discover knowledge. Among those tasks association mining plays a pivotal…

Databases · Computer Science 2015-06-24 Sudhir Tirumalasetty, Aruna Jadda, Sreenivasa Reddy Edara

The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring

Refactoring is the process of changing the internal structure of software to improve its quality without modifying its external behavior. Empirical studies have repeatedly shown that refactoring has a positive impact on the…

Software Engineering · Computer Science 2020-09-14 Maurício Aniche, Erick Maziero, Rafael Durelli, Vinicius Durelli

Identifying Source Code File Experts

In software development, the identification of source code file experts is an important task. Identifying these experts helps to improve software maintenance and evolution activities, such as developing new features, code reviews, and bug…

Software Engineering · Computer Science 2022-08-17 Otávio Cury, Guilherme Avelino, Pedro Santos Neto, Ricardo Britto, Marco Túlio Valente

RDD-Eclat: Approaches to Parallelize Eclat Algorithm on Spark RDD Framework (Extended Version)

Frequent itemset mining (FIM) is a highly computational and data intensive algorithm. Therefore, parallel and distributed FIM algorithms have been designed to process large volume of data in a reduced time. Recently, a number of FIM…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-26 Pankaj Singh, Sudhakar Singh, P K Mishra, Rakhi Garg

Toward Interactive Optimization of Source Code Differences: An Empirical Study of Its Performance

A source code difference (diff) indicates changes made by comparing new and old source codes, and it can be utilized in code reviews to help developers understand the changes made to the code. Although many diff generation methods have been…

Software Engineering · Computer Science 2024-09-27 Tsukasa Yagi, Shinpei Hayashi

Oops!... I did it again. Conclusion (In-)Stability in Quantitative Empirical Software Engineering: A Large-Scale Analysis

Context: Mining software repositories is a popular means to gain insights into a software project's evolution, monitor project health, support decisions and derive best practices. Tools supporting the mining process are commonly applied by…

Software Engineering · Computer Science 2025-11-13 Nicole Hoess, Carlos Paradis, Rick Kazman, Wolfgang Mauerer

A Data Set of Generalizable Python Code Change Patterns

Mining repetitive code changes from version control history is a common way of discovering unknown change patterns. Such change patterns can be used in code recommender systems or automated program repair techniques. While there are such…

Software Engineering · Computer Science 2023-04-12 Akalanka Galappaththi, Sarah Nadi

The Application of Data Mining in the Production Processes

Traditional statistical and measurements are unable to solve all industrial data in the right way and appropriate time. Open markets mean the customers are increased, and production must increase to provide all customer requirements.…

General Economics · Economics 2020-11-26 Hamza Saad

Review of Apriori Based Algorithms on MapReduce Framework

The Apriori algorithm that mines frequent itemsets is one of the most popular and widely used data mining algorithms. Nowadays many algorithms have been proposed on parallel and distributed platforms to enhance the performance of Apriori…

Databases · Computer Science 2017-02-22 Sudhakar Singh, Rakhi Garg, P. K. Mishra

Using Apriori with WEKA for Frequent Pattern Mining

Knowledge exploration from the large set of data, generated as a result of the various data processing activities due to data mining only. Frequent Pattern Mining is a very important undertaking in data mining. Apriori approach applied to…

Databases · Computer Science 2014-07-01 Paresh Tanna, Yogesh Ghodasara