Related papers: Data quality measurement on categorical data using…
With the growing size of data sets, feature selection becomes increasingly important. Taking interactions of original features into consideration will lead to extremely high dimension, especially when the features are categorical and…
In general, frequent itemsets are generated from large data sets by applying association rule mining algorithms like Apriori, Partition, Pincer-Search, Incremental, the Border algorithm, etc., which take too much computer time to compute all the…
The research community continues to seek increasingly more advanced synthetic data generators to reliably evaluate the strengths and limitations of machine learning methods. This work aims to increase the availability of datasets…
Genetic algorithms are a powerful tool in optimization for single and multi-modal functions. This paper provides an overview of their fundamentals with some analytical examples. In addition, we explore how they can be used as a parameter…
In general, we cannot use algebraic or enumerative methods to optimize a quality control (QC) procedure so as to detect the critical random and systematic analytical errors with stated probabilities, while the probability for false…
Analyzing large datasets to select optimal features is one of the most important research areas in machine learning and data mining. This feature selection procedure involves dimensionality reduction which is crucial in enhancing the…
This thesis investigates the use of problem-specific knowledge to enhance a genetic algorithm approach to multiple-choice optimisation problems. It shows that such information can significantly enhance performance, but that the choice of…
The talk describes a general approach of a genetic algorithm for multiple objective optimization problems. A particular dominance relation between the individuals of the population is used to define a fitness operator, enabling the genetic…
Multi-model inference covers a wide range of modern statistical applications such as variable selection, model confidence set, model averaging and variable importance. The performance of multi-model inference depends on the availability of…
When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover…
Cellular automata are discrete and computational models that can be shown as general models of complexity. They are used in varied applications to derive the generalized behavior of the presented model. In this paper we have taken one such…
Software Testing is a process to identify the quality and reliability of software, which can be achieved through the help of proper test data. However, doing this manually is a difficult task due to the presence of a number of predicate nodes…
Over the years, data mining has attracted most of the attention from the research community. The researchers attempt to develop faster, more scalable algorithms to navigate over the ever increasing volumes of spatial gene expression data in…
The selection of features is an essential data preprocessing stage in data mining. The core principle of feature selection seems to be to pick a subset of possible features by excluding features with almost no predictive information as well…
Genetic algorithms are heuristic optimization techniques inspired by Darwinian evolution, which are characterized by successfully finding robust solutions for optimization problems. Here, we propose a subroutine-based quantum genetic…
The integration of advanced technologies, such as Artificial Intelligence (AI), into manufacturing processes is attracting significant attention, paving the way for the development of intelligent systems that enhance efficiency and…
Association rule mining is one of the most important fields in data mining and knowledge discovery. This paper proposes an algorithm that combines the simple association rules derived from the basic Apriori algorithm with the multiple minimum…
We propose and apply a novel paradigm for characterization of genome data quality, which quantifies the effects of intentional degradation of quality. The rationale is that the higher the initial quality, the more fragile the genome and the…
An approach to the classification problem of machine learning, based on building local classification rules, is developed. The local rules are considered as projections of the global classification rules to the event we want to classify. A…
Numerical Association Rule Mining is a popular variant of Association Rule Mining, where numerical attributes are handled without discretization. This means that the algorithms for dealing with this problem can operate directly, not only…