Related papers: Learning Expressive Linkage Rules using Genetic Pr…
Entity matching is the task of linking records from different sources that refer to the same real-world entity. Past work has primarily treated entity linking as a standard supervised learning problem. However, supervised entity matching…
Data-efficient image classification is a challenging task that aims to solve image classification using small training data. Neural network-based deep learning methods are effective for image classification, but they typically require…
Gene expression programming, a genotype/phenotype genetic algorithm (linear and ramified), is presented here for the first time as a new technique for the creation of computer programs. Gene expression programming uses character linear…
Clustering is a difficult and widely-studied data mining task, with many varieties of clustering algorithms proposed in the literature. Nearly all algorithms use a similarity measure such as a distance metric (e.g. Euclidean distance) to…
The research community continues to seek increasingly more advanced synthetic data generators to reliably evaluate the strengths and limitations of machine learning methods. This work aims to increase the availability of datasets…
Gene expression programming is an evolutionary optimization algorithm with the potential to generate interpretable and easily implementable equations for regression problems. Despite knowledge gained from previous optimizations being…
Neural network models of real-world systems, such as industrial processes, made from sensor data must often rely on incomplete data. System states may not all be known, sensor data may be biased or noisy, and it is not often known which…
Our main contribution is to show that the behaviour of kernels across multiple layers of a convolutional neural network can be approximated using a logic program. The extracted logic programs yield accuracies that correlate with those of…
Our goal in this paper is to automatically extract a set of decision rules (rule set) that best explains a classification data set. First, a large set of decision rules is extracted from a set of decision trees trained on the data set. The…
Many machine learning models have been proposed to classify phenotypes from gene expression data. In addition to their good performance, these models can potentially provide some understanding of phenotypes by extracting explanations for…
Entity matching is the problem of identifying which records refer to the same real-world entity. It has been actively researched for decades, and a variety of different approaches have been developed. Even today, it remains a challenging…
Gene finding is the task of identifying the locations of coding sequences within the vast amount of genetic code contained in the genome. With an ever increasing quantity of raw genome sequences, gene finding is an important avenue towards…
A deep generative model such as a GAN learns to model a rich set of semantic and physical rules about the target distribution, but up to now, it has been obscure how such rules are encoded in the network, or how a rule could be changed. In…
Transfer learning has emerged as a powerful technique in many application problems, such as computer vision and natural language processing. However, this technique is largely ignored in application to genetic data analysis. In this paper,…
Discovery gene-disease links is important in biology and medicine areas, enabling disease identification and drug repurposing. Machine learning approaches accelerate this process by leveraging biological knowledge represented in ontologies…
Entity matching (EM) is a critical task in data integration, aiming to identify records across different datasets that refer to the same real-world entities. Traditional methods often rely on manually engineered features and rule-based…
Entity Linking in natural language processing seeks to match text entities to their corresponding entries in a dictionary or knowledge base. Traditional approaches rely on contextual models, which can be complex, hard to train, and have…
Evolutionary computation methods have been successfully applied to neural networks since two decades ago, while those methods cannot scale well to the modern deep neural networks due to the complicated architectures and large quantities of…
A key challenge in entity linking is making effective use of contextual information to disambiguate mentions that might refer to different entities in different contexts. We present a model that uses convolutional neural networks to capture…
Entity resolution (ER) is the process of identifying records that refer to the same entities within one or across multiple databases. Numerous techniques have been developed to tackle ER challenges over the years, with recent emphasis…