Evolutionary Dataset Optimisation: learning algorithm quality through evolution

Henry Wilde; Vincent Knight; Jonathan Gillard

Data Structures and Algorithms 2019-11-01 v3 Neural and Evolutionary Computing

Abstract

In this paper we propose a novel method for learning how algorithms perform. Classically, algorithms are compared on a finite number of existing (or newly simulated) benchmark datasets based on some fixed metric. The algorithm(s) with the smallest value of this metric are chosen to be the 'best performing'. We offer a new approach to flip this paradigm. We instead aim to gain a richer picture of the performance of an algorithm by generating artificial data through genetic evolution, the purpose of which is to create populations of datasets for which a particular algorithm performs well on a given metric. These datasets can be studied so as to learn what attributes lead to a particular progression of a given algorithm. Following a detailed description of the algorithm as well as a brief description of an open source implementation, a case study in clustering is presented. This case study demonstrates the performance and nuances of the method, which we call Evolutionary Dataset Optimisation. In this study, a number of known properties about preferable datasets for the clustering algorithms known as k-means and DBSCAN are realised in the generated datasets.
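
The idea of evolving datasets towards those on which an algorithm scores well can be illustrated with a small sketch. This is not the authors' implementation or the API of their open source package. It is a heavily simplified illustration under stated assumptions: every dataset is a fixed-length list of values in [0, 1], the metric is the exact inertia of a two-cluster k-means partition in one dimension (smaller is better), and the function names `two_means_inertia`, `crossover`, `mutate` and `evolve`, along with all parameter values, are hypothetical.

```python
import random
import statistics


def two_means_inertia(data):
    """Exact 1D 2-means inertia: best split of the sorted values into two groups."""
    values = sorted(data)

    def sse(group):
        mean = statistics.fmean(group)
        return sum((x - mean) ** 2 for x in group)

    return min(sse(values[:i]) + sse(values[i:]) for i in range(1, len(values)))


def crossover(parent_a, parent_b, rng):
    """Build a child by taking each entry from one of the two parents."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]


def mutate(data, rng, rate=0.1, scale=0.05):
    """Perturb each entry with probability `rate`, clamping to [0, 1]."""
    return [
        min(1.0, max(0.0, x + rng.gauss(0, scale))) if rng.random() < rate else x
        for x in data
    ]


def evolve(fitness, size=20, rows=12, generations=30, keep=0.25, seed=0):
    """Return the best fitness seen at each generation and the final population."""
    rng = random.Random(seed)
    # Assumption: datasets start as uniform random values; shape is fixed.
    population = [[rng.random() for _ in range(rows)] for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)          # smaller fitness is better
        history.append(fitness(population[0]))
        parents = population[: max(2, int(keep * size))]  # keep the fittest
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    population.sort(key=fitness)
    history.append(fitness(population[0]))
    return history, population


history, population = evolve(two_means_inertia)
print(f"best inertia: first generation {history[0]:.4f}, last {history[-1]:.4f}")
```

Because the fittest datasets are carried over unchanged, the best inertia cannot get worse between generations. Inspecting `population` afterwards is the analogue of studying the generated datasets: in this toy setting one would expect the values to gather into at most two tight groups, which is the kind of dataset on which a two-cluster k-means scores well.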

Keywords

evolutionary algorithm; algorithm selection; genetic algorithm

Cite

@article{arxiv.1907.13508,
  title  = {Evolutionary Dataset Optimisation: learning algorithm quality through evolution},
  author = {Henry Wilde and Vincent Knight and Jonathan Gillard},
  journal= {arXiv preprint arXiv:1907.13508},
  year   = {2019}
}

Comments

33 pages, 15 figures
