Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality

Neural and Evolutionary Computing 2020-01-31 v1 Machine Learning

Abstract

Manifold learning techniques have become increasingly valuable as datasets continue to grow in size. By discovering a lower-dimensional representation (embedding) of the structure of a dataset, manifold learning algorithms can substantially reduce its dimensionality while preserving as much information as possible. However, state-of-the-art manifold learning algorithms are opaque in how they perform this transformation. Understanding how the embedding relates to the original high-dimensional space is critical in exploratory data analysis. We previously proposed a Genetic Programming method that performed manifold learning by evolving mappings that are transparent and interpretable. That method required the dimensionality of the embedding to be known a priori, which makes it hard to use when little is known about a dataset. In this paper, we substantially extend our previous work by introducing a multi-objective approach that automatically balances the competing objectives of manifold quality and dimensionality. Our proposed approach is competitive with a range of baseline and state-of-the-art manifold learning methods, while also providing a range (front) of solutions that give different trade-offs between quality and dimensionality. Furthermore, the learned models are often shown to be simple and efficient, utilising only a small number of features in an interpretable manner.
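To make the idea of a front of trade-off solutions concrete, the following minimal sketch filters a set of candidate embeddings down to the non-dominated ones. It is an illustration only, not the authors' method: the `(cost, dims)` representation, the function names, and all numbers are hypothetical, and it assumes that manifold quality is expressed as a cost to be minimised alongside the embedding dimensionality.

```python
from typing import List, Tuple

# Hypothetical representation: (quality cost, embedding dimensionality).
# Both objectives are assumed to be minimised.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates, sorted by dimensionality."""
    front = [c for c in candidates
             if not any(dominates(other, c) for other in candidates)]
    return sorted(set(front), key=lambda c: c[1])


# Made-up example values, purely for illustration.
candidates = [(0.40, 2), (0.35, 3), (0.50, 3), (0.20, 5), (0.25, 7), (0.40, 4)]
front = pareto_front(candidates)
print(front)
```

Each surviving pair represents a different compromise: fewer dimensions at a higher quality cost, or the reverse. A user exploring an unfamiliar dataset could inspect such a front instead of fixing the embedding dimensionality in advance.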

Cite

@article{arxiv.2001.01331,
  title   = {Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality},
  author  = {Andrew Lensen and Mengjie Zhang and Bing Xue},
  journal = {arXiv preprint arXiv:2001.01331},
  year    = {2020}
}

Comments

31 pages, pre-print accepted by Genetic Programming and Evolvable Machines journal