Least Squares Methods for Equidistant Tree Reconstruction
Abstract
UPGMA is a heuristic method for identifying the least squares equidistant phylogenetic tree given empirical distance data among $n$ taxa. We study this classic algorithm using the geometry of the space of all equidistant trees with $n$ leaves, also known as the Bergman complex of the graphical matroid for the complete graph $K_n$. We show that UPGMA performs an orthogonal projection of the data onto a maximal cell of the Bergman complex. We also show that the equidistant tree with the least (Euclidean) distance from the data is obtained from such an orthogonal projection, but is not necessarily the one given by UPGMA. Using this geometric information we give an extension of the UPGMA algorithm. We also present a branch and bound method for finding the best equidistant tree. Finally, we prove that there are distance data among $n$ taxa which project to at least equidistant trees.
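For readers unfamiliar with the algorithm, the following is a minimal sketch of the textbook UPGMA procedure, not code from the paper. It assumes the input is a symmetric distance matrix given as a list of lists; the function name `upgma` and the choice to return the resulting equidistant (ultrametric) distance matrix are illustrative assumptions. Each merge assigns to every cross-cluster pair the average of the original distances between the two clusters, which is the averaging step that the projection interpretation above refers to.

```python
from itertools import combinations


def upgma(d):
    """Return the ultrametric distance matrix built by textbook UPGMA.

    `d` is assumed to be a symmetric n x n matrix (list of lists)
    with zeros on the diagonal. Names here are illustrative only.
    """
    n = len(d)
    clusters = {i: [i] for i in range(n)}          # cluster id -> leaf list
    # Inter-cluster distances, keyed by (smaller id, larger id).
    dist = {(i, j): float(d[i][j]) for i, j in combinations(range(n), 2)}
    u = [[0.0] * n for _ in range(n)]              # output ultrametric
    next_id = n

    while len(clusters) > 1:
        # Pick the closest pair of clusters (ties broken by dict order).
        (a, b), h = min(dist.items(), key=lambda kv: kv[1])

        # All leaf pairs split between a and b receive the merge distance.
        for i in clusters[a]:
            for j in clusters[b]:
                u[i][j] = u[j][i] = h

        na, nb = len(clusters[a]), len(clusters[b])
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]

        # Keep distances between untouched clusters.
        new_dist = {
            (p, q): v
            for (p, q), v in dist.items()
            if p not in (a, b) and q not in (a, b)
        }
        # Size-weighted average gives the distance to the merged cluster.
        for c in clusters:
            dac = dist[(min(a, c), max(a, c))]
            dbc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)

        clusters[next_id] = merged
        dist = new_dist
        next_id += 1

    return u
```

For example, on the three-taxon input `[[0, 2, 6], [2, 0, 8], [6, 8, 0]]` the sketch first joins taxa 0 and 1 at distance 2, then joins the result with taxon 2 at the averaged distance (6 + 8) / 2 = 7.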
Cite
@article{arxiv.0808.3979,
title = {Least Squares Methods for Equidistant Tree Reconstruction},
author = {Conor Fahey and Serkan Hosten and Nathan Krieger and Leslie Timpe},
journal = {arXiv preprint arXiv:0808.3979},
year = {2008}
}