The most parsimonious tree for random data

Mareike Fischer; Michelle Galla; Lina Herbst; Mike Steel

Populations and Evolution 2014-06-03 v1

Abstract

Applying a method to reconstruct a phylogenetic tree from random data provides a way to detect whether that method has an inherent bias towards certain tree 'shapes'. For maximum parsimony, applied to a sequence of random 2-state data, each possible binary phylogenetic tree has exactly the same distribution for its parsimony score. Despite this pleasing and slightly surprising symmetry, some binary phylogenetic trees are more likely than others to be a most parsimonious (MP) tree for a sequence of $k$ such characters, as we show. For $k=2$ and unrooted binary trees on six taxa, any tree with a caterpillar shape has a higher chance of being an MP tree than any tree with a symmetric shape. On the other hand, if we take any two binary trees, on any number of taxa, we prove that this bias between the two trees vanishes as the number of characters grows. However, again there is a twist: MP trees on six taxa are more likely to have certain shapes than a uniform distribution on binary phylogenetic trees predicts, and this difference does not appear to dissipate as $k$ grows.
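
The symmetry claimed for the parsimony score can be checked by brute force on six taxa. The following is a minimal sketch, not the authors' code: it assumes trees are encoded as nested tuples with integer leaves (a hypothetical encoding chosen for illustration), scores a single 2-state character with the standard Fitch algorithm, and enumerates all $2^6$ characters on one caterpillar tree and one symmetric tree. The trees are rooted on an arbitrary edge, which does not change the parsimony score. All function and variable names are illustrative.

```python
from collections import Counter
from itertools import product


def fitch_score(tree, states):
    """Fitch parsimony score of one character on a rooted binary tree.

    `tree` is a nested tuple whose leaves are integer taxon indices
    (an assumed encoding); `states` maps each index to its state.
    """
    def visit(node):
        if isinstance(node, int):           # leaf: its state set is fixed
            return {states[node]}, 0
        left, right = node
        lset, lscore = visit(left)
        rset, rscore = visit(right)
        common = lset & rset
        if common:                          # children agree: no extra change
            return common, lscore + rscore
        return lset | rset, lscore + rscore + 1   # disagreement costs one change

    return visit(tree)[1]


def score_distribution(tree, n_taxa):
    """Count how many of the 2**n_taxa binary characters attain each score."""
    return Counter(
        fitch_score(tree, chars) for chars in product((0, 1), repeat=n_taxa)
    )


# The two unrooted binary tree shapes on six taxa, rooted on an arbitrary edge.
CATERPILLAR = (((((0, 1), 2), 3), 4), 5)
SYMMETRIC = ((0, 1), ((2, 3), (4, 5)))

cat_dist = score_distribution(CATERPILLAR, 6)
sym_dist = score_distribution(SYMMETRIC, 6)
print(sorted(cat_dist.items()))
print(sorted(sym_dist.items()))
```

Both printed distributions should coincide, in line with the statement that every binary tree has the same score distribution for a random 2-state character. The sketch covers single characters only; it does not address which tree is most parsimonious for a sequence of $k$ characters, which is where the abstract reports a bias.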

Keywords

phylogenetics; population genetics and evolution; genetic analysis

Cite

@article{arxiv.1406.0217,
  title  = {The most parsimonious tree for random data},
  author = {Mareike Fischer and Michelle Galla and Lina Herbst and Mike Steel},
  journal= {arXiv preprint arXiv:1406.0217},
  year   = {2014}
}

Comments

19 pages, 8 figures
