Sequence length bounds for resolving a deep phylogenetic divergence

Populations and Evolution 2008-06-17 v1

Abstract

In evolutionary biology, genetic sequences carry with them a trace of the underlying tree that describes their evolution from a common ancestral sequence. The question of how many sequence sites are required to recover this evolutionary relationship accurately depends on the model of sequence evolution, the substitution rate, divergence times and the method used to infer phylogenetic history. A particularly challenging problem for phylogenetic methods arises when a rapid divergence event occurred in the distant past. We analyse an idealised form of this problem in which the terminal edges of a symmetric four-taxon tree are some factor ($p$) times the length of the interior edge. We determine an order $p^2$ lower bound on the growth rate for the sequence length required to resolve the tree (independent of any particular branch length). We also show that this rate of sequence length growth can be achieved by existing methods (including the simple 'maximum parsimony' method), and compare these order $p^2$ bounds with an order $p$ growth rate for a model that describes low-homoplasy evolution. In the final section, we provide a generic bound on the sequence length requirement for a more general class of Markov processes.
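
The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes a symmetric two-state substitution model (the abstract does not name the model), a tree AB|CD with interior edge length `x` and terminal edge length `p * x`, and a simple count of parsimony-informative site patterns. The names `flip_prob`, `simulate_site` and `parsimony_choice` are hypothetical and chosen only for this illustration.

```python
import math
import random

def flip_prob(length):
    # Assumed symmetric two-state model: probability that the two ends
    # of an edge of the given length are in different states.
    return 0.5 * (1.0 - math.exp(-2.0 * length))

def simulate_site(x, p, rng):
    # True tree is AB|CD: interior edge (u, v) has length x,
    # and each of the four terminal edges has length p * x.
    u = rng.randint(0, 1)
    v = u ^ (rng.random() < flip_prob(x))
    t = flip_prob(p * x)
    a = u ^ (rng.random() < t)
    b = u ^ (rng.random() < t)
    c = v ^ (rng.random() < t)
    d = v ^ (rng.random() < t)
    return a, b, c, d

def parsimony_choice(x, p, k, rng):
    # Count the site patterns that favour each of the three possible
    # four-taxon topologies and return the best-supported one.
    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(k):
        a, b, c, d = simulate_site(x, p, rng)
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    return max(counts, key=counts.get)

if __name__ == "__main__":
    rng = random.Random(0)
    # k is the sequence length; p scales the terminal edges.
    print(parsimony_choice(x=0.1, p=1.0, k=5000, rng=rng))
```

Under these assumptions, increasing `p` while holding `x` fixed makes the informative patterns noisier, so a larger sequence length `k` is needed before the true topology is chosen reliably; repeating the call over many seeds gives a rough empirical view of that growth.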

Cite

@article{arxiv.0806.2500,
  title  = {Sequence length bounds for resolving a deep phylogenetic divergence},
  author = {Mareike Fischer and Mike Steel},
  journal = {arXiv preprint arXiv:0806.2500},
  year   = {2008}
}

Comments

13 pages, 1 figure
