A new distance between DNA sequences

Populations and Evolution 2009-02-12 v1 Quantitative Methods

Abstract

We propose a new distance metric for DNA sequences that can be defined on any evolutionary Markov model with infinitesimal generator matrix Q. That is, the new metric can be defined under existing models such as the Jukes-Cantor (JC) model, the Kimura-2-parameter model, the F84 model, and the GTR model. Because the metric does not depend on the form of the generator matrix Q, it can also be defined for very general models, including those with nucleotide substitution rates that vary among lineages. This makes the metric widely applicable. Our simulation experiments show that the new metric, when defined under classical models such as the JC, F84 and Kimura-2-parameter models, performs better than the existing metrics of those models in recovering phylogenetic trees from sequence data. The experiments also show that the new metric, under a model that allows varying nucleotide substitution rates among lineages, performs as well as or better than the other forms studied.
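
For context on what an "existing metric" looks like, the following is a minimal sketch of the classical Jukes-Cantor distance, one of the baselines named above. It is not the new metric proposed in the paper, which this abstract does not define. The function names `p_distance` and `jc69_distance` are illustrative assumptions, and the sketch assumes two pre-aligned sequences of equal length with no gap handling.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def jc69_distance(seq_a, seq_b):
    """Classical Jukes-Cantor distance: d = -(3/4) * ln(1 - (4/3) * p).

    Illustrative baseline only; hypothetical helper, not the paper's metric.
    """
    p = p_distance(seq_a, seq_b)
    # At p >= 3/4 the sequences are saturated and the estimate diverges.
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    print(jc69_distance("ACGTACGT", "ACGTACGA"))
```

The logarithmic correction makes the estimated distance exceed the raw mismatch fraction, since some substitutions are hidden by later substitutions at the same site.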

Cite

@article{arxiv.0902.1821,
  title  = {A new distance between DNA sequences},
  author = {Viswanath. C. Narayanan},
  journal= {arXiv preprint arXiv:0902.1821},
  year   = {2009}
}

Comments

18 pages
