Small Coupling Expansion for Multiple Sequence Alignment

Quantitative Methods 2023-05-01 v2 Disordered Systems and Neural Networks Biological Physics Biomolecules

Abstract

The alignment of biological sequences such as DNA, RNA, and proteins is one of the basic tools that allow the detection of evolutionary patterns and the functional/structural characterization of homologous sequences in different organisms. Typically, state-of-the-art bioinformatics tools are based on profile models that assume the statistical independence of the different sites of the sequences. Over the last years, it has become increasingly clear that homologous sequences show complex patterns of long-range correlations over the primary sequence, as a consequence of the natural evolution process that selects genetic variants under the constraint of preserving the functional/structural determinants of the sequence. Here, we present a new alignment algorithm based on message-passing techniques that overcomes the limitations of profile models. Our method is based on a new perturbative small-coupling expansion of the free energy of the model that assumes a linear chain approximation as the $0^\mathrm{th}$ order of the expansion. We test the potential of the algorithm against standard competing strategies on several biological sequences.

Cite

@article{arxiv.2210.03463,
  title  = {Small Coupling Expansion for Multiple Sequence Alignment},
  author = {Louise Budzynski and Andrea Pagnani},
  journal= {arXiv preprint arXiv:2210.03463},
  year   = {2023}
}

Comments

22 pages, 10 figures