MSARC: Multiple Sequence Alignment by Residue Clustering

Quantitative Methods 2013-07-31 v1

Abstract

Progressive methods offer efficient and reasonably good solutions to the multiple sequence alignment problem. However, the resulting alignments are biased by guide-trees, especially for relatively distant sequences. We propose MSARC, a new graph-clustering based algorithm that aligns sequence sets without guide-trees. Experiments on the BAliBASE dataset show that MSARC achieves alignment quality similar to that of the best progressive methods and substantially higher than that of other non-progressive algorithms. Furthermore, MSARC outperforms all other methods on sequence sets whose evolutionary distances can hardly be represented by a phylogenetic tree. These datasets are the most exposed to the guide-tree bias of alignments. MSARC is available at http://bioputer.mimuw.edu.pl/msarc

Cite

@article{arxiv.1307.7844,
  title  = {MSARC: Multiple Sequence Alignment by Residue Clustering},
  author = {Michał Modzelewski and Norbert Dojer},
  journal= {arXiv preprint arXiv:1307.7844},
  year   = {2013}
}

Comments

Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)
