Multiple sequence alignment based on set covers

Quantitative Methods 2007-05-23 v1

Abstract

We introduce a new heuristic for the multiple alignment of a set of sequences. The heuristic is based on a set cover of the residue alphabet of the sequences, and also on the determination of a significant set of blocks comprising subsequences of the sequences to be aligned. These blocks are obtained with the aid of a new data structure, called a suffix-set tree, which is constructed from the input sequences with the guidance of the residue-alphabet set cover and generalizes the well-known suffix tree of the sequence set. We provide performance results on selected BAliBASE amino-acid sequences and compare them with those yielded by some prominent approaches.
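As a rough illustration of the idea of matching residues through a set cover rather than by exact identity, the following Python sketch may help. It is not the authors' algorithm or their suffix-set tree. The cover `COVER`, its group names, and the helpers `is_cover` and `occurrences` are hypothetical and chosen only for illustration. The sketch assumes that a block pattern can be written as a sequence of cover sets, and that a subsequence matches when each residue belongs to the corresponding set.

```python
from typing import Dict, FrozenSet, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical cover of the amino-acid alphabet (an assumption, not taken
# from the paper). Sets may overlap: A and Y each appear in two sets.
COVER: Dict[str, FrozenSet[str]] = {
    "hydrophobic": frozenset("AVLIM"),
    "aromatic": frozenset("FWY"),
    "polar": frozenset("STNQY"),
    "charged": frozenset("DEKRH"),
    "small": frozenset("AGSCP"),
}


def is_cover(cover: Dict[str, FrozenSet[str]], alphabet: str) -> bool:
    """Return True if every residue of the alphabet lies in at least one set."""
    covered = set().union(*cover.values())
    return set(alphabet) <= covered


def occurrences(seq: str, pattern: Sequence[str],
                cover: Dict[str, FrozenSet[str]]) -> List[int]:
    """Start positions where seq matches a pattern of cover-set names.

    Position i matches if seq[i + j] belongs to cover[pattern[j]] for all j.
    """
    k = len(pattern)
    return [
        i
        for i in range(len(seq) - k + 1)
        if all(seq[i + j] in cover[name] for j, name in enumerate(pattern))
    ]


if __name__ == "__main__":
    pattern = ["hydrophobic", "hydrophobic", "charged"]
    for s in ("MAVKS", "GLIDT"):
        print(s, occurrences(s, pattern, COVER))
```

Under these assumptions, the subsequences `AVK` and `LID` share no exact residues yet both match the same pattern of sets, which is the kind of generalized common block that a set-labelled tree could expose where an ordinary suffix tree would not.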

Cite

@article{arxiv.q-bio/0412021,
  title  = {Multiple sequence alignment based on set covers},
  author = {A. H. L. Porto and V. C. Barbosa},
  journal = {arXiv preprint arXiv:q-bio/0412021},
  year   = {2007}
}