An Optimal Algorithm for the Maximum-Density Segment Problem

Data Structures and Algorithms 2007-05-23 v1 Discrete Mathematics

Abstract

We address a fundamental problem arising from the analysis of biomolecular sequences. The input consists of two numbers $w_{\min}$ and $w_{\max}$ and a sequence $S$ of $n$ number pairs $(a_i, w_i)$ with $w_i > 0$. Let segment $S(i,j)$ of $S$ be the consecutive subsequence of $S$ between indices $i$ and $j$. The density of $S(i,j)$ is $d(i,j) = (a_i + a_{i+1} + \cdots + a_j)/(w_i + w_{i+1} + \cdots + w_j)$. The maximum-density segment problem is to find a maximum-density segment over all segments $S(i,j)$ with $w_{\min} \leq w_i + w_{i+1} + \cdots + w_j \leq w_{\max}$. The best previously known algorithm for the problem, due to Goldwasser, Kao, and Lu, runs in $O(n \log(w_{\max} - w_{\min} + 1))$ time. In the present paper, we solve the problem in $O(n)$ time. Our approach bypasses the complicated right-skew decomposition introduced by Lin, Jiang, and Chao. As a result, our algorithm can process the input sequence in an online manner, which is an important feature for dealing with genome-scale sequences. Moreover, for a type of input sequence $S$ representable in $O(m)$ space, we show how to exploit the sparsity of $S$ and solve the maximum-density segment problem for $S$ in $O(m)$ time.
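
To make the problem statement concrete, the following is a minimal brute-force sketch of the definitions in the abstract. It is not the linear-time algorithm of the paper: it checks candidate segments directly with prefix sums and takes quadratic time in the worst case. The function name, the 0-based inclusive indices, and the tie-breaking rule (first maximum found) are assumptions made for illustration only.

```python
from itertools import accumulate
from typing import List, Optional, Tuple


def max_density_segment_bruteforce(
    pairs: List[Tuple[float, float]], w_min: float, w_max: float
) -> Optional[Tuple[float, int, int]]:
    """Return (density, i, j) for a maximum-density feasible segment.

    Indices are 0-based and inclusive (an illustrative convention).
    Returns None when no segment has total width in [w_min, w_max].
    Quadratic reference implementation, NOT the paper's O(n) method.
    """
    # Prefix sums: a_pre[k] is the sum of the first k values a_i,
    # w_pre[k] is the sum of the first k widths w_i.
    a_pre = [0] + list(accumulate(a for a, _ in pairs))
    w_pre = [0] + list(accumulate(w for _, w in pairs))

    best = None
    n = len(pairs)
    for i in range(n):
        for j in range(i, n):
            width = w_pre[j + 1] - w_pre[i]
            if width > w_max:
                break  # every w_i > 0, so widths only grow with j
            if width < w_min:
                continue
            density = (a_pre[j + 1] - a_pre[i]) / width
            if best is None or density > best[0]:
                best = (density, i, j)
    return best
```

For example, with pairs [(1, 1), (4, 2), (-2, 1), (6, 1)], w_min = 2, and w_max = 3, the feasible segments have densities 5/3, 2, 2/3, and 2, so the sketch returns (2.0, 1, 1), the first segment that attains the maximum.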

Cite

@article{arxiv.cs/0311020,
  title  = {An Optimal Algorithm for the Maximum-Density Segment Problem},
  author = {Kai-min Chung and Hsueh-I Lu},
  journal= {arXiv preprint arXiv:cs/0311020},
  year   = {2007}
}

Comments

15 pages, 12 figures. An early version of this paper was presented at the 11th Annual European Symposium on Algorithms (ESA 2003), Budapest, Hungary, September 15-20, 2003.