Related papers: Linear-Time Algorithms for Computing Maximum-Densi…
We address a fundamental problem arising from analysis of biomolecular sequences. The input consists of two numbers $w_{\min}$ and $w_{\max}$ and a sequence $S$ of $n$ number pairs $(a_i,w_i)$ with $w_i>0$. Let {\em segment} $S(i,j)$ of $S$…
We present algorithms for the length-constrained maximum sum segment and maximum density segment problems in particular, and for the general problem of finding length-constrained heaviest segments, in a sequence of real numbers. Given a…
In this work, we obtain the following new results. 1. Given a sequence $D=((h_1,s_1), (h_2,s_2), \ldots, (h_n,s_n))$ of number pairs, where $s_i>0$ for all $i$, and a number $L_h$, we propose an $O(n)$-time algorithm for finding an index interval…
Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting a sequence in up to K segments. We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric…
Several biological problems require the identification of regions in a sequence where some feature occurs within a target density range: examples include the location of GC-rich regions, identification of CpG islands, and sequence…
Given an array A containing arbitrary (positive and negative) numbers, we consider the problem of supporting range maximum-sum segment queries on A: i.e., given an arbitrary range [i,j], return the subrange [i' ,j' ] \subseteq [i,j] such…
Given a sequence of integers, we want to find a longest increasing subsequence of the sequence. It is known that this problem can be solved in $O(n \log n)$ time and space. Our goal in this paper is to reduce the space consumption while…
Sequence partition problems arise in many fields, such as sequential data analysis, information transmission, and parallel computing. In this paper, we study the following partition problem variant: given a sequence of $n$ items…
Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting an array in up to K segments. We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric for…
Let $S$ be a string of length $n$ over an alphabet $\Sigma$ and let $Q$ be a subset of $\Sigma$ of size $q \geq 2$. The 'co-occurrence problem' is to construct a compact data structure that supports the following query: given an integer $w$…
The Subset Sum problem asks whether a given set of $n$ positive integers contains a subset of elements that sum up to a given target $t$. It is an outstanding open question whether the $O^*(2^{n/2})$-time algorithm for Subset Sum by…
We study the density estimation problem defined as follows: given $k$ distributions $p_1, \ldots, p_k$ over a discrete domain $[n]$, as well as a collection of samples chosen from a ``query'' distribution $q$ over $[n]$, output $p_i$ that…
We develop a randomized approximation algorithm for the classical maximum coverage problem, which, given a list of sets $A_1,A_2,\cdots, A_m$ and an integer parameter $k$, selects $k$ sets $A_{i_1}, A_{i_2},\cdots, A_{i_k}$ for maximum union…
This paper describes a linear-time algorithm that finds the longest stretch in a sequence of real numbers (``scores'') in which the sum exceeds an input parameter. The algorithm also solves the problem of finding the longest interval in…
In this paper, we propose a data structure, a quadruple neighbor list (QN-list, for short), to support real time queries of all longest increasing subsequence (LIS) and LIS with constraints over sequential data streams. The QN-List built by…
The maximal sum of a sequence "A" of "n" real numbers is the greatest sum of all elements of any strictly contiguous and possibly empty subsequence of "A", and it can be computed in "O(n)" time by means of Kadane's algorithm. Letting "A^(x…
We revisit the range $\tau$-majority problem, which asks us to preprocess an array $A[1..n]$ for a fixed value of $\tau \in (0,1/2]$, such that for any query range $[i,j]$ we can return a position in $A$ of each distinct $\tau$-majority…
In this paper, we consider the problems for covering multiple intervals on a line. Given a set $B$ of $m$ line segments (called "barriers") on a horizontal line $L$ and another set $S$ of $n$ horizontal line segments of the same length in…
This paper considers the problem of maintaining statistic aggregates over the last W elements of a data stream. First, the problem of counting the number of 1's in the last W bits of a binary stream is considered. A lower bound of…
The bin packing problem is to find the minimum number of bins of size one to pack a list of items with sizes $a_1,..., a_n$ in $(0,1]$. Using uniform sampling, which selects a random element from the input list each time, we develop a…
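For orientation, the $O(n)$ maximal-sum computation credited to Kadane's algorithm in one of the abstracts above can be sketched as follows. This is a minimal illustration, not code from any of the listed papers. The function name `max_segment_sum` is hypothetical, and the sketch assumes the convention stated in that abstract, namely that the empty segment (sum 0) is allowed.

```python
from typing import Sequence


def max_segment_sum(a: Sequence[float]) -> float:
    """Return the maximum sum over contiguous, possibly empty, segments of a.

    Hypothetical helper for illustration; assumes the empty segment
    (sum 0) is an admissible answer.
    """
    best = 0.0     # the empty segment has sum 0
    current = 0.0  # best sum of a (possibly empty) segment ending here
    for x in a:
        # Either extend the segment ending at the previous position,
        # or restart with the empty segment if the running sum is negative.
        current = max(0.0, current + x)
        best = max(best, current)
    return best
```

Each element is visited once and only two running values are kept, which is where the linear time bound comes from. The length-constrained and density variants discussed in the other abstracts need additional machinery that this sketch does not cover.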