Related papers: Longest common subsequences in binary sequences
The length of the longest common subsequences (LCSs) is often used as a similarity measurement to compare two (or more) random words. Below we study its statistical behavior in mean and variance using a Monte-Carlo approach from which we…
The Longest Common Subsequence (LCS) problem is a very important problem in mathematics, which has a broad application in scheduling problems, physics and bioinformatics. It is known that the given two random sequences of infinite…
We consider the length L of the longest common subsequence of two randomly uniformly and independently chosen n character words over a k-ary alphabet. Subadditivity arguments yield that the expected value of L, when normalized by n,…
The Longest Common Subsequence (LCS) Problem asks for the longest sequence of (non-contiguous) matches between two given strings of characters. Using extensive Monte Carlo simulations, we find a finite size scaling law of the form E(L)/N =C…
Given two equally long, uniformly random binary strings, the expected length of their longest common subsequence (LCS) is asymptotically proportional to the strings' length. Finding the proportionality coefficient $\gamma$, i.e. the limit…
A repetition free Longest Common Subsequence (LCS) of two sequences x and y is an LCS of x and y where each symbol may appear at most once. Let R denote the length of a repetition free LCS of two sequences of n symbols each one chosen…
We study the generalized Chvátal-Sankoff constant $\gamma_{k,d}$, which represents the normalized expected length of the longest common subsequence (LCS) of $d$ independent uniformly random strings over an alphabet of size $k$. We derive…
Given two random finite sequences from $[k]^n$ such that a prefix of the first sequence is a suffix of the second, we examine the length of their longest common subsequence. If $\ell$ is the length of the overlap, we prove that the expected…
The Longest Common Subsequence (LCS) is a fundamental string similarity measure, and computing the LCS of two strings is a classic algorithms question. A textbook dynamic programming algorithm gives an exact algorithm in quadratic time, and…
Let $(X_k)_{k\geq 1}$ and $(Y_k)_{k\geq 1}$ be two independent sequences of i.i.d. random variables, with values in a finite and totally ordered alphabet $\mathcal{A}_m:=\{1,\dots,m\}$, and having respective probability mass function…
Let $(X_i)_{i \geq 1}$ and $(Y_i)_{i\geq1}$ be two independent sequences of independent identically distributed random variables taking their values in a common finite alphabet and having the same law. Let $LC_n$ be the length of the…
Let $X=(X_i)_{i\ge 1}$ and $Y=(Y_i)_{i\ge 1}$ be two sequences of independent and identically distributed (iid) random variables taking their values, uniformly, in a common totally ordered finite alphabet. Let LCI$_n$ be the length of the…
We study sketching and streaming algorithms for the Longest Common Subsequence problem (LCS) on strings of small alphabet size $|\Sigma|$. For the problem of deciding whether the LCS of strings $x,y$ has length at least $L$, we obtain a…
In this paper, we revisit the much studied LCS problem for two given sequences. Based on the algorithm of Iliopoulos and Rahman for solving the LCS problem, we have suggested 3 new improved algorithms. We first reformulate the problem in a…
Let $(X, Y) = (X_n, Y_n)_{n \geq 1}$ be the output process generated by a hidden chain $Z = (Z_n)_{n \geq 1}$, where $Z$ is a finite state, aperiodic, time homogeneous, and irreducible Markov chain. Let $LC_n$ be the length of the longest…
We provide upper and lower bounds for the expected length $\mathbb E(L_{n,m})$ of the longest common pattern contained in $m$ random permutations of length $n$. We also address the tightness of the concentration of $L_{n,m}$ around $\mathbb…
It has been proven that, when normalized by $n$, the expected length of a longest common subsequence of $d$ random strings of length $n$ over an alphabet of size $\sigma$ converges to some constant that depends only on $d$ and $\sigma$.…
It is well known that, when normalized by $n$, the expected length of a longest common subsequence of $d$ sequences of length $n$ over an alphabet of size $\sigma$ converges to a constant $\gamma_{\sigma,d}$. We disprove a speculation by Steele…
Let $X_1, X_2, ..., X_n, ... $ be a sequence of iid random variables with values in a finite alphabet $\{1,...,m\}$. Let $LI_n$ be the length of the longest increasing subsequence of $X_1, X_2, ..., X_n.$ We express the limiting…
In this note, we first introduce a new problem called the longest common subsequence and substring problem. Let $X$ and $Y$ be two strings over an alphabet $\Sigma$. The longest common subsequence and substring problem for $X$ and $Y$ is to…
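Several of the abstracts above refer to the textbook quadratic-time dynamic program for the LCS length and to Monte Carlo estimates of the normalized expected length E(L)/n for uniformly random words over a k-ary alphabet. As a rough illustration only, the following Python sketch combines the two. The function names `lcs_length` and `estimate_normalized_lcs`, and the parameters `trials` and `seed`, are hypothetical choices made for this example and are not taken from any of the listed papers.

```python
import random


def lcs_length(x, y):
    """Length of a longest common subsequence of x and y.

    Textbook dynamic program, O(len(x) * len(y)) time, keeping only
    two rows of the table so memory is O(len(y)).
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the best common subsequence of the two prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of x or of y, whichever is better.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def estimate_normalized_lcs(n, k=2, trials=200, seed=0):
    """Monte Carlo estimate of E(L)/n for two independent uniform words.

    Assumption for this sketch: both words have length n and each symbol
    is drawn uniformly from {0, ..., k-1}.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = [rng.randrange(k) for _ in range(n)]
        y = [rng.randrange(k) for _ in range(n)]
        total += lcs_length(x, y)
    return total / (trials * n)


if __name__ == "__main__":
    # Finite-n averages approach the limiting constant from below,
    # so larger n gives a closer (but slower to compute) estimate.
    for n in (50, 100, 200):
        print(n, estimate_normalized_lcs(n, k=2, trials=100))
```

By superadditivity of the expected LCS length, the finite-n averages E(L)/n never exceed their limit, so a simulation of this kind probes the limiting constant from below and needs a finite-size correction of the sort the abstracts describe.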