Related papers: Substring Complexities on Run-length Compressed St…

200 papers

Shannon's entropy is a definitive lower bound for statistical compression. Unfortunately, no such clear measure exists for the compressibility of repetitive strings. Thus, ad hoc measures are employed to estimate the repetitiveness of…

Data Structures and Algorithms · Computer Science 2023-11-16 Giulia Bernardini, Gabriele Fici, Paweł Gawrychowski, Solon P. Pissis

The normalized substring complexity $\delta$ of a string is defined as $\max_k \{c[k]/k\}$, where $c[k]$ is the number of distinct substrings of length $k$. This simply defined measure has recently attracted attention due to its…

Data Structures and Algorithms · Computer Science 2026-02-17 Gregory Kucherov, Yakov Nekrich
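
The definition of $\delta$ quoted in the entry above can be illustrated with a brute-force sketch. It is not the method of any listed paper: it simply enumerates all substrings in quadratic time or worse, and the function names `substring_complexity` and `normalized_substring_complexity` are hypothetical. The input is assumed to be a non-empty Python string.

```python
from fractions import Fraction


def substring_complexity(s: str) -> list[int]:
    """Return [c[1], ..., c[n]], where c[k] counts distinct substrings of length k."""
    n = len(s)
    return [len({s[i:i + k] for i in range(n - k + 1)}) for k in range(1, n + 1)]


def normalized_substring_complexity(s: str) -> Fraction:
    """Return delta = max over k of c[k] / k as an exact fraction.

    Assumption: s is non-empty (the maximum is undefined otherwise).
    """
    if not s:
        raise ValueError("delta is undefined for the empty string")
    c = substring_complexity(s)
    # enumerate starts at 1 so that k matches the substring length of c[k].
    return max(Fraction(ck, k) for k, ck in enumerate(c, start=1))
```

For instance, `normalized_substring_complexity("abab")` evaluates to 2, because the maximum is reached at $k = 1$ with two distinct letters. Exact fractions are used so that ties between different values of $k$ are not disturbed by floating-point rounding.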

Suppose that we are given a string $s$ of length $n$ over an alphabet $\{0,1,\ldots,n^{O(1)}\}$ and $\delta$ is the string complexity of $s$, a known compression measure. We describe an index on $s$ with $O(\delta\log\frac{n}{\delta})$…

Data Structures and Algorithms · Computer Science 2026-04-15 Dmitry Kosolobov

In this work, we study the limits of compressed data structures, i.e., structures that support various queries on an input text $T\in\Sigma^n$ using space proportional to the size of $T$ in compressed form. Nearly all fundamental queries…

Data Structures and Algorithms · Computer Science 2025-10-23 Dominik Kempa, Tomasz Kociumaka

The random access problem for compressed strings is to build a data structure that efficiently supports accessing the character in position $i$ of a string given in compressed form. Given a grammar of size $n$ compressing a string of size…

Data Structures and Algorithms · Computer Science 2015-01-27 Patrick Hagge Cording

Given a string of length $n$ that is composed of $r$ runs of letters from the alphabet $\{0,1,\ldots,\sigma{-}1\}$ such that $2 \le \sigma \le r$, we describe a data structure that, provided $r \le n / \log^{\omega(1)} n$, stores the string…

Data Structures and Algorithms · Computer Science 2018-02-27 José Fuentes-Sepúlveda, Juha Kärkkäinen, Dmitry Kosolobov, Simon J. Puglisi

In the Shortest Superstring problem we are given a set of strings $S=\{s_1, \ldots, s_n\}$ and integer $\ell$ and the question is to decide whether there is a superstring $s$ of length at most $\ell$ containing all strings of $S$ as…

Data Structures and Algorithms · Computer Science 2015-02-06 Ivan Bliznets, Fedor V. Fomin, Petr A. Golovach, Nikolay Karpov, Alexander S. Kulikov, Saket Saurabh

Unlike in statistical compression, where Shannon's entropy is a definitive lower bound, no such clear measure exists for the compressibility of repetitive sequences. Since statistical entropy does not capture repetitiveness, ad-hoc measures…

Data Structures and Algorithms · Computer Science 2021-01-18 Tomasz Kociumaka, Gonzalo Navarro, Nicola Prezza

Suppose an oracle knows a string $S$ that is unknown to us and that we want to determine. The oracle can answer queries of the form "Is $s$ a substring of $S$?". In 1995, Skiena and Sundaram showed that, in the worst case, any algorithm…

Data Structures and Algorithms · Computer Science 2021-10-20 Gabriele Fici, Nicola Prezza, Rossano Venturini

In the classic longest common substring (LCS) problem, we are given two strings $S$ and $T$, each of length at most $n$, over an alphabet of size $\sigma$, and we are asked to find a longest string occurring as a fragment of both $S$ and…

Data Structures and Algorithms · Computer Science 2025-11-18 Panagiotis Charalampopoulos, Tomasz Kociumaka, Jakub Radoszewski, Solon P. Pissis

We study the structure of pure morphic and morphic sequences and prove the following result: the subword complexity of an arbitrary morphic sequence is either $\Theta(n^{1+1/k})$ for some $k\in\mathbb N$, or is $O(n \log n)$.

Combinatorics · Mathematics 2015-02-23 Rostislav Devyatov

Given a set of pattern strings $\mathcal{P}=\{P_1, P_2,\ldots P_k\}$ and a text string $S$, the classic dictionary matching problem is to report all occurrences of each pattern in $S$. We study the dictionary problem in the compressed…

Data Structures and Algorithms · Computer Science 2025-09-04 Philip Bille, Inge Li Gørtz, Simon J. Puglisi, Simon R. Tarnow

Real-world data often comes in compressed form. Analyzing compressed data directly (without decompressing it) can save space and time by orders of magnitude. In this work, we focus on fundamental sequence comparison problems and try to…

Data Structures and Algorithms · Computer Science 2021-12-14 Arun Ganesh, Tomasz Kociumaka, Andrea Lincoln, Barna Saha

We study the following substring suffix selection problem: given a substring of a string T of length n, compute its k-th lexicographically smallest suffix. This is a natural generalization of the well-known question of computing the maximal…

Data Structures and Algorithms · Computer Science 2013-09-24 Maxim Babenko, Paweł Gawrychowski, Tomasz Kociumaka, Tatiana Starikovskaya
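
The substring suffix selection problem stated in the entry above can be pinned down with a naive baseline that sorts the suffixes of the chosen substring. This is only a sketch of the problem statement, not the data structure from the paper. The function name `kth_suffix_of_substring` and the conventions used (0-based half-open range `t[i:j]`, 1-based rank `k`, result reported as a start position in `t`) are assumptions made for illustration.

```python
def kth_suffix_of_substring(t: str, i: int, j: int, k: int) -> int:
    """Return the start position in t of the k-th smallest suffix of t[i:j].

    Assumptions: 0 <= i < j <= len(t) and 1 <= k <= j - i.
    Naive baseline: sorts all suffixes of the substring explicitly.
    """
    if not (0 <= i < j <= len(t)):
        raise ValueError("invalid substring range")
    if not (1 <= k <= j - i):
        raise ValueError("k out of range")
    sub = t[i:j]
    # Order the suffix start offsets by the suffixes they denote.
    order = sorted(range(len(sub)), key=lambda p: sub[p:])
    return i + order[k - 1]
```

With `t = "banana"`, the call `kth_suffix_of_substring(t, 1, 5, 1)` looks at the substring `"anan"`, whose smallest suffix is `"an"`, and returns position 3 in `t`.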

For $0<\delta<1$, a $\delta$-subrepetition in a word is a factor whose exponent is less than 2 but not less than $1+\delta$ (the exponent of a factor is the ratio of the factor length to its minimal period). The $\delta$-subrepetition…

Data Structures and Algorithms · Computer Science 2022-08-10 Roman Kolpakov
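
The exponent condition quoted in the entry above can be checked directly for a single factor. The sketch below computes the minimal period from the longest border (via the standard prefix function) and then tests whether the exponent lies in $[1+\delta, 2)$. It covers only the condition visible in the truncated abstract; any further requirements in the full definition are not modeled. The function names are hypothetical and the factor is assumed to be non-empty.

```python
from fractions import Fraction


def minimal_period(w: str) -> int:
    """Return the minimal period of a non-empty word w.

    The minimal period equals len(w) minus the length of the longest proper border.
    """
    if not w:
        raise ValueError("the minimal period is undefined for the empty word")
    n = len(w)
    border = [0] * n  # border[i]: longest proper border of w[:i + 1]
    k = 0
    for i in range(1, n):
        while k > 0 and w[i] != w[k]:
            k = border[k - 1]
        if w[i] == w[k]:
            k += 1
        border[i] = k
    return n - border[-1]


def exponent(w: str) -> Fraction:
    """Return the ratio of the length of w to its minimal period."""
    return Fraction(len(w), minimal_period(w))


def has_subrepetition_exponent(w: str, delta: Fraction) -> bool:
    """Check 1 + delta <= exponent(w) < 2 for a given 0 < delta < 1."""
    return 1 + delta <= exponent(w) < 2
```

For example, `"abcab"` has minimal period 3 and exponent 5/3, so it satisfies the condition for `delta = Fraction(1, 2)`, while `"abab"` has exponent exactly 2 and does not.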

In this paper we investigate the problem of partitioning an input string T in such a way that compressing individually its parts via a base-compressor C gets a compressed output that is shorter than applying C over the entire T at once.…

Data Structures and Algorithms · Computer Science 2009-06-26 Paolo Ferragina, Igor Nitto, Rossano Venturini

String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set $\Gamma\subseteq [1..n]$ is a $k$-attractor for a string $S\in[1..\sigma]^n$ if and…

Data Structures and Algorithms · Computer Science 2020-12-09 Dominik Kempa, Alberto Policriti, Nicola Prezza, Eva Rotenberg

Two recent lower bounds on the compressibility of repetitive sequences, $\delta \le \gamma$, have received much attention. It has been shown that a length-$n$ string $S$ over an alphabet of size $\sigma$ can be represented within the…

Data Structures and Algorithms · Computer Science 2023-11-10 Tomasz Kociumaka, Gonzalo Navarro, Francisco Olivares

The problem of detecting and measuring the repetitiveness of one-dimensional strings has been extensively studied in data compression and text indexing. Our understanding of these issues has been significantly improved by the introduction…

Data Structures and Algorithms · Computer Science 2025-05-19 Lorenzo Carfagna, Giovanni Manzini, Giuseppe Romana, Marinella Sciortino, Cristian Urbina

Repeat finding in strings has important applications in subfields such as computational biology. Surprisingly, no prior work on repeat finding has considered the constraint on the locality of repeats. In this paper, we propose and study…

Data Structures and Algorithms · Computer Science 2015-01-27 Atalay Mert İleri, M. Oğuzhan Külekci, Bojian Xu