Related papers: Combinatorial Algorithms for String Sanitization
Let $W$ be a string of length $n$ over an alphabet $\Sigma$, $k$ be a positive integer, and $\mathcal{S}$ be a set of length-$k$ substrings of $W$. The ETFS problem asks us to construct a string $X_{\mathrm{ED}}$ such that: (i) no string of…
String analysis is the problem of reasoning about how strings are manipulated by a program. It has numerous applications including automatic detection of cross-site scripting (XSS). A popular string analysis technique includes symbolic…
Strings are a natural representation of biological data such as DNA, RNA and protein sequences. The problem of finding a string that summarizes a set of sequences has direct application in relative compression algorithms for genome and…
The literature on data sanitization aims to design algorithms that take an input dataset and produce a privacy-preserving version of it, that captures some of its statistical properties. In this note we study this question from a streaming…
In this paper we study the fundamental problem of maintaining a dynamic collection of strings under the following operations: concat - concatenates two strings, split - splits a string into two at a given position, compare - finds the…
String matching is the problem of finding all the substrings of a text which match a given pattern. It is one of the most investigated problems in computer science, mainly due to its very diverse applications in several fields. Recently,…
We revisit classic string problems considered in the area of parameterized complexity, and study them through the lens of dynamic data structures. That is, instead of asking for a static algorithm that solves the given instance efficiently,…
Determining whether an unordered collection of overlapping substrings (called shingles) can be uniquely decoded into a consistent string is a problem that lies within the foundation of a broad assortment of disciplines ranging from…
Algorithms to find optimal alignments among strings, or to find a parsimonious summary of a collection of strings, are well studied in a variety of contexts, addressing a wide range of interesting applications. In this paper, we consider…
This paper gives new results for synchronization strings, a powerful combinatorial object that allows to efficiently deal with insertions and deletions in various communication settings: $\bullet$ We give a deterministic, linear time…
Text sanitization, which employs differential privacy to replace sensitive tokens with new ones, represents a significant technique for privacy protection. Typically, its performance in preserving privacy is evaluated by measuring the…
The problem of approximate string matching is important in many different areas such as computational biology, text processing and pattern recognition. A great effort has been made to design efficient algorithms addressing several variants…
A key principle in string processing is local consistency: using short contexts to handle matching fragments of a string consistently. String synchronizing sets [Kempa, Kociumaka; STOC 2019] are an influential instantiation of this…
We consider the problem of efficiently designing sets (codes) of equal-length DNA strings (words) that satisfy certain combinatorial constraints. This problem has numerous motivations including DNA computing and DNA self-assembly. Previous…
String-averaging is an algorithmic structure used when handling a family of operators in situations where the algorithm at hand requires to employ the operators in a specific order. Sequential orderings are well-known and a simultaneous…
We present many new results related to reliable (interactive) communication over insertion-deletion channels. Synchronization errors, such as insertions and deletions, strictly generalize the usual symbol corruption errors and are much…
The widespread use of automated decision processes in many areas of our society raises serious ethical issues concerning the fairness of the process and the possible resulting discriminations. In this work, we propose a novel approach…
Online string matching is a computational problem involving the search for patterns or substrings in a large text dataset, with the pattern and text being processed sequentially, without prior access to the entire text. Its relevance stems…
The binary string matching problem consists in finding all the occurrences of a pattern in a text where both strings are built on a binary alphabet. This is an interesting problem in computer science, since binary data are omnipresent in…
Large-alphabet strings are common in scenarios such as information retrieval and natural-language processing. The efficient storage and processing of such strings usually introduces several challenges that are not witnessed in…