Related papers: Coding methods for string reconstruction from erro…
Motivated by applications in polymer-based data storage, we study the problem of reconstructing a string from part of its composition multiset. We give a full description of the structure of the strings that cannot be uniquely reconstructed…
The problem of string reconstruction from substring information has found many applications due to its relevance in DNA- and polymer-based data storage. One practically important and challenging paradigm requires reconstructing mixtures of…
Motivated by studies of data retrieval in polymer-based storage systems, we consider the problem of reconstructing a multiset of binary strings that have the same length and the same weight from the compositions of their prefixes and…
The problem of reconstructing strings from substring information has found many applications due to its importance in genomic data sequencing and DNA- and polymer-based data storage. One practically important and challenging paradigm…
We consider the problem of binary string reconstruction from the multiset of its substring compositions, referred to as the substring composition multiset, first introduced and studied by Acharya et al. We introduce a new algorithm…
Motivated by mass-spectrometry protein sequencing, we consider a simply-stated problem of reconstructing a string from the multiset of its substring compositions. We show that all strings of length 7, one less than a prime, or one less than…
Motivated by polymer-based data-storage platforms that use chains of binary synthetic polymers as the recording media and read the content via tandem mass spectrometers, we propose a new family of codes that allows for unique string…
The problem of reconstructing strings from their substring spectra has a long history and, in its simplest incarnation, asks under which conditions the spectrum uniquely determines the string. We study the problem of coded…
We consider the problem of correcting mass readout errors in information encoded in binary polymer strings. Our work builds on results for string reconstruction problems using composition multisets [Acharya et al., 2015] and the unique…
Motivated by polymer-based data-storage platforms that use chains of binary synthetic polymers as the recording media and read the content via tandem mass spectrometers, we propose a new family of codes that allows for both unique string…
This paper studies the reconstruction of strings based on their substring spectrum. Under this paradigm, it is assumed that all substrings of some fixed length are received and the goal is to reconstruct the string. While many existing works…
We investigate the construction of prefix-free and fix-free codes with specified codeword compositions. We present a polynomial-time algorithm which constructs a fix-free code with the same codeword compositions as a given code for a…
We study the problems of finding a shortest synchronizing word and its length for a given prefix code. This is done in two different settings: when the code is defined by an arbitrary decoder recognizing its star and when the code is…
This paper describes a new method of data encoding which may be used in various modern digital, computer and telecommunication systems and devices. The method permits the compression of data for storage or transmission, allowing the exact…
A new method for constructing minimum-redundancy binary prefix codes is described. Our method does not explicitly build a Huffman tree; instead it uses a property of optimal prefix codes to compute the codeword lengths corresponding to the…
In this paper, we investigate binary reconstruction codes capable of correcting one deletion and one substitution. We define the single-deletion single-substitution ball function $\mathcal{B}$ as a mapping from a sequence to the…
Synthetic polymer-based storage seems to be a particularly promising candidate that could help meet the ever-increasing demand for archival storage. It involves designing molecules of distinct masses to represent the…
Huffman coding finds an optimal prefix code for a given probability mass function. Consider situations in which one wishes to find an optimal code with the restriction that all codewords have lengths that lie in a user-specified set of…
We generalize the problem of reconstructing strings from their substring compositions, first introduced by Acharya et al. in 2015 and motivated by polymer-based advanced data storage systems utilizing mass spectrometry. Namely, we see strings as…
Consider two or more strings $\mathbf{x}^1,\mathbf{x}^2,\ldots,$ that are concatenated to form $\mathbf{x}=\langle \mathbf{x}^1,\mathbf{x}^2,\ldots \rangle$. Suppose that up to $\delta$ deletions occur in each of the concatenated strings.…