Related papers: A Note on Nested String Replacements

A Novel Algorithm for String Matching with Mismatches

We present an online algorithm for pattern matching in strings. The problem we investigate is commonly known as string matching with mismatches, in which the objective is to report the number of characters that match when a pattern…

Data Structures and Algorithms · Computer Science 2016-03-11 Vinodprasad P

String Consensus Problems with Swaps and Substitutions

String consensus problems aim at finding a string that minimizes some given distance with respect to an input set of strings. In particular, in the Closest string problem, we are given a set of strings of equal length and a radius $d$. The…

Data Structures and Algorithms · Computer Science 2025-07-29 Estéban Gabory, Laurent Bulteau, Gabriele Fici, Hilde Verbeek

Efficient Online String Matching Based on Characters Distance Text Sampling

Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. Sampled string…

Data Structures and Algorithms · Computer Science 2019-08-19 Simone Faro, Arianna Pavone, Francesco Pio Marino

Computer-Aided Multi-Stroke Character Simplification by Stroke Removal

Multi-stroke characters in scripts such as Chinese and Japanese can be highly complex, posing significant challenges for both native speakers and, especially, non-native learners. If these characters can be simplified without degrading…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Ryo Ishiyama, Shinnosuke Matsuo, Seiichi Uchida

Many Flavors of Edit Distance

Several measures exist for string similarity, including notable ones like the edit distance and the indel distance. The former measures the count of insertions, deletions, and substitutions required to transform one string into another,…

Data Structures and Algorithms · Computer Science 2024-10-15 Sudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg, Michal Koucký

A Simple Reduction for Full-Permuted Pattern Matching Problems on Multi-Track Strings

In this paper we study a variant of string pattern matching which deals with tuples of strings known as multi-track strings. Multi-track strings are a generalisation of strings (or single-track strings) that have primarily…

Data Structures and Algorithms · Computer Science 2019-12-02 Carl Barton, Ewan Birney, Tomas Fitzgerald

Expected Number of Distinct Subsequences in Randomly Generated Binary Strings

When considering binary strings, it's natural to wonder how many distinct subsequences might exist in a given string. Given that there is an existing algorithm which provides a straightforward way to compute the number of distinct…

Combinatorics · Mathematics 2023-06-22 Yonah Biers-Ariel, Anant Godbole, Elizabeth Kelley

The Shift-Match Number and String Matching Probabilities for Binary Sequences

We define the "shift-match number" for a binary string and we compute the probability of occurrence of a given string as a subsequence in longer strings in terms of its shift-match number. We thus prove that the string matching…

Genomics · Quantitative Biology 2007-05-23 A. H. Bilge, A. Erzan, D. Balcan

An Implementation of Nested Pattern Matching in Interaction Nets

Reduction rules in interaction nets are constrained to pattern match exactly one argument at a time. Consequently, a programmer has to introduce auxiliary rules to perform more sophisticated matches. In this paper, we describe the design…

Logic in Computer Science · Computer Science 2010-04-08 Abubakar Hassan, Eugen Jiresch, Shinya Sato

The Number of Distinct Subsequences of a Random Binary String

We determine the average number of distinct subsequences in a random binary string, and derive an estimate for the average number of distinct subsequences of a particular length.

Combinatorics · Mathematics 2013-10-29 Michael J. Collins

A Fixed-Parameter Algorithm for Minimum Common String Partition with Few Duplications

Motivated by the study of genome rearrangements, the NP-hard Minimum Common String Partition problem asks, given two strings, to split both strings into an identical set of blocks. We consider an extension of this problem to unbalanced…

Data Structures and Algorithms · Computer Science 2013-08-02 Laurent Bulteau, Guillaume Fertin, Christian Komusiewicz, Irena Rusu

The Exact String Matching Problem: a Comprehensive Experimental Evaluation

This paper addresses the online exact string matching problem which consists in finding all occurrences of a given pattern p in a text t. It is an extensively studied problem in computer science, mainly due to its direct applications to…

Data Structures and Algorithms · Computer Science 2010-12-14 Simone Faro, Thierry Lecroq

String Cadences

We say a string has a cadence if a certain character is repeated at regular intervals, possibly with intervening occurrences of that character. We call the cadence anchored if the first interval must be the same length as the others. We…

Data Structures and Algorithms · Computer Science 2016-10-12 Amihood Amir, Alberto Apostolico, Travis Gagie, Gad M. Landau

Modulated String Searching

In his 1987 paper entitled "Generalized String Matching", Abrahamson introduced pattern matching with character classes and provided the first efficient algorithm to solve it. The best known solution to date is due to Linhart and…

Data Structures and Algorithms · Computer Science 2021-01-01 Alberto Apostolico, Péter L. Erdős, István Miklós, Johannes Siemons

Exact Online String Matching Bibliography

In this short note we present a comprehensive bibliography for the online exact string matching problem. The problem consists in finding all occurrences of a given pattern in a text. It is an extensively studied problem in computer science,…

Data Structures and Algorithms · Computer Science 2016-05-18 Simone Faro

Computing the Number of Longest Common Subsequences

This note provides very simple, efficient algorithms for computing the number of distinct longest common subsequences of two input strings and for computing the number of LCS embeddings.

Data Structures and Algorithms · Computer Science 2007-05-23 Ronald I. Greenberg

Optimal Transport-based Alignment of Learned Character Representations for String Similarity

String similarity models are vital for record linkage, entity resolution, and search. In this work, we present STANCE, a learned model for computing the similarity of two strings. Our approach encodes the characters of each string, aligns…

Machine Learning · Computer Science 2019-07-25 Derek Tam, Nicholas Monath, Ari Kobren, Aaron Traylor, Rajarshi Das, Andrew McCallum

Fast and Compact Regular Expression Matching

We study 4 problems in string matching, namely, regular expression matching, approximate regular expression matching, string edit distance, and subsequence indexing, on a standard word RAM model of computation that allows logarithmic-sized…

Data Structures and Algorithms · Computer Science 2008-09-22 Philip Bille, Martin Farach-Colton

Approximate String Matching: Theory and Applications (La Recherche Approchée de Motifs : Théorie et Applications)

The approximate string matching is a fundamental and recurrent problem that arises in most computer science fields. This problem can be defined as follows: Let $D=\{x_1,x_2,\ldots x_d\}$ be a set of $d$ words defined on an alphabet…

Data Structures and Algorithms · Computer Science 2017-01-31 Ibrahim Chegrane

Compressed String Dictionaries

The problem of storing a set of strings (a string dictionary) in compact form appears naturally in many cases. While classically it has represented a small part of the whole data to be processed (e.g., for Natural Language processing…

Data Structures and Algorithms · Computer Science 2011-01-31 Nieves R. Brisaboa, Rodrigo Cánovas, Miguel A. Martínez-Prieto, Gonzalo Navarro