Related papers: Fast Set Intersection and Two Patterns Matching
Set intersection is a fundamental operation in information retrieval and database systems. This paper introduces linear space data structures to represent sets such that their intersection can be computed in a worst-case efficient way. In…
We study data structure problems related to document indexing and pattern matching queries and our main contribution is to show that the pointer machine model of computation can be extremely useful in proving high and unconditional lower…
We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern…
A variant of the well-known Set Covering Problem is studied in this paper, where subsets of a collection have to be selected, and pairwise conflicts among subsets of items exist. The selection of each subset has a cost, and the inclusion of…
A hypergraph is a generalization of a graph, in which a hyperedge can connect multiple vertices, modeling complex relationships involving multiple vertices simultaneously. Hypergraph pattern matching, which is to find all isomorphic…
Fast matching of regular expressions with bounded repetition, aka counting, such as (ab){50,100}, i.e., matching linear in the length of the text and independent of the repetition bounds, has been an open problem for at least two decades.…
In this paper, we revisit the much studied problem of Pattern Matching with Swaps (Swap Matching problem, for short). We first present a graph-theoretic model, which opens a new and so far unexplored avenue to solve the problem. Then, using…
Consider a family of sets and a single set, called the query set. How can one quickly find a member of the family which has a maximal intersection with the query set? Time constraints on the query and on a possible preprocessing of the set…
Set intersection is the core in a variety of problems, e.g. frequent itemset mining and sparse boolean matrix multiplication. It is well-known that large speed gains can, for some computational problems, be obtained by using a graphics…
Given a pattern x of length m and a text y of length n, both over an ordered alphabet, the order-preserving pattern matching problem consists in finding all substrings of the text with the same relative order as the pattern. It is an…
We show how to represent sets in a linear space data structure such that expressions involving unions and intersections of sets can be computed in a worst-case efficient way. This problem has applications in e.g. information retrieval and…
Term pattern matching is the problem of finding all pattern matches in a subject term, given a set of patterns. Finding efficient algorithms for this problem is an important direction for research [19]. We present a new set automaton…
This paper presents a new combinatorial optimisation task, the Subset Sum Matching Problem (SSMP), which is an abstraction of common financial applications such as trades reconciliation. We present three algorithms, two suboptimal and one…
We describe the recently introduced extremal optimization algorithm and apply it to target detection and association problems arising in pre-processing for multi-target tracking. Here we consider the problem of pre-processing for multiple…
Compression of inverted lists with methods that support fast intersection operations is an active research topic. Most compression schemes rely on encoding differences between consecutive positions with techniques that favor small numbers.…
The pattern matching problem with swaps is to find all occurrences of a pattern in a text while allowing the pattern to swap adjacent symbols. The goal is to design fast matching algorithm that takes advantage of the bit parallelism of…
In the SetDisjointness problem, a collection of $m$ sets $S_1,S_2,...,S_m$ from some universe $U$ is preprocessed in order to answer queries on the emptiness of the intersection of some two query sets from the collection. In the…
The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in S), reporting…
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…
We introduce a new combinatorial structure: the superselector. We show that superselectors subsume several important combinatorial structures used in the past few years to solve problems in group testing, compressed sensing, multi-channel…