Related papers: Mining Patterns with a Balanced Interval
Sequential sampling occurs when the entire population is not known in advance and data are obtained one at a time or in groups of units. This manuscript proposes a new algorithm to sequentially select a balanced sample. The algorithm…
This work addresses the problem of assigning periodic tasks to workers in a balanced way, i.e., so that each worker performs every task with the same frequency over the long term. The input consists of a list of tasks to be repeated weekly…
Mobile phone data -- with file sizes scaling into terabytes -- easily overwhelm the computational capacity available to some researchers. Moreover, for ethical reasons, data access is often granted only to particular subsets, restricting…
A prediction interval covers a future observation from a random process in repeated sampling, and is typically constructed by identifying a pivotal quantity that is also an ancillary statistic. Analogously, a tolerance interval covers a…
Given a log and a specification, timed pattern matching aims at exhibiting for which start and end dates a specification holds on that log. For example, "a given action is always followed by another action before a given deadline". This…
We study the frequentist properties of confidence intervals computed by the method known to statisticians as the Profile Likelihood. It is seen that the coverage of these intervals is surprisingly good over a wide range of possible…
The ordinal pattern of a fixed number of consecutive values in a time series is the spatial ordering of these values. Counting how often a specific ordinal pattern occurs in a time series provides important insights into the properties of…
As AI systems develop in complexity it is becoming increasingly hard to ensure non-discrimination on the basis of protected attributes such as gender, age, and race. Many recent methods have been developed for dealing with this issue as…
Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and…
We propose a new measure of support (the number of occurrences of a pattern), in which instances are more important if they occur with a certain frequency and close after each other in the stream of transactions. We will explain this…
Many organisations manage service quality and monitor a large set of devices and servers where each entity is associated with telemetry or physical sensor data series. Recently, various methods have been proposed to detect behavioural…
The chronicle of prime numbers travels back thousands of years in human history. Not only have the traits of prime numbers surprised people, but also all those endeavors made for ages to find a pattern in the appearance of prime numbers has…
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact…
As one of the most commonly seen data challenges, missing data, in particular, multiple, non-monotone missing patterns, complicates estimation and inference due to the fact that missingness mechanisms are often not missing at random, and…
A transaction database contains a set of transactions along with items and their associated timestamps. Transitional patterns are patterns that specify the dynamic behavior of frequent patterns in a transaction database. To discover…
We say that a diagonal in an array is $\lambda$-balanced if each entry occurs $\lambda$ times. Let $L$ be a frequency square of type $F(n;\lambda^m)$; that is, an $n\times n$ array in which each entry from $\{1,2,\dots ,m\}$ occurs…
Whereas confidence intervals are used to assess uncertainty due to unmeasured individuals, confounding intervals can be used to assess uncertainty due to unmeasured attributes. Previously, we have introduced a methodology for computing…
We say a string has a cadence if a certain character is repeated at regular intervals, possibly with intervening occurrences of that character. We call the cadence anchored if the first interval must be the same length as the others. We…
We consider the discrepancy problem of coloring $n$ intervals with $k$ colors such that at each point on the line, the maximal difference between the number of intervals of any two colors is minimal. Somewhat surprisingly, a coloring with…
Mining frequent itemsets is a popular method for finding associated items in databases. For this method, support, the co-occurrence frequency of the items which form an association, is used as the primary indicator of the association's…