Related papers: Generalizations of the Goulden-Jackson Cluster Met…
The powerful (and so far under-utilized) Goulden-Jackson Cluster method for finding the generating function for the number of words avoiding, as factors, the members of a prescribed set of 'dirty words', is tutorialized and extended in…
The Goulden-Jackson cluster method is a powerful tool for obtaining generating functions for counting words in a free monoid by occurrences of a set of subwords. We introduce a generalization of the cluster method for monoid networks, which…
We find generating functions for the number of words avoiding certain patterns or sets of patterns on at most 2 distinct letters and determine which of them are equally avoided. We also find the exact number of words avoiding certain…
The Goulden-Jackson cluster method is a powerful tool for counting words by occurrences of prescribed subwords, and was adapted by Elizalde and Noy for counting permutations by occurrences of prescribed consecutive patterns. In this paper,…
We find exact formulas and/or generating functions for the number of words avoiding 3-letter generalized multipermutation patterns and find which of them are equally avoided.
In a recent article a generalization of the binomial distribution associated with a sequence of positive numbers was examined. Analyzing the nonnegativity of the formal expressions was key to giving them a statistical…
We find generating functions for the number of strings (words) containing a specified number of occurrences of certain types of order-isomorphic classes of substrings called subword patterns. In particular, we find generating functions for the…
We develop a method for counting words subject to various restrictions by finding a combinatorial interpretation for a product of weighted sums of Laguerre polynomials with parameter α = -1. We describe how such a series can be…
We apply ideas from the cluster method to q-count the permutations of a multiset according to the number of occurrences of certain generalized patterns, as defined by Babson and Steingrimsson. In particular, we consider those patterns with…
We continue to consider the ordered lexicographic sequence, which is constructed according to the formal characteristics of a series of natural numbers. For analysis, we selected balanced parentheses with zeros, Motzkin words. As you know,…
We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest distortion sets of clusters. As the…
The paper introduces the concept of a cluster structure to define a joint distribution of the sample size and its exchangeable random partitions. The cluster structure allows the probability distribution of the random partitions of a subset…
We present generative clustering (GC) for clustering a set of documents, X, by using texts Y generated by large language models (LLMs) instead of by clustering the original documents X. Because LLMs…
Recently, Babson and Steingrimsson (see [BS]) introduced generalized permutation patterns that allow the requirement that two adjacent letters in a pattern must be adjacent in the permutation. We study generating functions for the number…
We construct generating trees with one, two, and three labels for some classes of permutations avoiding generalized patterns of length 3 and 4. These trees are built by adding at each level an entry to the right end of the permutation,…
This paper presents a new Bayesian non-parametric model by extending the usage of Hierarchical Dirichlet Allocation to extract tree structured word clusters from text data. The inference algorithm of the model collects words in a cluster if…
An unconstrained crossword puzzle is a generalization of the constrained crossword problem. In this problem, only the word vocabulary, and optionally the grid dimensions are known. Hence, it not only requires the algorithm to determine the…
Goulden and Jackson introduced a very powerful method to study the distributions of certain consecutive patterns in permutations, words, and other combinatorial objects which is now called the cluster method. There are a number of natural…
We present a new recursive generation algorithm for prefix normal words. These are binary strings with the property that no substring has more 1s than the prefix of the same length. The new algorithm uses two operations on binary strings,…
Consider the situation where a word is chosen probabilistically from a finite list. If an attacker knows the list and can inquire about each word in turn, then selecting the word via the uniform distribution maximizes the attacker's…