Related papers: Efficient Implementation of the Generalized Tunsta…
Trees are useful entities allowing to model data structures and hierarchical relationships in networked decision systems ubiquitously. An ordered tree is a rooted tree where the order of the subtrees (children) of a node is significant. In…
A run in a string is a maximal periodic substring. For example, the string $\texttt{bananatree}$ contains the runs $\texttt{anana} = (\texttt{an})^{3/2}$ and $\texttt{ee} = \texttt{e}^2$. There are less than $n$ runs in any length-$n$…
We present the first solution to $\tau$-majorities on tree paths. Given a tree of $n$ nodes, each with a label from $[1..\sigma]$, and a fixed threshold $0<\tau<1$, such a query gives two nodes $u$ and $v$ and asks for all the labels that…
The authors recently gave an $n^{O(\log\log n)}$ time membership query algorithm for properly learning decision trees under the uniform distribution (Blanc et al., 2021). The previous fastest algorithm for this problem ran in $n^{O(\log…
Suffix trees are key and efficient data structure for solving string problems. A suffix tree is a compressed trie containing all the suffixes of a given text of length $n$ with a linear construction cost. In this work, we introduce an…
The Cover Suffix Tree (CST) of a string $T$ is the suffix tree of $T$ with additional explicit nodes corresponding to halves of square substrings of $T$. In the CST an explicit node corresponding to a substring $C$ of $T$ is annotated with…
The suffix trees are fundamental data structures for various kinds of string processing. The suffix tree of a text string $T$ of length $n$ has $O(n)$ nodes and edges, and the string label of each edge is encoded by a pair of positions in…
Given two rooted, ordered, and labeled trees $P$ and $T$ the tree inclusion problem is to determine if $P$ can be obtained from $T$ by deleting nodes in $T$. This problem has recently been recognized as an important query primitive in XML…
The degree distribution of an ordered tree $T$ with $n$ nodes is $\vec{n} = (n_0,\ldots,n_{n-1})$, where $n_i$ is the number of nodes in $T$ with $i$ children. Let $\mathcal{N}(\vec{n})$ be the number of trees with degree distribution…
Probabilistic programming languages and other machine learning applications often require samples to be generated from a categorical distribution where the probability of each one of $n$ categories is specified as a parameter. If the…
Assuming Zipf's Law to be accurate, we show that existing techniques for partially optimizing binary trees produce results that are approximately 10% worse than true optimal. We present a new approximate optimization technique that runs in…
In this paper we propose a dynamic data structure that supports efficient algorithms for updating and querying singly connected Bayesian networks (causal trees and polytrees). In the conventional algorithms, new evidence in absorbed in time…
We suggest a new non-recursive algorithm for constructing a binary search tree given an array of numbers. The algorithm has $O(N)$ time and $O(1)$ memory complexity if the given array of $N$ numbers is sorted. The resulting tree is of…
Net-trees are a general purpose data structure for metric data that have been used to solve a wide range of algorithmic problems. We give a simple randomized algorithm to construct net-trees on doubling metrics using $O(n\log n)$ time in…
We consider the problem of uniformly generating a spanning tree, of a connected undirected graph. This process is useful to compute statistics, namely for phylogenetic trees. We describe a Markov chain for producing these trees. For cycle…
We give an $n^{O(\log\log n)}$-time membership query algorithm for properly and agnostically learning decision trees under the uniform distribution over $\{\pm 1\}^n$. Even in the realizable setting, the previous fastest runtime was…
We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in $\tilde{O}(n^{4/3}m^{1/2}+n^{2})$ time (The $\tilde{O}(\cdot)$ notation hides $\operatorname{polylog}(n)$…
Karloff? and Shirley recently proposed summary trees as a new way to visualize large rooted trees (Eurovis 2013) and gave algorithms for generating a maximum-entropy k-node summary tree of an input n-node rooted tree. However, the algorithm…
An L(2,1)-labeling of a graph $G$ is an assignment $f$ from the vertex set $V(G)$ to the set of nonnegative integers such that $|f(x)-f(y)|\ge 2$ if $x$ and $y$ are adjacent and $|f(x)-f(y)|\ge 1$ if $x$ and $y$ are at distance 2, for all…
The problem of constructing optimal factoring automata arises in the context of unification factoring for the efficient execution of logic programs. Given an ordered set of $n$ strings of length $m$, the problem is to construct a trie-like…