Related papers: The Random Buffer Tree : A Randomized Technique fo…
In a variety of applications, we need to keep track of the development of a data set over time. For maintaining and querying this multi version data I/O-efficiently, external memory data structures are required. In this paper, we present a…
Containment-based trees encompass various handy structures such as B+-trees, R-trees and M-trees. They are widely used to build data indexes, range-queryable overlays, and publish/subscribe systems in both centralized and distributed contexts.…
A data structure, called a biased range tree, is presented that preprocesses a set S of n points in R^2 and a query distribution D for 2-sided orthogonal range counting queries. The expected query time for this data structure, when queries…
We present an optimal partially-persistent external-memory search tree with amortized I/O bounds matching those achieved by the non-persistent $B^{\varepsilon}$-tree by Brodal and Fagerberg [SODA 2003]. In a partially-persistent data…
Neural networks and decision trees, two popular techniques for supervised learning that are seemingly disconnected in their formulation and optimization method, have recently been combined in a single construct. The connection pivots on…
This paper presents a new kind of self-balancing ternary search trie that uses a randomized balancing strategy adapted from Aragon and Seidel's randomized binary search trees ("treaps"). After any sequence of insertions and deletions of…
Nowadays, multiprocessing is mainstream, with an exponentially increasing number of processors. Load balancing is, therefore, a critical operation for the efficient execution of parallel algorithms. In this paper we consider the fundamental…
The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has rapidly grown in an attempt to better control and fix…
Tree-based models have proven to be an effective solution for web ranking as well as other problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, given an…
The Binary Search Tree (BST) is common in computer science: it supports a compact data structure in memory and enables a range of fast algorithms, so it is often applied in dynamic settings. Besides these…
A treap is a classic randomized binary search tree data structure that is easy to implement and supports $O(\log n)$ expected time access. However, classic treaps do not take advantage of the input distribution or patterns in the input. Given…
This paper presents a detailed comparison of a recently proposed algorithm for optimizing decision trees, tree alternating optimization (TAO), with other popular, established algorithms. We compare their performance on a number of…
Tree search algorithms, such as branch-and-bound, are the most widely used tools for solving combinatorial and nonconvex problems. For example, they are the foremost method for solving (mixed) integer programs and constraint satisfaction…
We show how to construct a dynamic ordered dictionary, supporting insert/delete/rank/select on a set of $n$ elements from a universe of size $U$, that achieves the optimal amortized expected time complexity of $O(1 + \log n / \log \log U)$,…
Random Forests are one of the most popular classifiers in machine learning. The larger they are, the more precise is the outcome of their predictions. However, this comes at a cost: their running time for classification grows linearly with…
Net-trees are a general purpose data structure for metric data that have been used to solve a wide range of algorithmic problems. We give a simple randomized algorithm to construct net-trees on doubling metrics using $O(n\log n)$ time in…
Efficient resource allocation is a key challenge in modern cloud computing. Over-provisioning leads to unnecessary costs, while under-provisioning risks performance degradation and SLA violations. This work presents an artificial…
The performance of classification algorithms on a massive and highly imbalanced data stream depends upon an efficient balancing strategy. Some balancing techniques have been applied in the past to batch data to resolve the…
We consider the problem of automatically proving resource bounds. That is, we study how to prove that an integer-valued resource variable is bounded by a given program expression. Automatic resource-bound analysis has recently received…
Compact and I/O-efficient data representations play an important role in efficient algorithm design, as memory bandwidth and latency can present a significant performance bottleneck, slowing the computation by orders of magnitude. While…