Related papers: BinomialHash: A Constant Time, Minimal Memory Cons…
Consistent range-hashing is a technique used in distributed systems, either directly or as a subroutine for consistent hashing, commonly to realize an even and stable data distribution over a variable number of resources. We introduce…
Consistent hashing is a technique that can minimize key remapping when the number of hash buckets changes. The paper proposes a fast consistent hash algorithm (called power consistent hash) that has $O(1)$ expected time for key lookup,…
Consistent hashing is used in distributed systems and networking applications to spread data evenly and efficiently across a cluster of nodes. In this paper, we present MementoHash, a novel consistent hashing algorithm that eliminates known…
Consistent hashing (CH) has been pivotal as a data router and load balancer in diverse fields, including distributed databases, cloud infrastructure, and peer-to-peer networks. However, existing CH algorithms often fall short in…
Consistent Hashing functions are widely used for load balancing across a variety of applications. However, the original presentation and typical implementations of Consistent Hashing rely on randomised allocation of hash codes to keys which…
We present jump consistent hash, a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and…
Consistent hashing (CH) is a central building block in many networking applications, from datacenter load-balancing to distributed storage. Unfortunately, state-of-the-art CH solutions cannot ensure full consistency under arbitrary changes…
We describe a consistent hashing algorithm which performs multiple lookups per key in a hash table of nodes. It requires no additional storage beyond the hash table, and achieves a peak-to-average load ratio of 1 + epsilon with just 1 +…
Distributed systems often serve dynamic workloads and resource demands evolve over time. Such a temporal behavior stands in contrast to the static and demand-oblivious nature of most data structures used by these systems. In this paper, we…
Most cloud services and distributed applications rely on hashing algorithms that allow dynamic scaling of a robust and efficient hash table. Examples include AWS, Google Cloud and BitTorrent. Consistent and rendezvous hashing are algorithms…
In recent years, the distinctive advancement of handling huge data promotes the evolution of ubiquitous computing and analysis technologies. With the constantly upward system burden and computational complexity, adaptive coding has been a…
This paper proposes round-hashing, which is suitable for data storage on distributed servers and for implementing external-memory tables in which each lookup retrieves at most a single block of external memory, using a stash. For data…
Minimal perfect hash functions provide space-efficient and collision-free hashing on static sets. Existing algorithms and implementations that build such functions have practical limitations on the number of input elements they can process,…
Persistent homology is a popular and powerful tool for capturing topological features of data. Advances in algorithms for computing persistent homology have reduced the computation time drastically -- as long as the algorithm does not…
Dynamic load balancing lies at the heart of distributed caching. Here, the goal is to assign objects (load) to servers (computing nodes) in a way that provides load balancing while at the same time dynamically adjusts to the addition or…
The problem of fast items retrieval from a fixed collection is often encountered in most computer science areas, from operating system components to databases and user interfaces. We present an approach based on hash tables that focuses on…
Introduction. Distributed data processing and storage systems require efficient methods to distribute keys across buckets. While simple and fast, the traditional modulo-based mapping is unstable when the number of buckets changes, leading…
Persistent (co)homology is a central construction in topological data analysis, where it is used to quantify prominence of features in data to produce stable descriptors suitable for downstream analysis. Persistence is challenging to…
HalftimeHash is a new algorithm for hashing long strings. The goals are few collisions (different inputs that produce identical output hash values) and high performance. Compared to the fastest universal hash functions on long strings…
In the realm of big data and cloud computing, distributed systems are tasked with proficiently managing, storing, and validating extensive datasets across numerous nodes, all while maintaining robust data integrity. Conventional hashing…