A High Throughput Parallel Hash Table on FPGA using XOR-based Memory

Distributed, Parallel, and Cluster Computing 2021-08-24 v2

Abstract

The hash table is a fundamental data structure for fast search and retrieval of data, and a key component in complex graph analytics and AI/ML applications. State-of-the-art parallel hash table implementations either make simplifying assumptions, such as supporting only a subset of hash table operations, or employ optimizations whose performance is highly data dependent and, in the worst case, similar to that of a sequential implementation. In contrast, in this work we develop a dynamic hash table that supports all hash table queries (search, insert, delete, and update) and sustains p parallel queries (p > 1) per clock cycle via p processing engines (PEs) even in the worst case, i.e., its performance is data agnostic. We achieve this by implementing novel XOR-based multi-ported block memories on FPGAs. Additionally, we develop a technique to reduce the memory requirement of the hash table when the ratio of search queries to insert/update/delete queries is known beforehand. We implement our design on state-of-the-art FPGA devices. Our design scales to 16 PEs and supports a throughput of up to 5926 MOPS. It matches the throughput of FASTHash, the state-of-the-art hash table design, which supports only search and insert operations. Compared with the best FPGA design that supports the same set of operations, our hash table achieves up to a 12.3x speedup.
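
The abstract does not spell out how the XOR-based multi-ported memory works. As a rough illustration only, the following is a minimal software sketch of the general XOR-based multi-port technique, assuming one bank per write port; it is not the authors' FPGA design. The class name `XorMultiPortMemory` and its methods are hypothetical. Each write port stores its value XORed with the contents of the other banks at the same address, so XORing all banks at an address recovers the most recently written value.

```python
class XorMultiPortMemory:
    """Software model of a generic XOR-based multi-write-port memory.

    Assumption: one bank per write port. Hardware would also replicate
    banks to provide enough read ports; that replication is not modeled.
    """

    def __init__(self, depth, write_ports):
        # banks[i] may only be written by write port i.
        self.banks = [[0] * depth for _ in range(write_ports)]

    def read(self, addr):
        # The stored value is the XOR of all banks at this address.
        value = 0
        for bank in self.banks:
            value ^= bank[addr]
        return value

    def write_cycle(self, writes):
        """Apply simultaneous writes given as {port: (addr, value)}."""
        addrs = [addr for addr, _ in writes.values()]
        if len(set(addrs)) != len(addrs):
            raise ValueError("two ports wrote the same address in one cycle")

        # Phase 1: every port encodes its value against the OLD contents
        # of the other banks, as parallel hardware would.
        staged = []
        for port, (addr, value) in writes.items():
            encoded = value
            for other, bank in enumerate(self.banks):
                if other != port:
                    encoded ^= bank[addr]
            staged.append((port, addr, encoded))

        # Phase 2: commit all writes, each to the writing port's own bank.
        for port, addr, encoded in staged:
            self.banks[port][addr] = encoded


if __name__ == "__main__":
    mem = XorMultiPortMemory(depth=8, write_ports=4)
    # Four ports write four different addresses in the same cycle.
    mem.write_cycle({0: (1, 0xAA), 1: (2, 0xBB), 2: (3, 0xCC), 3: (4, 0xDD)})
    # A different port later overwrites address 1.
    mem.write_cycle({2: (1, 0x11)})
    print([hex(mem.read(a)) for a in range(1, 5)])
```

Because each port writes only its own bank, no bank ever needs more than one write port, which is what makes the scheme a natural fit for block memories with a limited number of ports. How same-address conflicts between PEs are resolved is a separate design question; this sketch simply rejects them.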

Cite

@article{arxiv.2108.03390,
  title  = {A High Throughput Parallel Hash Table on FPGA using XOR-based Memory},
  author = {Ruizhi Zhang and Sasindu Wijeratne and Yang Yang and Sanmukh R. Kuppannagari and Viktor K. Prasanna},
  journal= {arXiv preprint arXiv:2108.03390},
  year   = {2021}
}

Comments

2020 IEEE High Performance Extreme Computing Conference (HPEC)