Related papers: Improving Seek Time for Column Store Using MMH Alg…

Fast and Scalable Memristive In-Memory Sorting with Column-Skipping Algorithm

Memristive in-memory sorting has been proposed recently to improve hardware sorting efficiency. Using iterative in-memory min computations, data movements between memory and external processing units can be eliminated for improved latency…

Hardware Architecture · Computer Science 2022-02-22 Lianfeng Yu, Zhaokun Jing, Yuchao Yang, Yaoyu Tao

Hybrid Materialization in a Disk-Based Column-Store

In column-oriented query processing, a materialization strategy determines when lightweight positions (row IDs) are translated into tuples. It is an important part of column-store architecture, since it defines the class of supported query…

Databases · Computer Science 2023-04-19 Evgeniy Klyuchikov, Elena Mikhailova, George Chernishev

Learning Hash Functions Using Column Generation

Fast nearest neighbor searching is becoming an increasingly important tool in solving many large-scale problems. Recently a number of approaches to learning data-dependent hash functions have been developed. In this work, we propose a…

Machine Learning · Computer Science 2013-03-05 Xi Li, Guosheng Lin, Chunhua Shen, Anton van den Hengel, Anthony Dick

High Throughput Push Based Storage Manager

The storage manager, as a key component of the database system, is responsible for organizing, reading, and delivering data to the execution engine for processing. According to the data serving mechanism, existing storage managers are…

Databases · Computer Science 2019-05-20 Ye Zhu

Revisiting Data Compression in Column-Stores

Data compression is widely used in contemporary column-oriented DBMSes to lower space usage and to speed up query processing. Pioneering systems have introduced compression to tackle the disk bandwidth bottleneck by trading CPU processing…

Databases · Computer Science 2021-05-20 Alexander Slesarev, Evgeniy Klyuchikov, Kirill Smirnov, George Chernishev

Structured Learning of Binary Codes with Column Generation

Hashing methods aim to learn a set of hash functions which map the original features to compact binary codes with similarity preserving in the Hamming space. Hashing has proven a valuable tool for large-scale information retrieval. We…

Machine Learning · Computer Science 2016-02-23 Guosheng Lin, Fayao Liu, Chunhua Shen, Jianxin Wu, Heng Tao Shen

Efficient hybrid search algorithm on ordered datasets

The increase in the rate of data is much higher than the increase in the speed of computers, which results in a heavy emphasis on search algorithms in research literature. Searching an item in ordered list is an efficient operation in data…

Data Structures and Algorithms · Computer Science 2017-08-04 Adnan Saher Mohammed, Şahin Emrah Amrahov, Fatih V. Çelebi

A Genetic Algorithm for Obtaining Memory Constrained Near-Perfect Hashing

The problem of fast items retrieval from a fixed collection is often encountered in most computer science areas, from operating system components to databases and user interfaces. We present an approach based on hash tables that focuses on…

Neural and Evolutionary Computing · Computer Science 2020-07-17 Dan Domnita, Ciprian Oprisa

Boosting Multi-Core Reachability Performance with Shared Hash Tables

This paper focuses on data structures for multi-core reachability, which is a key component in model checking algorithms and other verification methods. A cornerstone of an efficient solution is the storage of visited states. In related…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-05-06 Alfons Laarman, Jaco van de Pol, Michael Weber

Finding a Second Wind: Speeding Up Graph Traversal Queries in RDBMSs Using Column-Oriented Processing

Recursive queries and recursive derived tables constitute an important part of the SQL standard. Their efficient processing is important for many real-life applications that rely on graph or hierarchy traversal. Position-enabled…

Databases · Computer Science 2023-08-21 Mikhail Firsov, Michael Polyntsov, Kirill Smirnov, George Chernishev

Scalable Locality-Sensitive Hashing for Similarity Search in High-Dimensional, Large-Scale Multimedia Datasets

Similarity search is critical for many database applications, including the increasingly popular online services for Content-Based Multimedia Retrieval (CBMR). These services, which include image search engines, must handle an overwhelming…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-16 Thiago S. F. X. Teixeira, George Teodoro, Eduardo Valle, Joel H. Saltz

Deep Memory Search: A Metaheuristic Approach for Optimizing Heuristic Search

Metaheuristic search methods have proven to be essential tools for tackling complex optimization challenges, but their full potential is often constrained by conventional algorithmic frameworks. In this paper, we introduce a novel approach…

Artificial Intelligence · Computer Science 2024-10-23 Abdel-Rahman Hedar, Alaa E. Abdel-Hakim, Wael Deabes, Youseef Alotaibi, Kheir Eddine Bouazza

Evaluating Memento Service Optimizations

Services and applications based on the Memento Aggregator can suffer from slow response times due to the federated search across web archives performed by the Memento infrastructure. In an effort to decrease the response times, we…

Information Retrieval · Computer Science 2019-06-04 Martin Klein, Lyudmila Balakireva, Harihar Shankar

Processing a Trillion Cells per Mouse Click

Column-oriented database systems have been a real game changer for the industry in recent years. Highly tuned and performant systems have evolved that provide users with the possibility of answering ad hoc queries over large datasets in an…

Databases · Computer Science 2012-08-02 Alexander Hall, Olaf Bachmann, Robert Büssow, Silviu Gănceanu, Marc Nunkesser

A Storage Advisor for Hybrid-Store Databases

With the SAP HANA database, SAP offers a high-performance in-memory hybrid-store database. Hybrid-store databases (that is, databases supporting row- and column-oriented data management) are getting more and more prominent. While the…

Databases · Computer Science 2012-08-22 Philipp Rösch, Lars Dannecker, Gregor Hackenbroich, Franz Faerber

Performance Prediction for Coarse-Grained Locking: MCS Case

A standard design pattern found in many concurrent data structures, such as hash tables or ordered containers, is alternation of parallelizable sections that incur no data conflicts and critical sections that must run sequentially and are…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-13 Vitaly Aksenov, Daniil Bolotov, Petr Kuznetsov

Skip Hash: A Fast Ordered Map Via Software Transactional Memory

Scalable ordered maps must ensure that range queries, which operate over many consecutive keys, provide intuitive semantics (e.g., linearizability) without degrading the performance of concurrent insertions and removals. These goals are…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-11 Matthew Rodriguez, Vitaly Aksenov, Michael Spear

Column-Oriented Storage Techniques for MapReduce

Users of MapReduce often run into performance problems when they scale up their workloads. Many of the problems they encounter can be overcome by applying techniques learned from over three decades of research on parallel DBMSs. However,…

Databases · Computer Science 2011-05-24 Avrilia Floratou, Jignesh Patel, Eugene Shekita, Sandeep Tata

Perfect Hashing for Data Management Applications

Perfect hash functions can potentially be used to compress data in connection with a variety of data management tasks. Though there has been considerable work on how to construct good perfect hash functions, there is a gap between theory…

Data Structures and Algorithms · Computer Science 2007-05-23 Fabiano C. Botelho, Rasmus Pagh, Nivio Ziviani

MementoHash: A Stateful, Minimal Memory, Best Performing Consistent Hash Algorithm

Consistent hashing is used in distributed systems and networking applications to spread data evenly and efficiently across a cluster of nodes. In this paper, we present MementoHash, a novel consistent hashing algorithm that eliminates known…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-28 Massimo Coluzzi, Amos Brocco, Alessandro Antonucci, Tiziano Leidi