Related papers: Hippo: A Fast, yet Scalable, Database Indexing App…
Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes…
Indexes are the best apposite choice for quickly retrieving the records. This is nothing but cutting down the number of Disk IO. Instead of scanning the complete table for the results, we can decrease the number of IO's or page fetches…
Index structures are one of the most important tools that DBAs leverage to improve the performance of analytics and transactional workloads. However, building several indexes over large datasets can often become prohibitive and consume…
It is commonly accepted in the practice of on-line analytical processing of databases that the multidimensional database organization is less scalable than the relational one. It is easy to see that the size of the multidimensional…
Hive is the most mature and prevalent data warehouse tool providing SQL-like interface in the Hadoop ecosystem. It is successfully used in many Internet companies and shows its value for big data processing in traditional industries.…
Indexes are critical for efficient data retrieval and updates in modern databases. Recent advances in machine learning have led to the development of learned indexes, which model the cumulative distribution function of data to predict…
Indexes can significantly improve search performance in relational databases. However, if the query workload changes frequently or new data updates occur continuously, it may not be worthwhile to build a conventional index upfront for query…
With the prevalence of online platforms, today, data is being generated and accessed by users at a very high rate. Besides, applications such as stock trading or high frequency trading require guaranteed low delays for performing an…
Although many updatable learned indexes have been proposed in recent years, whether they can outperform traditional approaches on disk remains unknown. In this study, we revisit and implement four state-of-the-art updatable learned indexes…
While in-memory learned indexes have shown promising performance as compared to B+-tree, most widely used databases in real applications still rely on disk-based operations. Based on our experiments, we observe that directly applying the…
Modern mainstream persistent key-value storage engines utilize Log-Structured Merge tree (LSM-tree) based designs, optimizing read/write performance by leveraging sequential disk I/O. However, the advent of SSDs, with their significant…
One of the distinctive features of Information Retrieval systems comparing to Database Management systems, is that they offer better compression for posting lists, resulting in better I/O performance and thus faster query evaluation. In…
Modern business applications and scientific databases call for inherently dynamic data storage environments. Such environments are characterized by two challenging features: (a) they have little idle system time to devote on physical…
Hash tables are an essential data-structure for numerous networking applications (e.g., connection tracking, firewalls, network address translators). Among these, cuckoo hash tables provide excellent performance by allowing lookups to be…
Previous research addressed the potential problems of the hard-disk oriented design of DBMSs of flashSSDs. In this paper, we focus on exploiting potential benefits of flashSSDs. First, we examine the internal parallelism issues of flashSSDs…
Index access is one of the dominant performance factors in transactional database systems. Many systems use a B+-tree or one of its variants to handle point and range operations. This access pattern has room for performance improvement.…
Index plays an essential role in modern database engines to accelerate the query processing. The new paradigm of "learned index" has significantly changed the way of designing index structures in DBMS. The key insight is that indexes could…
In-memory data management systems, such as key-value stores, have become an essential infrastructure in today's big-data processing and cloud computing. They rely on efficient index structures to access data. While unordered indexes, such…
Multidimensional data are becoming more prevalent, partly due to the rise of the Internet of Things (IoT), and with that the need to ingest and analyze data streams at rates higher than before. Some industrial IoT applications require…
In database systems, joins are often expensive despite many years of research producing numerous join algorithms. Precomputed and materialized join views deliver the best query performance, whereas traditional indexes, used as pre-sorted…