Related papers: PushdownDB: Accelerating a DBMS using S3 Computati…

Enhancing Computation Pushdown for Cloud OLAP Databases

Network is a major bottleneck in modern cloud databases that adopt a storage-disaggregation architecture. Computation pushdown is a promising solution to tackle this issue, which offloads some computation tasks to the storage layer to…

Databases · Computer Science 2024-01-02 Yifei Yang, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker

Novel Selectivity Estimation Strategy for Modern DBMS

Selectivity estimation is important in query optimization; however, accurate estimation is difficult when predicates are complex. Instead of relying on existing database synopses and statistics, which are not helpful for such cases, we introduce a new approach to…

Databases · Computer Science 2018-06-25 Jun Hyung Shin

Starling: A Scalable Query Engine on Cloud Function Services

Much like on-premises systems, the natural choice for running database analytics workloads in the cloud is to provision a cluster of nodes to run a database instance. However, analytics workloads are often bursty or low volume, leaving…

Databases · Computer Science 2019-11-27 Matthew Perron, Raul Castro Fernandez, David DeWitt, Samuel Madden

SharedDB: Killing One Thousand Queries With One Stone

Traditional database systems are built around the query-at-a-time model. This approach tries to optimize performance in a best-effort way. Unfortunately, best effort is not good enough for many modern applications. These applications…

Databases · Computer Science 2012-03-02 Georgios Giannikis, Gustavo Alonso, Donald Kossmann

Push Down Optimization for Distributed Multi Cloud Data Integration

Enterprises increasingly adopt multi cloud architectures to take advantage of diverse database engines, regional availability, and cost models. In these environments, ETL pipelines must process large, distributed datasets while minimizing…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-27 Ravi Kiran Kodali, Vinoth Punniyamoorthy, Akash Kumar Agarwal, Bikesh Kumar, Balakrishna Pothineni, Aswathnarayan Muthukrishnan Kirubakaran, Sumit Saha, Nachiappan Chockalingam

Does query performance optimization lead to energy efficiency? A comparative analysis of energy efficiency of database operations under different workload scenarios

With the continuous increase of online services as well as energy costs, energy consumption becomes a significant cost factor for the evaluation of data center operations. A significant contributor to that is the performance of database…

Databases · Computer Science 2013-03-21 Raik Niemann, Nikolaos Korfiatis, Roberto Zicari, Richard Göbel

Disaggregated Database Management Systems

Modern applications demand high performance and cost efficient database management systems (DBMSs). Their workloads may be diverse, ranging from online transaction processing to analytics and decision support. The cloud infrastructure…

Databases · Computer Science 2024-11-05 Shahram Ghandeharizadeh, Philip A. Bernstein, Dhruba Borthakur, Haoyu Huang, Jai Menon, Sumit Puri

Faster Algorithms for Weighted Recursive State Machines

Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b)…

Programming Languages · Computer Science 2020-01-13 Krishnendu Chatterjee, Bernhard Kragl, Samarth Mishra, Andreas Pavlogiannis

P4DB -- The Case for In-Network OLTP (Extended Technical Report)

In this paper we present a new approach for distributed DBMSs called P4DB, that uses a programmable switch to accelerate OLTP workloads. The main idea of P4DB is that it implements a transaction processing engine on top of a P4-programmable…

Databases · Computer Science 2022-06-02 Matthias Jasny, Lasse Thostrup, Tobias Ziegler, Carsten Binnig

ArcaDB: A Container-based Disaggregated Query Engine for Heterogenous Computational Environments

Modern enterprises rely on data management systems to collect, store, and analyze vast amounts of data related with their operations. Nowadays, clusters and hardware accelerators (e.g., GPUs, TPUs) have become a necessity to scale with the…

Databases · Computer Science 2023-11-28 Kristalys Ruiz-Rohena, Manuel Rodriguez-Martinez

Efficient Compactions Between Storage Tiers with PrismDB

In recent years, emerging storage hardware technologies have focused on divergent goals: better performance or lower cost-per-bit. Correspondingly, data systems that employ these technologies are typically optimized either to be fast (but…

Databases · Computer Science 2022-05-27 Ashwini Raina, Jianan Lu, Asaf Cidon, Michael J. Freedman

Cache-based Multi-query Optimization for Data-intensive Scalable Computing Frameworks

In modern large-scale distributed systems, analytics jobs submitted by various users often share similar work, for example scanning and processing the same subset of data. Instead of optimizing jobs independently, which may result in…

Databases · Computer Science 2018-05-23 Pietro Michiardi, Damiano Carra, Sara Migliorini

Predictive Indexing

There has been considerable research on automated index tuning in database management systems (DBMSs). But the majority of these solutions tune the index configuration by retrospectively making computationally expensive physical design…

Databases · Computer Science 2019-01-23 Joy Arulraj, Ran Xian, Lin Ma, Andrew Pavlo

FDB: A Query Engine for Factorised Relational Databases

Factorised databases are relational databases that use compact factorised representations at the physical layer to reduce data redundancy and boost query performance. This paper introduces FDB, an in-memory query engine for…

Databases · Computer Science 2012-03-14 Nurzhan Bakibayev, Dan Olteanu, Jakub Závodný

Catapults to the Rescue: Accelerating Vector Search by Exploiting Query Locality

Graph-based indexing is the dominant approach for approximate nearest neighbor search in vector databases, offering high recall with low latency across billions of vectors. However, in such indices, the edge set of the proximity graph is…

Databases · Computer Science 2026-03-03 Sami Abuzakuk, Anne-Marie Kermarrec, Rafael Pires, Mathis Randl, Martijn de Vos

Scalable Neural Data Server: A Data Recommender for Transfer Learning

Absence of large-scale labeled data in the practitioner's target domain can be a bottleneck to applying machine learning algorithms in practice. Transfer learning is a popular strategy for leveraging additional data to improve the…

Machine Learning · Computer Science 2022-06-22 Tianshi Cao, Sasha Doubov, David Acuna, Sanja Fidler

Why Did My Query Slow Down?

Many enterprise environments have databases running on network-attached server-storage infrastructure (referred to as Storage Area Networks or SANs). Both the database and the SAN are complex systems that need their own separate…

Databases · Computer Science 2011-10-25 Nedyalko Borisov, Shivnath Babu, Sandeep Uttamchandani, Ramani Routray, Aameek Singh

Compression Aware Physical Database Design

Modern RDBMSs support the ability to compress data using methods such as null suppression and dictionary encoding. Data compression offers the promise of significantly reducing storage requirements and improving I/O performance for decision…

Databases · Computer Science 2011-09-06 Hideaki Kimura, Vivek Narasayya, Manoj Syamala

The Tensor Data Platform: Towards an AI-centric Database System

Database engines have historically absorbed many of the innovations in data processing, adding features to process graph data, XML, object oriented, and text among many others. In this paper, we make the case that it is time to do the same…

Databases · Computer Science 2022-11-21 Apurva Gandhi, Yuki Asada, Victor Fu, Advitya Gemawat, Lihao Zhang, Rathijit Sen, Carlo Curino, Jesús Camacho-Rodríguez, Matteo Interlandi

TUSQ: Targeted High-Utility Sequence Querying

Significant efforts have been expended in the research and development of a database management system (DBMS) that has a wide range of applications for managing an enormous collection of multisource, heterogeneous, complex, or growing data.…

Databases · Computer Science 2021-04-01 Chunkai Zhang, Zilin Du, Quanjian Dai, Wensheng Gan, Jian Weng, Philip S. Yu