Related papers: Fast Updates on Read-Optimized Databases Using Mul…
In-memory columnar databases have become mainstream over the last decade and have vastly improved the fast processing of large volumes of data through multi-core parallelism and in-memory compression thereby eliminating the usual…
Maintaining a dynamic $k$-core decomposition is an important problem that identifies dense subgraphs in dynamically changing graphs. Recent work by Liu et al. [SPAA 2022] presents a parallel batch-dynamic algorithm for maintaining an…
To process a large volume of data, modern data management systems use a collection of machines connected through a network. This paper looks into the feasibility of scaling up such a shared-nothing system while processing a compute- and…
This study proposes a novel storage engine, SynchroStore, designed to address the inefficiency of update operations in columnar storage systems based on Log-Structured Merge Trees (LSM-Trees) under hybrid workload scenarios. While columnar…
Main memory column-stores have proven to be efficient for processing analytical queries. Still, there has been much less work in the context of clusters. Using only a single machine poses several restrictions: Processing power and data…
Heterogeneous multi-core architectures combine a few "host" cores, optimized for single-thread performance, with many small energy-efficient "accelerator" cores for data-parallel processing, on a single chip. Offloading a computation to the…
Current main memory database system architectures are still challenged by high contention workloads and this challenge will continue to grow as the number of cores in processors continues to increase. These systems schedule transactions…
We revisit column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs). Similar to column-oriented RDBMSs, GDBMSs support read-heavy analytical workloads that however…
Growing main memory sizes have facilitated database management systems that keep the entire database in main memory. The drastic performance improvements that came along with these in-memory systems have made it possible to reunite the two…
There is increasing interest in using multicore processors to accelerate stream processing. For example, indexing sliding window content to enhance the performance of streaming queries is greatly improved by utilizing the computational…
The proliferation of modern data processing tools has given rise to open-source columnar data formats. The advantage of these formats is that they help organizations avoid repeatedly converting data to a new format for each application.…
In modern large-scale distributed systems, analytics jobs submitted by various users often share similar work, for example scanning and processing the same subset of data. Instead of optimizing jobs independently, which may result in…
SQL-on-Hadoop systems, query optimization, data distribution over multiple nodes and parallelization techniques are few of the areas under extreme research these days. Big names like Amazon, Google, Microsoft and many more are working on…
Modern applications demand high performance and cost efficient database management systems (DBMSs). Their workloads may be diverse, ranging from online transaction processing to analytics and decision support. The cloud infrastructure…
Columnar databases are an established way to speed up online analytical processing (OLAP) queries. Nowadays, data processing (e.g., storage, visualization, and analytics) is often performed at the programming language level, hence it is…
In high performance computing, researchers try to optimize the CPU Scheduling algorithms, for faster and efficient working of computers. But a process needs both CPU bound and I/O bound for completion of its execution. With modernization of…
Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query…
The aim of the paper is to introduce general techniques in order to optimize the parallel execution time of sorting on a distributed architectures with processors of various speeds. Such an application requires a partitioning step. For…
A new emerging class of parallel database management systems (DBMS) is designed to take advantage of the partitionable workloads of on-line transaction processing (OLTP) applications. Transactions in these systems are optimized to execute…
Motivated by large-scale optimization problems arising in the context of machine learning, there have been several advances in the study of asynchronous parallel and distributed optimization methods during the past decade. Asynchronous…