Related papers: Workload-Driven Vertical Partitioning for Effectiv…

Prediction of Horizontal Data Partitioning Through Query Execution Cost Estimation

The excessively increased volume of data in modern data management systems demands an improved system performance, frequently provided by data distribution, system scalability and performance optimization techniques. Optimized horizontal…

Machine Learning · Computer Science 2019-11-27 Nino Arsov, Goran Velinov, Aleksandar S. Dimovski, Bojana Koteska, Dragan Sahpaski, Margina Kon-Popovska

Multi-Resource Parallel Query Scheduling and Optimization

Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and…

Databases · Computer Science 2014-04-01 Minos Garofalakis, Yannis Ioannidis

Hyper-Graph Based Database Partitioning for Transactional Workloads

A common approach to scaling transactional databases in practice is horizontal partitioning, which increases system scalability, high availability and self-manageability. Usually it is very challenging to choose or design an optimal…

Databases · Computer Science 2013-09-09 Yu Cao, Xiaoyan Guo, Stephen Todd

Query Complexity Based Optimal Processing of Raw Data

The paper aims to find an efficient way for processing large datasets having different types of workload queries with minimal replication. The work first identifies the complexity of queries best suited for the given data processing tool.…

Databases · Computer Science 2022-12-22 Mayank Patel, Minal Bhise

A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data

Scientific experiments and modern applications are generating large amounts of data every day. Most organizations utilize In-house servers or Cloud resources to manage application data and workload. The traditional database management…

Databases · Computer Science 2025-06-17 Mayank Patel, Minal Bhise

Vertical partitioning of relational OLTP databases using integer programming

A way to optimize performance of relational row store databases is to reduce the row widths by vertically partitioning tables into table fractions in order to minimize the number of irrelevant columns/attributes read by each transaction.…

Databases · Computer Science 2010-02-16 Rasmus Resen Amossen

Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges

With the explosive growth of big data, workloads tend to get more complex and computationally demanding. Such applications are processed on distributed interconnected resources that are becoming larger in scale and computational capacity.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-30 Georgios L. Stavrinides, Helen D. Karatza

(Re)partitioning for stream-enabled computation

Partitioning an input graph over a set of workers is a complex operation. Objectives are twofold: split the work evenly, so that every worker gets an equal share, and minimize edge cut to achieve a good work locality (i.e. workers can work…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-11-28 Le Merrer Erwan, Liang Yizhong, Trédan Gilles

Parallel Stream Processing Against Workload Skewness and Variance

Key-based workload partitioning is a common strategy used in parallel stream processing engines, enabling effective key-value tuple distribution over worker threads in a logical operator. While randomized hashing on the keys is capable of…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-12-14 Junhua Fang, Rong Zhang, Tom Z. J. Fu, Zhenjie Zhang, Aoying Zhou, Junhua Zhu

Semi-Federated Scheduling of Parallel Real-Time Tasks on Multiprocessors

Federated scheduling is a promising approach to schedule parallel real-time tasks on multi-cores, where each heavy task exclusively executes on a number of dedicated processors, while light tasks are treated as sequential sporadic tasks and…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-10 Xu Jiang, Nan Guan, Xiang Long, Wang Yi

Comprehensive and Efficient Workload Compression

This work studies the problem of constructing a representative workload from a given input analytical query workload where the former serves as an approximation with guarantees of the latter. We discuss our work in the context of workload…

Databases · Computer Science 2021-02-04 Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey Naughton, Stratis Viglas

Dynamic Scheduling of a Parallel-Server Queueing System: A Computational Method for High-Dimensional Problems

A key operational challenge for call centers is to decide, in real time, which waiting customer should be served by which available agent. This is known as skill-based routing, and the decision becomes especially difficult in large systems…

Systems and Control · Electrical Eng. & Systems 2026-05-12 Baris Ata, Ebru Kasikaralar

Semi-Partitioned Hard Real-Time Scheduling with Restricted Migrations upon Identical Multiprocessor Platforms

Algorithms based on semi-partitioned scheduling have been proposed as a viable alternative between the two extreme ones based on global and partitioned scheduling. In particular, allowing migration to occur only for few tasks which cannot…

Operating Systems · Computer Science 2010-06-15 François Dorin, Patrick Meumeu Yomsi, Joël Goossens, Pascal Richard

Refining the Complexity Landscape of Speed Scaling: Hardness and Algorithms

We study the computational complexity of scheduling jobs on a single speed-scalable processor with the objective of capturing the trade-off between the (weighted) flow time and the energy consumption. This trade-off has been extensively…

Data Structures and Algorithms · Computer Science 2026-02-13 Antonios Antoniadis, Denise Graafsma, Ruben Hoeksma, Maria Vlasiou

Query Workload-based RDF Graph Fragmentation and Allocation

As the volume of the RDF data becomes increasingly large, it is essential for us to design a distributed database system to manage it. For distributed RDF data design, it is quite common to partition the RDF data into some parts, called…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-02-23 Peng Peng, Lei Zou, Lei Chen, Dongyan Zhao

Approximate Partition Selection for Big-Data Workloads using Summary Statistics

Many big-data clusters store data in large partitions that support access at a coarse, partition-level granularity. As a result, approximate query processing via row-level sampling is inefficient, often requiring reads of many partitions.…

Databases · Computer Science 2020-08-25 Kexin Rong, Yao Lu, Peter Bailis, Srikanth Kandula, Philip Levis

A Query-Driven Approach to Space-Efficient Range Searching

We initiate a study of a query-driven approach to designing partition trees for range-searching problems. Our model assumes that a data structure is to be built for an unknown query distribution that we can access through a sampling oracle,…

Data Structures and Algorithms · Computer Science 2025-02-20 Dimitris Fotakis, Andreas Kalavas, Ioannis Psarros

Power-aware scheduling for makespan and flow

We consider offline scheduling algorithms that incorporate speed scaling to address the bicriteria problem of minimizing energy consumption and a scheduling metric. For makespan, we give linear-time algorithms to compute all non-dominated…

Data Structures and Algorithms · Computer Science 2007-05-23 David P. Bunde

DeepDB: Learn from Data, not from Queries!

The typical approach for learned DBMS components is to capture the behavior by running a representative set of queries and use the observations to train a machine learning model. This workload-driven approach, however, has two major…

Databases · Computer Science 2019-09-04 Benjamin Hilprecht, Andreas Schmidt, Moritz Kulessa, Alejandro Molina, Kristian Kersting, Carsten Binnig

Prepartition: Load Balancing Approach for Virtual Machine Reservations in a Cloud Data Center

Load balancing is vital for the efficient and long-term operation of cloud data centers. With virtualization, post (reactive) migration of virtual machines after allocation is the traditional way for load balancing and consolidation.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-10-20 Wenhong Tian, Minxian Xu, Guangyao Zhou, Kui Wu, Chengzhong Xu, Rajkumar Buyya