Related papers: Diagonal Scaling: A Multi-Dimensional Resource Mod…
The cloud datacenter has numerous hosts as well as application requests where resources are dynamic. The demands placed on the resource allocation are diverse. These factors could lead to load imbalances, which affect scheduling efficiency…
Scheduling query execution plans is a particularly complex problem in shared-nothing parallel systems, where each site consists of a collection of local time-shared (e.g., CPU(s) or disk(s)) and space-shared (e.g., memory) resources and…
A fundamental challenge in large-scale cloud networks and data centers is to achieve highly efficient server utilization and limit energy consumption, while providing excellent user-perceived performance in the presence of uncertain and…
The excessively increased volume of data in modern data management systems demands an improved system performance, frequently provided by data distribution, system scalability and performance optimization techniques. Optimized horizontal…
SQL-on-Hadoop systems, query optimization, data distribution over multiple nodes and parallelization techniques are few of the areas under extreme research these days. Big names like Amazon, Google, Microsoft and many more are working on…
This paper explores resource allocation in serverless cloud computing platforms and proposes an optimization approach for autoscaling systems. Serverless computing relieves users from resource management tasks, enabling focus on application…
With rapid growth in the amount of unstructured data produced by memory-intensive applications, large scale data analytics has recently attracted increasing interest. Processing, managing and analyzing this huge amount of data poses several…
Big data analytics in cloud environments introduces challenges such as real-time load balancing besides security, privacy, and energy efficiency. In this paper, we propose a novel load balancing algorithm in cloud environments that performs…
The dynamic provisioning of virtualized resources offered by cloud computing infrastructures allows applications deployed in a cloud environment to automatically increase and decrease the amount of used resources. This capability is called…
It is now cost-effective to outsource large dataset and perform query over the cloud. However, in this scenario, there exist serious security and privacy issues that sensitive information contained in the dataset can be leaked. The most…
Recent increase in energy prices has led researchers to find better ways for capacity provisioning in data centers to reduce energy wastage due to the variation in workload. This paper explores the opportunity for cost saving utilizing the…
Skyline queries are one of the most widely adopted tools for Multi-Criteria Analysis, with applications covering diverse domains, including, e.g., Database Systems, Data Mining, and Decision Making. Skylines indeed offer a useful overview…
Deep neural networks (DNNs) form the cornerstone of modern AI services, supporting a wide range of applications, including autonomous driving, chatbots, and recommendation systems. As models increase in size and complexity, DNN workloads…
As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate…
Cloud computing infrastructures increasingly rely on geographically distributed data centers to meet the growing demand for low latency, high availability, and cost-efficient service delivery. In this context, load balancing plays a…
Over the past ten years, many different approaches have been proposed for different aspects of the problem of resources management for long running, dynamic and diverse workloads such as processing query streams or distributed deep…
Cloud computing has motivated renewed interest in resource allocation problems with new consumption models. A common goal is to share a resource, such as CPU or I/O bandwidth, among distinct users with different demand patterns as well as…
We study the problem of optimizing data storage and access costs on the cloud while ensuring that the desired performance or latency is unaffected. We first propose an optimizer that optimizes the data placement tier (on the cloud) and the…
Test-time compute scaling has emerged as a powerful paradigm for enhancing mathematical reasoning in large language models (LLMs) by allocating additional computational resources during inference. However, current methods employ uniform…
We consider a large-scale parallel-server system, where each server independently adjusts its processing speed in a decentralized manner. The objective is to minimize the overall cost, which comprises the average cost of maintaining the…