Related papers: Runtime Variation in Big Data Analytics

200 papers

Many algorithms in workflow scheduling and resource provisioning rely on the performance estimation of tasks to produce a scheduling plan. A profiler that is capable of modeling the execution of tasks and predicting their runtime…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-01 Muhammad H. Hilman, Maria A. Rodriguez, Rajkumar Buyya

Many organizations routinely analyze large datasets using systems for distributed data-parallel processing and clusters of commodity resources. Yet, users need to configure adequate resources for their data processing jobs. This requires…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-06-02 Lauritz Thamsen, Dominik Scheinert, Jonathan Will, Jonathan Bader, Odej Kao

Analyzing large datasets with distributed dataflow systems requires the use of clusters. Public cloud providers offer a large variety and quantity of resources that can be used for such clusters. However, picking the appropriate resources…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-04-28 Jonathan Will, Jonathan Bader, Lauritz Thamsen

Infrastructure as a service clouds hide the complexity of maintaining the physical infrastructure with a slight disadvantage: they also hide their internal working details. Should users need knowledge about these details e.g., to increase…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-20 Gabor Kecskemeti, Zsolt Nemeth, Attila Kertesz, Rajiv Ranjan

We present and formalize a general approach for profiling workload by leveraging only a priori available static metadata to supply appropriate resource needs. Understanding the requirements and characteristics of a workload's runtime is…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-30 Andrea Morichetta, Stefan Nastic, Victor Casamayor Pujol, Schahram Dustdar

Dynamic nature of the cloud environment has made distributed resource management process a challenge for cloud service providers. The importance of maintaining the quality of service in accordance with customer expectations as well as the…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-08-08 Sara Kardani-Moghaddam, Rajkumar Buyya, Kotagiri Ramamohanarao

Distributed dataflow systems like Apache Flink and Apache Spark simplify processing large amounts of data on clusters in a data-parallel manner. However, choosing suitable cluster resources for distributed dataflow jobs in both type and…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-14 Jonathan Will, Onur Arslan, Jonathan Bader, Dominik Scheinert, Lauritz Thamsen

The ability to accurately estimate job runtime properties allows a scheduler to effectively schedule jobs. State-of-the-art online cluster job schedulers use history-based learning, which uses past job execution information to estimate the…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-17 Akshay Jajoo, Y. Charlie Hu, Xiaojun Lin, Nan Deng

Many resource management techniques for task scheduling, energy and carbon efficiency, and cost optimization in workflows rely on a-priori task runtime knowledge. Building runtime prediction models on historical data is often not feasible…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-09-14 Jonathan Bader, Fabian Lehmann, Lauritz Thamsen, Ulf Leser, Odej Kao

The workload prediction and resource allocation significantly play an inevitable role in production of an efficient cloud environment. The proactive estimation of future workload followed by decision of resource allocation have become a…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-06-30 Deepika Saxena, Ashutosh Kumar Singh

Cloud performance fluctuates due to factors such as resource contention and workload changes. These factors can be short-term, seasonal, or long-term. Their effects are often intertwined in performance traces, making performance management…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-12 Shimul Debnath, William Hart, Lori Pollock, Donald Lien, Wei Wang

In a large-scale computing cluster, the job completions can be substantially delayed due to two sources of variability, namely, variability in the job size and that in the machine service capacity. To tackle this issue, existing works have…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-07-07 Huanle Xu, Gustavo de Veciana, Wing Cheong Lau, Kunxiao Zhou

Job scheduling in cloud computing environments is a critical yet complex problem. Cloud computing user job requirements are highly dynamic and uncertain, while cloud computing resources are heterogeneous and constrained. This paper studies…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-25 Guang Fang, Yuxiang Zhao

Cloud computing infrastructures increasingly rely on geographically distributed data centers to meet the growing demand for low latency, high availability, and cost-efficient service delivery. In this context, load balancing plays a…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-12 Saeid Aghasoleymani Najafabadi, Elaheh Nabavi Nia

Many scientific workflow scheduling algorithms need to be informed about task runtimes a-priori to conduct efficient scheduling. In heterogeneous cluster infrastructures, this problem becomes aggravated because these runtimes are required…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-05-24 Jonathan Bader, Fabian Lehmann, Lauritz Thamsen, Jonathan Will, Ulf Leser, Odej Kao

Performance variability has been acknowledged as a problem for over a decade by cloud practitioners and performance engineers. Yet, our survey of top systems conferences reveals that the research community regularly disregards variability…

Scientific workflow management systems support large-scale data analysis on cluster infrastructures. For this, they interact with resource managers which schedule workflow tasks onto cluster nodes. In addition to workflow task descriptions,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-30 Jonathan Bader, Kathleen West, Soeren Becker, Svetlana Kulagina, Fabian Lehmann, Lauritz Thamsen, Henning Meyerhenke, Odej Kao

High intensive computation applications can usually take days to months to finish an execution. During this time, it is common to have variations of the available resources when considering that such hardware is usually shared among a…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-01-27 Kiran Mantripragada, Alecio Binotto, Leonardo P. Tizzei

We consider robust resource allocation of services in Clouds. More specifically, we consider the case of a large public or private Cloud platform that runs a relatively small set of large and independent services. These services are…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-22 Olivier Beaumont, Lionel Eyraud-Dubois, Paul Renaud-Goud

Performance benchmarking is a common practice in software engineering, particularly when building large-scale, distributed, and data-intensive systems. While cloud environments offer several advantages for running benchmarks, it is often…

Software Engineering · Computer Science 2025-04-17 Sören Henning, Adriano Vogel, Esteban Perez-Wohlfeil, Otmar Ertl, Rick Rabiser