Related papers: Accelerating Large-scale Data Exploration through …

Data Diffusion: Dynamic Resource Provision and Data-Aware Scheduling for Data Intensive Applications

Data intensive applications often involve the analysis of large datasets that require large amounts of compute and storage resources. While dedicated compute and/or storage farms offer good task/data throughput, they suffer low resource…

Distributed, Parallel, and Cluster Computing · Computer Science 2008-08-27 Ioan Raicu, Yong Zhao, Ian Foster, Alex Szalay

Research on Heterogeneous Computation Resource Allocation based on Data-driven Method

The rapid development of the mobile Internet and the Internet of Things is leading to a diversification of user devices and the emergence of new mobile applications on a regular basis. Such applications include those that are…

Computational Engineering, Finance, and Science · Computer Science 2024-08-13 Xirui Tang, Zeyu Wang, Xiaowei Cai, Honghua Su, Changsong Wei

Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges

With the explosive growth of big data, workloads tend to get more complex and computationally demanding. Such applications are processed on distributed interconnected resources that are becoming larger in scale and computational capacity.…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-30 Georgios L. Stavrinides, Helen D. Karatza

Distributed Resource Selection for Self-Organising Cloud-Edge Systems

This paper presents a distributed resource selection mechanism for diverse cloud-edge environments, enabling dynamic and context-aware allocation of resources to meet the demands of complex distributed applications. By distributing the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-10 Quentin Renau, Amjad Ullah, Emma Hart

DiCache: Let Diffusion Model Determine Its Own Cache

Recent years have witnessed the rapid development of acceleration techniques for diffusion models, especially caching-based acceleration methods. These studies seek to answer two fundamental questions: "When to cache" and "How to use…

Computer Vision and Pattern Recognition · Computer Science 2025-10-03 Jiazi Bu, Pengyang Ling, Yujie Zhou, Yibin Wang, Yuhang Zang, Dahua Lin, Jiaqi Wang

Using Regression Techniques to Predict Large Data Transfers

The recent proliferation of Data Grids and the increasingly common practice of using resources as distributed data stores provide a convenient environment for communities of researchers to share, replicate, and manage access to copies of…

Distributed, Parallel, and Cluster Computing · Computer Science 2007-05-23 Sudharshan Vazhkudai, Jennifer M. Schopf

Data Placement and Replica Selection for Improving Co-location in Distributed Environments

Increasing need for large-scale data analytics in a number of application domains has led to a dramatic rise in the number of distributed data management systems, both parallel relational databases, and systems that support alternative…

Databases · Computer Science 2013-02-19 K. Ashwin Kumar, Amol Deshpande, Samir Khuller

Data Analytics for Fog Computing by Distributed Online Learning with Asynchronous Update

Fog computing extends the cloud computing paradigm by allocating substantial portions of computations and services towards the edge of a network, and is, therefore, particularly suitable for large-scale, geo-distributed, and data-intensive…

Signal Processing · Electrical Eng. & Systems 2019-12-03 Guangxia Li, Peilin Zhao, Xiao Lu, Jia Liu, Yulong Shen

Towards a Centralized Scheduling Framework for Communication Flows in Distributed Systems

The overall performance of a distributed system is highly dependent on the communication efficiency of the system. Although network resources (links, bandwidth) are becoming increasingly more available, the communication performance of data…

Data Structures and Algorithms · Computer Science 2009-06-02 Mugurel Ionut Andreica, Eliana-Dina Tirsa, Nicolae Tapus, Florin Pop, Ciprian Mihai Dobre

Response-Time-Optimized Distributed Cloud Resource Allocation

A current trend in networking and cloud computing is to provide compute resources over widely dispersed places exemplified by initiatives like Network Function Virtualisation. This paves the way for a widespread service deployment and can…

Networking and Internet Architecture · Computer Science 2016-05-31 Matthias Keller, Holger Karl

Analysis of Distributed Algorithms for Big-data

Parallel and distributed processing is becoming the de facto industry standard, and a large part of current research is targeted at how to make computing scalable and distributed, dynamically, without allocating the resources on…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-10 Rajendra Purohit, K R Chowdhary, S D Purohit

Fast-Fourier-Forecasting Resource Utilisation in Distributed Systems

Distributed computing systems often consist of hundreds of nodes, executing tasks with different resource requirements. Efficient resource provisioning and task scheduling in such systems are non-trivial and require close monitoring and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-08-10 Paul J. Pritz, Daniel Perez, Kin K. Leung

Simple Hierarchical Planning with Diffusion

Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets. However, they often face computational challenges and can falter in generalization, especially in capturing temporal abstractions for…

Machine Learning · Computer Science 2024-01-08 Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

Distributed Caching for Complex Querying of Raw Arrays

As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate…

Databases · Computer Science 2018-03-19 Weijie Zhao, Florin Rusu, Bin Dong, Kesheng Wu, Anna Y. Q. Ho, Peter Nugent

Dynamic Resource Management in Clouds: A Probabilistic Approach

Dynamic resource management has become an active area of research in the Cloud Computing paradigm. Cost of resources varies significantly depending on configuration for using them. Hence efficient management of resources is of prime…

Networking and Internet Architecture · Computer Science 2012-10-03 Paulo Gonçalves, Shubhabrata Roy, Thomas Begin, Patrick Loiseau

Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases

Modern cloud databases present scaling as a binary decision: scale-out by adding nodes or scale-up by increasing per-node resources. This one-dimensional view is limiting because database performance, cost, and coordination overhead emerge…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-05 Shahir Abdullah, Syed Rohit Zaman

Fast communication-efficient spectral clustering over distributed data

The last decades have seen a surge of interest in distributed computing thanks to advances in clustered computing and big data technology. Existing distributed algorithms typically assume all the data are already in one place, and…

Machine Learning · Computer Science 2019-05-07 Donghui Yan, Yingjie Wang, Jin Wang, Guodong Wu, Honggang Wang

Towards a Peer-to-Peer Data Distribution Layer for Efficient and Collaborative Resource Optimization of Distributed Dataflow Applications

Performance modeling can help to improve the resource efficiency of clusters and distributed dataflow applications, yet the available modeling data is often limited. Collaborative approaches to performance modeling, characterized by the…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-01-24 Dominik Scheinert, Soeren Becker, Jonathan Will, Luis Englaender, Lauritz Thamsen

Distribution-Aware Data Expansion with Diffusion Models

The scale and quality of a dataset significantly impact the performance of deep models. However, acquiring large-scale annotated datasets is both a costly and time-consuming endeavor. To address this challenge, dataset expansion…

Computer Vision and Pattern Recognition · Computer Science 2024-06-06 Haowei Zhu, Ling Yang, Jun-Hai Yong, Hongzhi Yin, Jiawei Jiang, Meng Xiao, Wentao Zhang, Bin Wang

ROBUS: Fair Cache Allocation for Multi-tenant Data-parallel Workloads

Systems for processing big data (e.g., Hadoop, Spark, and massively parallel databases) need to run workloads on behalf of multiple tenants simultaneously. The abundant disk-based storage in these systems is usually complemented by a…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-12 Mayuresh Kunjir, Brandon Fain, Kamesh Munagala, Shivnath Babu