Related papers: Accelerating Large-scale Data Exploration through …
Data intensive applications often involve the analysis of large datasets that require large amounts of compute and storage resources. While dedicated compute and/or storage farms offer good task/data throughput, they suffer low resource…
The rapid development of the mobile Internet and the Internet of Things is leading to a diversification of user devices and the emergence of new mobile applications on a regular basis. Such applications include those that are…
With the explosive growth of big data, workloads tend to get more complex and computationally demanding. Such applications are processed on distributed interconnected resources that are becoming larger in scale and computational capacity.…
This paper presents a distributed resource selection mechanism for diverse cloud-edge environments, enabling dynamic and context-aware allocation of resources to meet the demands of complex distributed applications. By distributing the…
Recent years have witnessed the rapid development of acceleration techniques for diffusion models, especially caching-based acceleration methods. These studies seek to answer two fundamental questions: "When to cache" and "How to use…
The recent proliferation of Data Grids and the increasingly common practice of using resources as distributed data stores provide a convenient environment for communities of researchers to share, replicate, and manage access to copies of…
Increasing need for large-scale data analytics in a number of application domains has led to a dramatic rise in the number of distributed data management systems, both parallel relational databases, and systems that support alternative…
Fog computing extends the cloud computing paradigm by allocating substantial portions of computations and services towards the edge of a network, and is, therefore, particularly suitable for large-scale, geo-distributed, and data-intensive…
The overall performance of a distributed system is highly dependent on the communication efficiency of the system. Although network resources (links, bandwidth) are becoming increasingly more available, the communication performance of data…
A current trend in networking and cloud computing is to provide compute resources over widely dispersed places exemplified by initiatives like Network Function Virtualisation. This paves the way for a widespread service deployment and can…
The parallel and distributed processing are becoming de facto industry standard, and a large part of the current research is targeted on how to make computing scalable and distributed, dynamically, without allocating the resources on…
Distributed computing systems often consist of hundreds of nodes, executing tasks with different resource requirements. Efficient resource provisioning and task scheduling in such systems are non-trivial and require close monitoring and…
Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets. However, they often face computational challenges and can falter in generalization, especially in capturing temporal abstractions for…
As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate…
Dynamic resource management has become an active area of research in the Cloud Computing paradigm. Cost of resources varies significantly depending on configuration for using them. Hence efficient management of resources is of prime…
Modern cloud databases present scaling as a binary decision: scale-out by adding nodes or scale-up by increasing per-node resources. This one-dimensional view is limiting because database performance, cost, and coordination overhead emerge…
The last decades have seen a surge of interests in distributed computing thanks to advances in clustered computing and big data technology. Existing distributed algorithms typically assume {\it all the data are already in one place}, and…
Performance modeling can help to improve the resource efficiency of clusters and distributed dataflow applications, yet the available modeling data is often limited. Collaborative approaches to performance modeling, characterized by the…
The scale and quality of a dataset significantly impact the performance of deep models. However, acquiring large-scale annotated datasets is both a costly and time-consuming endeavor. To address this challenge, dataset expansion…
Systems for processing big data---e.g., Hadoop, Spark, and massively parallel databases---need to run workloads on behalf of multiple tenants simultaneously. The abundant disk-based storage in these systems is usually complemented by a…