Related papers: Using Regression Techniques to Predict Large Data …
Grid Computing is a type of parallel and distributed systems that is designed to provide reliable access to data and computational resources in wide area networks. These resources are distributed in different geographical locations, however…
An emerging class of data-intensive applications involve the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in…
As applications become more distributed to improve user experience and offer higher availability, businesses rely on geographically dispersed datacenters that host such applications more than ever. Dedicated inter-datacenter networks have…
Allocation of (redundant) file chunks throughout a distributed storage system affects important performance metrics such as the probability of file recovery, data download time, or the service rate of the system under a given data access…
Distributed storage systems such as Hadoop File System or Google File System (GFS) ensure data availability and durability using replication. This paper is focused on the analysis of the efficiency of replication mechanism that determines…
Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with…
In grid networks, distributed resources are interconnected by wide area network to support compute and data-intensive applications, which require reliable and efficient transfer of gigabits (even terabits) of data. Different from…
The continuous increase in performance requirements, for both scientific computation and industry, motivates the need of a powerful computing infrastructure. The Grid appeared as a solution for inexpensive execution of heavy applications in…
The overall performance of a distributed system is highly dependent on the communication efficiency of the system. Although network resources (links, bandwidth) are becoming increasingly more available, the communication performance of data…
Grid computing is a collection of computer resources that are gathered together from various areas to give computational resources such as storage, data or application services. This is to permit clients to access this huge measure of…
We focus in this report on two main axes. The first is dedicated to the study of the effect of replicas distribution on data grid performances. In this respect, our main contributions are as follows: 1) An overview of replication strategies…
Data-intensive applications often require exploratory analysis of large datasets. If analysis is performed on distributed resources, data locality can be crucial to high throughput and performance. We propose a "data diffusion" approach…
The LHCb collaboration is one of the four major experiments at the Large Hadron Collider at CERN. Many petabytes of data are produced by the detectors and Monte-Carlo simulations. The LHCb Grid interware LHCbDIRAC is used to make data…
In this paper, we consider the expansion of power grids under emerging large loads from data centers and electrified manufacturing. We develop a multi-period grid capacity expansion model to determine optimal investment profiles for power…
Grid computing is the next logical step to distributed computing. Main objective of grid computing is an innovative approach to share resources such as CPU usage; memory sharing and software sharing. Data Grids provide transparent access to…
The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies [1]. This study explores the application of streaming machine learning [3] to revolutionize data prefetching within…
A key functionality of emerging connected autonomous systems such as smart transportation systems, smart cities, and the industrial Internet-of-Things, is the ability to process and learn from data collected at different physical locations.…
In this paper, we aim to forecast a future trajectory distribution of a moving agent in the real world, given the social scene images and historical trajectories. Yet, it is a challenging task because the ground-truth distribution is…
Data stream processing is an increasingly important topic due to the prevalence of smart devices and the demand for real-time analytics. Geo-distributed streaming systems, where cloud-based queries utilize data streams from multiple…
Storage allocation affects important performance measures of distributed storage systems. Most previous studies on the storage allocation consider its effect separately either on the success of the data recovery or on the service rate…