Related papers: Load Balancing for MapReduce-based Entity Resoluti…
Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as entity resolution on large datasets. We investigate challenges and possible solutions of using the MapReduce programming model for parallel entity…
Load balance is important for MapReduce to reduce job duration, increase parallel efficiency, etc. Previous work focuses on coarse-grained scheduling. This study concerns fine-grained scheduling on MapReduce operations. Each operation…
Recently, MapReduce based spatial query systems have emerged as a cost effective and scalable solution to large scale spatial data processing and analytics. MapReduce based systems achieve massive scalability by partitioning the data and…
Load Balancing is a fundamental technology for scaling cloud infrastructure. It enables systems to distribute incoming traffic across backend servers using predefined algorithms such as round robin, weighted round robin, least connections,…
Big data analytics in cloud environments introduces challenges such as real-time load balancing besides security, privacy, and energy efficiency. In this paper, we propose a novel load balancing algorithm in cloud environments that performs…
Mobile edge computing (MEC) enables low-latency and high-bandwidth applications by bringing computation and data storage closer to end-users. Intelligent computing is an important application of MEC, where computing resources are used to…
In this paper, a joint task, spectrum, and transmit power allocation problem is investigated for a wireless network in which the base stations (BSs) are equipped with mobile edge computing (MEC) servers to jointly provide computational and…
In this work, a heterogeneous set of wireless devices sharing a common access point collaborates to perform a set of tasks. Using the Map-Reduce distributed computing framework, the tasks are optimally distributed amongst the nodes with the…
A fundamental challenge in large-scale networked systems viz., data centers and cloud networks is to distribute tasks to a pool of servers, using minimal instantaneous state information, while providing excellent delay performance. In this…
Cloud computing is a newly emerging distributed computing which is evolved from Grid computing. Task scheduling is the core research of cloud computing which studies how to allocate the tasks among the physical nodes so that the tasks can…
Mobile-edge computing (MEC) has emerged as a prominent technique to provide mobile services with high computation requirement, by migrating the computation-intensive tasks from the mobile devices to the nearby MEC servers. To reduce the…
While load balancing in distributed-memory computing has been well-studied, we present an innovative approach to this problem: a unified, reduced-order model that combines three key components to describe "work" in a distributed system:…
Edge computing has become a very popular service that enables mobile devices to run complex tasks with the help of network-based computing resources. However, edge clouds are often resource-constrained, which makes resource allocation a…
Although Metaverse has recently been widely studied, its practical application still faces many challenges. One of the severe challenges is the lack of sufficient resources for computing and communication on local devices, resulting in the…
Data locality is a fundamental issue for data-parallel applications. Considering MapReduce in Hadoop, the map task scheduling part requires an efficient algorithm which takes data locality into consideration; otherwise, the system may…
MapReduce is a commonly used framework for executing data-intensive jobs on distributed server clusters. We introduce a variant implementation of MapReduce, namely "Coded MapReduce", to substantially reduce the inter-server communication…
Edge computing is an emerging paradigm to enable low-latency applications, like mobile augmented reality, because it takes the computation on processing devices that are closer to the users. On the other hand, the need for highly scalable…
Large scale clusters leveraging distributed computing frameworks such as MapReduce routinely process data that are on the orders of petabytes or more. The sheer size of the data precludes the processing of the data on a single computer. The…
While large-scale distributed data processing platforms have become an attractive target for query processing, these systems are problematic for applications that deal with nested collections. Programmers are forced either to perform…
Nowadays, the data to be processed by database systems has grown so large that any conventional, centralized technique is inadequate. At the same time, general purpose computation on GPU (GPGPU) recently has successfully drawn attention…