Related papers: Interference-Aware Edge Runtime Prediction with Co…
Accurately predicting end-to-end network latency is essential for enabling reliable task offloading in real-time edge computing applications. This paper introduces a lightweight latency prediction scheme based on rational modelling that…
Accurate prediction of application performance is critical for enabling effective scheduling and resource management in resource-constrained dynamic edge environments. However, achieving predictable performance in such environments remains…
As machine learning inference increasingly moves to edge devices, adapting to diverse computational capabilities, hardware, and memory constraints becomes more critical. Instead of relying on a pre-trained model fixed for all future…
Neural Architecture Search (NAS) has enabled automatic discovery of more efficient neural network architectures, especially for mobile and embedded vision applications. Although recent research has proposed ways of quickly estimating…
Continuous edge inference necessitates not merely low per-timeslot latency, but sustained timeliness guarantees in the presence of time-varying channels, fluctuating edge workloads, and coupled bandwidth-computing resource constraints.…
With the growing workload of inference tasks on mobile devices, state-of-the-art neural architectures (NAs) are typically designed through Neural Architecture Search (NAS) to identify NAs with good tradeoffs between accuracy and efficiency…
The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services, defined as those that perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision, on behalf of…
Many algorithms in workflow scheduling and resource provisioning rely on the performance estimation of tasks to produce a scheduling plan. A profiler that is capable of modeling the execution of tasks and predicting their runtime…
A growing number of critical workflow applications leverage a streamlined edge-hub-cloud architecture, which diverges from the conventional edge computing paradigm. An edge device, in collaboration with a hub device and a cloud server,…
By leveraging data sample diversity, the early-exit network has recently emerged as a prominent neural network architecture for accelerating deep learning inference. However, intermediate classifiers of the early exits introduce…
The demand for stringent interactive quality-of-service has intensified in both mobile edge computing (MEC) and cloud systems, driven by the imperative to improve user experiences. As a result, the processing of computation-intensive tasks…
Future machine learning (ML) powered applications, such as autonomous driving and augmented reality, involve training and inference tasks with timeliness requirements and are communication- and computation-intensive, which demands the…
Edge computing enables smart IoT-based systems via concurrent and continuous execution of latency-sensitive machine learning (ML) applications. These edge-based machine learning systems are often battery-powered (i.e., energy-limited). They…
Edge computing operates between the cloud and end users and strives to provide low-latency computing services for simultaneous users. Redundant use of multiple edge nodes can reduce latency, as edge systems often operate in uncertain…
Edge computing is a promising approach for localized data processing for many edge applications and systems including Internet of Things (IoT), where computationally intensive tasks in IoT devices could be divided into sub-tasks and…
Edge computing promises to offer low-latency and ubiquitous computation to numerous devices at the network edge. For delay-sensitive applications, link delays can have a direct impact on service quality. These delays can fluctuate…
Implementing machine learning algorithms on Internet of Things (IoT) devices has become essential for emerging applications such as autonomous driving and environment monitoring. However, the limitations of computation capability and energy…
Failed workloads that consumed significant computational resources in time and space significantly affect the efficiency of data centers and thus limit the amount of scientific work that can be achieved. While the computational power has…
In recent years, edge computing has become a popular choice for latency-sensitive applications like facial recognition and augmented reality because it is closer to the end users compared to the cloud. Although infrastructure providers are…
The deployment of inference services at the network edge, called edge inference, offloads computation-intensive inference tasks from mobile devices to edge servers, thereby enhancing the former's capabilities and battery lives. In a…