Related papers: Interference-Aware Edge Runtime Prediction with Co…

Lightweight Latency Prediction Scheme for Edge Applications: A Rational Modelling Approach

Accurately predicting end-to-end network latency is essential for enabling reliable task offloading in real-time edge computing applications. This paper introduces a lightweight latency prediction scheme based on rational modelling that…

Networking and Internet Architecture · Computer Science · 2025-11-05 · Mohan Liyanage, Eldiyar Zhantileuov, Ali Kadhum Idrees, Rolf Schuster

Accurate Performance Predictors for Edge Computing Applications

Accurate prediction of application performance is critical for enabling effective scheduling and resource management in resource-constrained dynamic edge environments. However, achieving predictable performance in such environments remains…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-10-24 · Panagiotis Giannakopoulos, Bart van Knippenberg, Kishor Chandra Joshi, Nicola Calabretta, George Exarchakos

QPART: Adaptive Model Quantization and Dynamic Workload Balancing for Accuracy-aware Edge Inference

As machine learning inferences increasingly move to edge devices, adapting to diverse computational capabilities, hardware, and memory constraints becomes more critical. Instead of relying on a pre-trained model fixed for all future…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-07-01 · Xiangchen Li, Saeid Ghafouri, Bo Ji, Hans Vandierendonck, Deepu John, Dimitrios S. Nikolopoulos

MAPLE-Edge: A Runtime Latency Predictor for Edge Devices

Neural Architecture Search (NAS) has enabled automatic discovery of more efficient neural network architectures, especially for mobile and embedded vision applications. Although recent research has proposed ways of quickly estimating…

Machine Learning · Computer Science · 2022-04-28 · Saeejith Nair, Saad Abbasi, Alexander Wong, Mohammad Javad Shafiee

Risk-Budgeted Online Scheduling for Continuous Edge Inference over Evolving Time Horizons

Continuous edge inference necessitates not merely low per-timeslot latency, but sustained timeliness guarantees in the presence of time-varying channels, fluctuating edge workloads, and coupled bandwidth-computing resource constraints.…

Networking and Internet Architecture · Computer Science · 2026-05-05 · Houyi Qi, Minghui Liwang, Sai Zou, Wei Ni

Inference Latency Prediction at the Edge

With the growing workload of inference tasks on mobile devices, state-of-the-art neural architectures (NAs) are typically designed through Neural Architecture Search (NAS) to identify NAs with good tradeoffs between accuracy and efficiency…

Performance · Computer Science · 2022-10-07 · Zhuojin Li, Marco Paolieri, Leana Golubchik

Scheduling Real-time Deep Learning Services as Imprecise Computations

The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services, defined as those that perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision, on behalf of…

Machine Learning · Computer Science · 2020-11-03 · Shuochao Yao, Yifan Hao, Yiran Zhao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Jinyang Li, Tarek Abdelzaher

Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach

Many algorithms in workflow scheduling and resource provisioning rely on the performance estimation of tasks to produce a scheduling plan. A profiler that is capable of modeling the execution of tasks and predicting their runtime…

Distributed, Parallel, and Cluster Computing · Computer Science · 2019-03-01 · Muhammad H. Hilman, Maria A. Rodriguez, Rajkumar Buyya

A reliability- and latency-driven task allocation framework for workflow applications in the edge-hub-cloud continuum

A growing number of critical workflow applications leverage a streamlined edge-hub-cloud architecture, which diverges from the conventional edge computing paradigm. An edge device, in collaboration with a hub device and a cloud server,…

Distributed, Parallel, and Cluster Computing · Computer Science · 2026-02-23 · Andreas Kouloumpris, Georgios L. Stavrinides, Maria K. Michael, Theocharis Theocharides

Resource-Constrained Edge AI with Early Exit Prediction

By leveraging the data sample diversity, the early-exit network recently emerges as a prominent neural network architecture to accelerate the deep learning inference process. However, intermediate classifiers of the early exits introduce…

Machine Learning · Computer Science · 2022-06-22 · Rongkang Dong, Yuyi Mao, Jun Zhang

Deadline-Aware Joint Task Scheduling and Offloading in Mobile Edge Computing Systems

The demand for stringent interactive quality-of-service has intensified in both mobile edge computing (MEC) and cloud systems, driven by the imperative to improve user experiences. As a result, the processing of computation-intensive tasks…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-07-28 · Ngoc Hung Nguyen, Van-Dinh Nguyen, Anh Tuan Nguyen, Nguyen Van Thieu, Hoang Nam Nguyen, Symeon Chatzinotas

Edge Learning with Timeliness Constraints: Challenges and Solutions

Future machine learning (ML) powered applications, such as autonomous driving and augmented reality, involve training and inference tasks with timeliness requirements and are communication and computation intensive, which demands for the…

Networking and Internet Architecture · Computer Science · 2020-09-24 · Yuxuan Sun, Wenqi Shi, Xiufeng Huang, Sheng Zhou, Zhisheng Niu

FELARE: Fair Scheduling of Machine Learning Tasks on Heterogeneous Edge Systems

Edge computing enables smart IoT-based systems via concurrent and continuous execution of latency-sensitive machine learning (ML) applications. These edge-based machine learning systems are often battery-powered (i.e., energy-limited). They…

Distributed, Parallel, and Cluster Computing · Computer Science · 2022-07-22 · Ali Mokhtari, Md Abir Hossen, Pooyan Jamshidi, Mohsen Amini Salehi

Redundancy Management for Fast Service (Rates) in Edge Computing Systems

Edge computing operates between the cloud and end users and strives to provide low-latency computing services for simultaneous users. Redundant use of multiple edge nodes can reduce latency, as edge systems often operate in uncertain…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-11-26 · Pei Peng, Emina Soljanin

Predictive Edge Computing with Hard Deadlines

Edge computing is a promising approach for localized data processing for many edge applications and systems including Internet of Things (IoT), where computationally intensive tasks in IoT devices could be divided into sub-tasks and…

Networking and Internet Architecture · Computer Science · 2018-06-01 · Yuxuan Xing, Hulya Seferoglu

Delay-Aware Robust Edge Network Hardening Under Decision-Dependent Uncertainty

Edge computing promises to offer low-latency and ubiquitous computation to numerous devices at the network edge. For delay-sensitive applications, link delays can have a direct impact on service quality. These delays can fluctuate…

Networking and Internet Architecture · Computer Science · 2025-03-04 · Jiaming Cheng, Duong Thuy Anh Nguyen, Ni Trieu, Duong Tung Nguyen

Dynamic Compression Ratio Selection for Edge Inference Systems with Hard Deadlines

Implementing machine learning algorithms on Internet of things (IoT) devices has become essential for emerging applications, such as autonomous driving, environment monitoring. But the limitations of computation capability and energy…

Information Theory · Computer Science · 2020-05-26 · Xiufeng Huang, Sheng Zhou

Workload Failure Prediction for Data Centers

Failed workloads that consumed significant computational resources in time and space affect the efficiency of data centers significantly and thus limit the amount of scientific work that can be achieved. While the computational power has…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-01-13 · Jie Li, Rui Wang, Ghazanfar Ali, Tommy Dang, Alan Sill, Yong Chen

I-BOT: Interference-Based Orchestration of Tasks for Dynamic Unmanaged Edge Computing

In recent years, edge computing has become a popular choice for latency-sensitive applications like facial recognition and augmented reality because it is closer to the end users compared to the cloud. Although infrastructure providers are…

Distributed, Parallel, and Cluster Computing · Computer Science · 2020-11-12 · Shikhar Suryavansh, Chandan Bothra, Kwang Taik Kim, Mung Chiang, Chunyi Peng, Saurabh Bagchi

Resource Allocation for Multiuser Edge Inference with Batching and Early Exiting (Extended Version)

The deployment of inference services at the network edge, called edge inference, offloads computation-intensive inference tasks from mobile devices to edge servers, thereby enhancing the former's capabilities and battery lives. In a…

Information Theory · Computer Science · 2023-01-02 · Zhiyan Liu, Qiao Lan, Kaibin Huang