Related papers: Scheduling Real-time Deep Learning Services as Imp…

200 papers

In recent years, the development of specialized edge computing devices has significantly increased, driven by the growing demand for AI models. These devices, such as the NVIDIA Jetson series, must efficiently handle increased data…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-03 Ashiyana Abdul Majeed, Mahmoud Meribout

As machine learning techniques become ubiquitous, the efficiency of neural network implementations is becoming correspondingly paramount. Frameworks, such as Halide and TVM, separate out the algorithmic representation of the network from…

Machine Learning · Computer Science 2020-12-01 Benoit Steiner, Chris Cummins, Horace He, Hugh Leather

The ubiquity of smartphone cameras and IoT cameras, together with the recent boom of deep learning and deep neural networks, proliferate various computer vision driven mobile and IoT applications deployed on the edge. This paper focuses on…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-06 Zhe Yang, Klara Nahrstedt, Hongpeng Guo, Qian Zhou

As edge computing expands, serving multiple deep neural network (DNN) models on a single shared GPU has become a common yet challenging scenario, where each scheduling decision affects the tail latency of all concurrent queues. Existing…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-05-08 Jiahe Cao, Xiaomeng Li, Qiang Liu, Tao Han, Ning Zhang, Weisong Shi

Future machine learning (ML) powered applications, such as autonomous driving and augmented reality, involve training and inference tasks with timeliness requirements and are communication and computation intensive, which demands for the…

Networking and Internet Architecture · Computer Science 2020-09-24 Yuxuan Sun, Wenqi Shi, Xiufeng Huang, Sheng Zhou, Zhisheng Niu

With the continuous expansion of the scale of cloud computing applications, artificial intelligence technologies such as Deep Learning and Reinforcement Learning have gradually become the key tools to solve the automated task scheduling of…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-14 Zheng Xu, Yulu Gong, Yanlin Zhou, Qiaozhi Bao, Wenpin Qian

The demand for stringent interactive quality-of-service has intensified in both mobile edge computing (MEC) and cloud systems, driven by the imperative to improve user experiences. As a result, the processing of computation-intensive tasks…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-28 Ngoc Hung Nguyen, Van-Dinh Nguyen, Anh Tuan Nguyen, Nguyen Van Thieu, Hoang Nam Nguyen, Symeon Chatzinotas

Motivated by deep neural network applications, we study the problem of scheduling splittable jobs (e.g., neural network inference tasks) on configurable machines (e.g., multi-instance GPUs). We are given $n$ jobs and a set $C$ of…

Data Structures and Algorithms · Computer Science 2023-12-12 Matthew Casey, Rajmohan Rajaraman, David Stalfa

Distributed cloud environments hosting data-intensive applications often experience slowdowns due to network congestion, asymmetric bandwidth, and inter-node data shuffling. These factors are typically not captured by traditional host-level…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-21 Sankalpa Timilsina, Susmit Shannigrahi

Access to parallel and distributed computation has enabled researchers and developers to improve algorithms and performance in many applications. Recent research has focused on next generation special purpose systems with multiple kinds of…

Machine Learning · Computer Science 2019-06-11 Tegg Taekyong Sung, Valliappa Chockalingam, Alex Yahja, Bo Ryu

Minimizing job scheduling time is a fundamental issue in data center networks that has been extensively studied in recent years. The incoming jobs require different CPU and memory units, and span different number of time slots. The…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-21 Weijia Chen, Yuedong Xu, Xiaofeng Wu

We are interested in the optimal scheduling of a collection of multi-component application jobs in an edge computing system that consists of geo-distributed edge computing nodes connected through a wide area network. The scheduling and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-24 Zhi Cao, Honggang Zhang, Yu Cao, Benyuan Liu

While machine learning is traditionally a resource intensive task, embedded systems, autonomous navigation and the vision of the Internet-of-Things fuel the interest in resource efficient approaches. These approaches require a carefully…

Deep learning has been effectively applied to many discrete optimization problems. However, learning-based scheduling on unrelated parallel machines remains particularly difficult to design. Not only do the numbers of jobs and machines…

Machine Learning · Computer Science 2025-12-23 Diego Hitzges, Guillaume Sagnol

Edge computing has become a promising computing paradigm for building IoT (Internet of Things) applications, particularly for applications with specific constraints such as latency or privacy requirements. Due to resource constraints at the…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-08-15 Fei Hu, Kunal Mehta, Shivakant Mishra, Mohammad AlMutawa

Many real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) to process inference tasks. Edge computing is considered a key infrastructure to deploy such applications, as moving…

As deep neural networks (DNNs) are being applied to a wide range of edge intelligent applications, it is critical for edge inference platforms to have both high-throughput and low-latency at the same time. Such edge platforms with multiple…

Machine Learning · Computer Science 2023-05-03 Ziyang Zhang, Huan Li, Yang Zhao, Changyao Lin, Jie Liu

After completing the design and training phases, deploying a deep learning model onto specific hardware is essential before practical implementation. Targeted optimizations are necessary to enhance the model's performance by reducing…

Human-Computer Interaction · Computer Science 2023-08-10 Laixin Xie, Chenyang Zhang, Ruofei Ma, Xing Jiang, Xingxing Xing, Wei Wan, Quan Li

Accurately estimating workload runtime is a longstanding goal in computer systems, and plays a key role in efficient resource provisioning, latency minimization, and various other system management tasks. Runtime prediction is particularly…

Machine Learning · Computer Science 2025-03-11 Tianshu Huang, Arjun Ramesh, Emily Ruppel, Nuno Pereira, Anthony Rowe, Carlee Joe-Wong

Motivated by the proliferation of Internet-of-Thing (IoT) devices and the rapid advances in the field of deep learning, there is a growing interest in pushing deep learning computations, conventionally handled by the cloud, to the edge of…

Machine Learning · Computer Science 2024-09-25 Marco Palena, Tania Cerquitelli, Carla Fabiana Chiasserini