Machine Learning · Computer Science
Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs
Alexandros Kouris, Stylianos I. Venieris, Stefanos Laskaridis, Nicholas D. Lane
2023-08-08
Distributed, Parallel, and Cluster Computing · Computer Science
Task-based preemptive scheduling on FPGAs leveraging partial reconfiguration
Gabriel Rodriguez-Canal, Nick Brown, Yuri Torres, Arturo Gonzalez-Escribano
2023-01-19
Distributed, Parallel, and Cluster Computing · Computer Science
Preemption Aware Task Scheduling for Priority and Deadline Constrained DNN Inference Task Offloading in Homogeneous Mobile-Edge Networks
Jamie Cotter, Ignacio Castineiras, Donna O'Shea, Victor Cionca
2025-04-24
Machine Learning · Computer Science
Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference
Mohammad Siavashi, Faezeh Keshmiri Dindarloo, Dejan Kostic, Marco Chiesa
2025-03-13
Distributed, Parallel, and Cluster Computing · Computer Science
Automated Runtime-Aware Scheduling for Multi-Tenant DNN Inference on GPU
Fuxun Yu, Shawn Bray, Di Wang, Longfei Shangguan, and 3 others
2021-11-30
Data Structures and Algorithms · Computer Science
Online Non-preemptive Scheduling on Unrelated Machines with Rejections
Giorgio Lucarelli, Benjamin Moseley, Nguyen Kim Thang, Abhinav Srivastav, and 1 other
2018-03-01
Distributed, Parallel, and Cluster Computing · Computer Science
An efficient cloud scheduler design supporting preemptible instances
Álvaro López García, Enol Fernández-del-Castillo, Isabel Campos Plasencia
2020-01-29
Distributed, Parallel, and Cluster Computing · Computer Science
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
Aditya Dhakal, Sameer G. Kulkarni, K. K. Ramakrishnan
2023-04-27
Distributed, Parallel, and Cluster Computing · Computer Science
An efficient and flexible inference system for serving heterogeneous ensembles of deep neural networks
Pierrick Pochelu, Serge G. Petiton, Bruno Conche
2022-08-31
Operating Systems · Computer Science
Dynamic Ready Queue Based Process Priority Scheduling Algorithm
Raghav Dalmia, Aryaman Sinha, Ruchi Verma, P. K. Gupta
2022-05-17
Distributed, Parallel, and Cluster Computing · Computer Science
Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?
Seyed Morteza Nabavinejad, Masoumeh Ebrahimi, Sherief Reda
2023-08-29
Data Structures and Algorithms · Computer Science
Improved Approximation Algorithms for Non-Preemptive Throughput Maximization
Alexander Armbruster, Fabrizio Grandoni, Antoine Tinguely, Andreas Wiese
2026-04-01
Distributed, Parallel, and Cluster Computing · Computer Science
Energy and Time Efficient Scheduling of Tasks with Dependencies on Asymmetric Multiprocessors
Ioannis Chatzigiannakis, Georgios Giannoulis, Paul G. Spirakis
2008-06-09
Distributed, Parallel, and Cluster Computing · Computer Science
SmartPQ: An Adaptive Concurrent Priority Queue for NUMA Architectures
Christina Giannoula, Foteini Strati, Dimitrios Siakavaras, Georgios Goumas, and 1 other
2024-06-12
Distributed, Parallel, and Cluster Computing · Computer Science
Fault-tolerant parallel scheduling of arbitrary length jobs on a shared channel
Marek Klonowski, Dariusz R. Kowalski, Jarosław Mirek, Prudence W. H. Wong
2018-07-26
Hardware Architecture · Computer Science
From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs
Tianhao Zhu, Dahu Feng, Erhu Feng, Yubin Xia
2025-10-08
Machine Learning · Computer Science
Towards A Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms
Jingran Shen, Nikos Tziritas, Georgios Theodoropoulos
2024-07-02
Networking and Internet Architecture · Computer Science
Adaptive Scheduling for Edge-Assisted DNN Serving
Jian He, Chenxi Yang, Zhaoyuan He, Ghufran Baig, and 1 other
2023-05-04