Related papers: Inference Time Optimization Using BranchyNet Parti…

A Survey on Deep Neural Network Partition over Cloud, Edge and End Devices

Deep neural network (DNN) partition is a research problem that involves splitting a DNN into multiple parts and offloading them to specific locations. Because of the recent advancement in multi-access edge computing and edge intelligence,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-04-21 Di Xu, Xiang He, Tonghua Su, Zhongjie Wang

Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput

Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these…

Networking and Internet Architecture · Computer Science 2022-10-25 Arjun Parthasarathy, Bhaskar Krishnamachari

Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing

As a key technology of enabling Artificial Intelligence (AI) applications in 5G era, Deep Neural Networks (DNNs) have quickly attracted widespread attention. However, it is challenging to run computation-intensive DNN-based tasks on mobile…

Networking and Internet Architecture · Computer Science 2019-10-14 En Li, Liekang Zeng, Zhi Zhou, Xu Chen

Dynamic DNN Decomposition for Lossless Synergistic Inference

Deep neural networks (DNNs) sustain high performance in today's data processing applications. DNN inference is resource-intensive thus is difficult to fit into a mobile device. An alternative is to offload the DNN inference to a cloud…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-18 Beibei Zhang, Tian Xiang, Hongxuan Zhang, Te Li, Shiqiang Zhu, Jianjun Gu

Robust DNN Partitioning and Resource Allocation Under Uncertain Inference Time

In edge intelligence systems, deep neural network (DNN) partitioning and data offloading can provide real-time task inference for resource-constrained mobile devices. However, the inference time of DNNs is typically uncertain and cannot be…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-24 Zhaojun Nan, Yunchu Han, Sheng Zhou, Zhisheng Niu

Scission: Performance-driven and Context-aware Cloud-Edge Distribution of Deep Neural Networks

Partitioning and distributing deep neural networks (DNNs) across end-devices, edge resources and the cloud has a potential twofold advantage: preserving privacy of the input data, and reducing the ingress bandwidth demand beyond the edge.…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-18 Luke Lockhart, Paul Harvey, Pierre Imai, Peter Willis, Blesson Varghese

Enabling Deep Learning on Edge Devices

Deep neural networks (DNNs) have succeeded in many different perception tasks, e.g., computer vision, natural language processing, reinforcement learning, etc. The high-performed DNNs heavily rely on intensive resource consumption. For…

Machine Learning · Computer Science 2022-10-10 Zhongnan Qu

Joint Multi-User DNN Partitioning and Computational Resource Allocation for Collaborative Edge Intelligence

Mobile Edge Computing (MEC) has emerged as a promising supporting architecture providing a variety of resources to the network edge, thus acting as an enabler for edge intelligence services empowering massive mobile and Internet of Things…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-20 Xin Tang, Xu Chen, Liekang Zeng, Shuai Yu, Lin Chen

A Case For Adaptive Deep Neural Networks in Edge Computing

Edge computing offers an additional layer of compute infrastructure closer to the data source before raw data from privacy-sensitive and performance-critical applications is transferred to a cloud data center. Deep Neural Networks (DNNs)…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-17 Francis McNamee, Schahram Dustdar, Peter Kilpatrick, Weisong Shi, Ivor Spence, Blesson Varghese

Modeling of Deep Neural Network (DNN) Placement and Inference in Edge Computing

With the edge computing becoming an increasingly adopted concept in system architectures, it is expected its utilization will be additionally heightened when combined with deep learning (DL) techniques. The idea behind integrating demanding…

Networking and Internet Architecture · Computer Science 2020-03-12 Mounir Bensalem, Jasenka Dizdarević, Admela Jukan

Partitioning and Deployment of Deep Neural Networks on Edge Clusters

Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these…

Networking and Internet Architecture · Computer Science 2023-04-25 Arjun Parthasarathy, Bhaskar Krishnamachari

Where to Split? A Pareto-Front Analysis of DNN Partitioning for Edge Inference

The deployment of deep neural networks (DNNs) on resource-constrained edge devices is frequently hindered by their significant computational and memory requirements. While partitioning and distributing a DNN across multiple devices is a…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-14 Adiba Masud, Nicholas Foley, Pragathi Durga Rajarajan, Palden Lama

Scaling Up Deep Neural Network Optimization for Edge Inference

Deep neural networks (DNNs) have been increasingly deployed on and integrated with edge devices, such as mobile phones, drones, robots and wearables. To run DNN inference directly on edge devices (a.k.a. edge inference) with a satisfactory…

Machine Learning · Computer Science 2020-09-18 Bingqian Lu, Jianyi Yang, Shaolei Ren

Calibration-Aided Edge Inference Offloading via Adaptive Model Partitioning of Deep Neural Networks

Mobile devices can offload deep neural network (DNN)-based inference to the cloud, overcoming local hardware and energy limitations. However, offloading adds communication delay, thus increasing the overall inference time, and hence it…

Machine Learning · Computer Science 2021-01-29 Roberto G. Pacheco, Rodrigo S. Couto, Osvaldo Simeone

Optimization Framework for Splitting DNN Inference Jobs over Computing Networks

Ubiquitous artificial intelligence (AI) is considered one of the key services in 6G systems. AI services typically rely on deep neural network (DNN) requiring heavy computation. Hence, in order to support ubiquitous AI, it is crucial to…

Networking and Internet Architecture · Computer Science 2022-07-27 Sehun Jung, Hyang-Won Lee

Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy

As the backbone technology of machine learning, deep neural networks (DNNs) have quickly ascended to the spotlight. Running DNNs on resource-constrained mobile devices is, however, by no means trivial, since it incurs high performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-12-31 En Li, Zhi Zhou, Xu Chen

Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference Services

Edge inference (EI) has emerged as a promising paradigm to address the growing limitations of cloud-based Deep Neural Network (DNN) inference services, such as high response latency, limited scalability, and severe data privacy exposure.…

Machine Learning · Computer Science 2025-05-30 Zhipeng Cheng, Xiaoyu Xia, Hong Wang, Minghui Liwang, Ning Chen, Xuwei Fan, Xianbin Wang

Distributed Deep Neural Networks over the Cloud, the Edge and End Devices

We propose distributed deep neural networks (DDNNs) over distributed computing hierarchies, consisting of the cloud, the edge (fog) and end devices. While being able to accommodate inference of a deep neural network (DNN) in the cloud, a…

Computer Vision and Pattern Recognition · Computer Science 2017-09-08 Surat Teerapittayanon, Bradley McDanel, H. T. Kung

Decentralized Low-Latency Collaborative Inference via Ensembles on the Edge

The success of deep neural networks (DNNs) is heavily dependent on computational resources. While DNNs are often employed on cloud servers, there is a growing need to operate DNNs on edge devices. Edge devices are typically limited in their…

Machine Learning · Computer Science 2022-06-08 May Malka, Erez Farhan, Hai Morgenstern, Nir Shlezinger

Edge-Host Partitioning of Deep Neural Networks with Feature Space Encoding for Resource-Constrained Internet-of-Things Platforms

This paper introduces partitioning an inference task of a deep neural network between an edge and a host platform in the IoT environment. We present a DNN as an encoding pipeline, and propose to transmit the output feature space of an…

Computer Vision and Pattern Recognition · Computer Science 2018-02-13 Jong Hwan Ko, Taesik Na, Mohammad Faisal Amir, Saibal Mukhopadhyay