Related papers: Mobile-Cloud Inference for Collaborative Intellige…

Shared Mobile-Cloud Inference for Collaborative Intelligence

As AI applications for mobile devices become more prevalent, there is an increasing need for faster execution and lower energy consumption for neural model inference. Historically, the models run on mobile devices have been smaller and…

Artificial Intelligence · Computer Science 2020-02-04 Mateen Ulhaq, Ivan V. Bajić

Towards Collaborative Intelligence Friendly Architectures for Deep Learning

Modern mobile devices are equipped with high-performance hardware resources such as graphics processing units (GPUs), making the end-side intelligent services more feasible. Even recently, specialized silicons as neural engines are being…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-04 Amir Erfan Eshratifar, Amirhossein Esmaili, Massoud Pedram

Cloud-based or On-device: An Empirical Study of Mobile Deep Inference

Modern mobile applications are benefiting significantly from the advancement in deep learning, e.g., implementing real-time image recognition and conversational systems. Given a trained deep learning model, applications usually need to…

Performance · Computer Science 2019-03-01 Tian Guo

Multi-task learning with compressible features for Collaborative Intelligence

A promising way to deploy Artificial Intelligence (AI)-based services on mobile devices is to run a part of the AI model (a deep neural network) on the mobile itself, and the rest in the cloud. This is sometimes referred to as collaborative…

Multimedia · Computer Science 2019-05-17 Saeed Ranjbar Alvar, Ivan V. Bajić

Runtime Deep Model Multiplexing for Reduced Latency and Energy Consumption Inference

We propose a learning algorithm to design a light-weight neural multiplexer that given the input and computational resource requirements, calls the model that will consume the minimum compute resources for a successful inference. Mobile…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-09-18 Amir Erfan Eshratifar, Massoud Pedram

MDInference: Balancing Inference Accuracy and Latency for Mobile Applications

Deep Neural Networks are allowing mobile devices to incorporate a wide range of features into user applications. However, the computational complexity of these models makes it difficult to run them effectively on resource-constrained mobile…

Performance · Computer Science 2020-04-02 Samuel S. Ogden, Tian Guo

Near-Lossless Deep Feature Compression for Collaborative Intelligence

Collaborative intelligence is a new paradigm for efficient deployment of deep neural networks across the mobile-cloud infrastructure. By dividing the network between the mobile and the cloud, it is possible to distribute the computational…

Image and Video Processing · Electrical Eng. & Systems 2018-06-19 Hyomin Choi, Ivan V. Bajic

Combining Cloud and Mobile Computing for Machine Learning

Although the computing power of mobile devices is increasing, machine learning models are also growing in size. This trend creates problems for mobile devices due to limitations like their memory capacity and battery life. While many…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-02-27 Ruiqi Xu, Tianchi Zhang

Collaborative Inference for AI-Empowered IoT Devices

Artificial intelligence (AI) technologies, and particularly deep learning systems, are traditionally the domain of large-scale cloud servers, which have access to high computational and energy resources. Nonetheless, in Internet-of-Things…

Signal Processing · Electrical Eng. & Systems 2022-07-26 Nir Shlezinger, Ivan V. Bajic

Privacy-Preserving Deep Inference for Rich User Data on The Cloud

Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator…

Computer Vision and Pattern Recognition · Computer Science 2017-10-13 Seyed Ali Osia, Ali Shahin Shamsabadi, Ali Taheri, Kleomenis Katevas, Hamid R. Rabiee, Nicholas D. Lane, Hamed Haddadi

JointDNN: An Efficient Training and Inference Engine for Intelligent Mobile Cloud Computing Services

Deep learning models are being deployed in many mobile intelligent applications. End-side services, such as intelligent personal assistants, autonomous cars, and smart home services often employ either simple local models on the mobile or…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-06 Amir Erfan Eshratifar, Mohammad Saeed Abrishami, Massoud Pedram

Not Just Privacy: Improving Performance of Private Deep Learning in Mobile Cloud

The increasing demand for on-device deep learning services calls for a highly efficient manner to deploy deep neural networks (DNNs) on mobile devices with limited capacity. The cloud-based solution is a promising approach to enabling deep…

Machine Learning · Computer Science 2019-01-08 Ji Wang, Jianguo Zhang, Weidong Bao, Xiaomin Zhu, Bokai Cao, Philip S. Yu

PrivaScissors: Enhance the Privacy of Collaborative Inference through the Lens of Mutual Information

Edge-cloud collaborative inference empowers resource-limited IoT devices to support deep learning applications without disclosing their raw data to the cloud server, thus preserving privacy. Nevertheless, prior research has shown that…

Cryptography and Security · Computer Science 2023-06-16 Lin Duan, Jingwei Sun, Yiran Chen, Maria Gorlatova

Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge

Recently, deep neural networks (DNNs) have been widely applied in mobile intelligent applications. The inference for the DNNs is usually performed in the cloud. However, it leads to a large overhead of transmitting data via wireless…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-12-19 Guangli Li, Lei Liu, Xueying Wang, Xiao Dong, Peng Zhao, Xiaobing Feng

Distributed Inference on Mobile Edge and Cloud: A Data-Cartography based Clustering Approach

The large size of DNNs poses a significant challenge for deployment on devices with limited resources, such as mobile, edge, and IoT platforms. To address this issue, a distributed inference framework can be utilized. In this framework, a…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-24 Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference

The increasing deployment of deep neural networks (DNNs) in cyber-physical systems (CPS) enhances perception fidelity, but imposes substantial computational demands on execution platforms, posing challenges to real-time control deadlines.…

Machine Learning · Computer Science 2026-05-04 Pragya Sharma, Hang Qiu, Mani Srivastava

PriMask: Cascadable and Collusion-Resilient Data Masking for Mobile Cloud Inference

Mobile cloud offloading is indispensable for inference tasks based on large-scale deep models. However, transmitting privacy-rich inference data to the cloud incurs concerns. This paper presents the design of a system called PriMask, in…

Cryptography and Security · Computer Science 2022-11-15 Linshan Jiang, Qun Song, Rui Tan, Mo Li

Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions

The conventional cloud-based large model learning framework is increasingly constrained by latency, cost, personalization, and privacy concerns. In this survey, we explore an emerging paradigm: collaborative learning between on-device small…

Machine Learning · Computer Science 2025-04-23 Chaoyue Niu, Yucheng Ding, Junhui Lu, Zhengxiang Huang, Hang Zeng, Yutong Dai, Xuezhen Tu, Chengfei Lv, Fan Wu, Guihai Chen

Compressing Representations for Embedded Deep Learning

Despite recent advances in architectures for mobile devices, deep learning computational requirements remain prohibitive for most embedded devices. To address that issue, we envision sharing the computational costs of inference between…

Machine Learning · Computer Science 2019-11-26 Juliano S. Assine, Alan Godoy, Eduardo Valle

Calibration-Aided Edge Inference Offloading via Adaptive Model Partitioning of Deep Neural Networks

Mobile devices can offload deep neural network (DNN)-based inference to the cloud, overcoming local hardware and energy limitations. However, offloading adds communication delay, thus increasing the overall inference time, and hence it…

Machine Learning · Computer Science 2021-01-29 Roberto G. Pacheco, Rodrigo S. Couto, Osvaldo Simeone