Related papers: Multi-user Co-inference with Batch Processing Capa…

200 papers

With the growing integration of artificial intelligence in mobile applications, a substantial number of deep neural network (DNN) inference requests are generated daily by mobile devices. Serving these requests presents significant…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-22 Yaodan Xu, Sheng Zhou, Zhisheng Niu

The deployment of inference services at the network edge, called edge inference, offloads computation-intensive inference tasks from mobile devices to edge servers, thereby enhancing the former's capabilities and battery life. In a…

Information Theory · Computer Science 2023-01-02 Zhiyan Liu, Qiao Lan, Kaibin Huang

Edge computing's growing prominence, due to its ability to reduce communication latency and enable real-time processing, is promoting the rise of high-performance, heterogeneous System-on-Chip solutions. While current approaches often…

Artificial Intelligence · Computer Science 2024-09-24 Rakshith Jayanth, Neelesh Gupta, Viktor Prasanna

GPU-accelerated computing is a key technology to realize high-speed inference servers using deep neural networks (DNNs). An important characteristic of GPU-based inference is that the computational efficiency, in terms of the processing…

Performance · Computer Science 2021-01-13 Yoshiaki Inoue

We investigate the problem of computation offloading in a mobile edge computing architecture, where multiple energy-constrained users compete to offload their computational tasks to multiple servers through a shared wireless medium. We…

Information Theory · Computer Science 2019-12-24 Navid Naderializadeh, Morteza Hashemi

Edge machine learning can deliver low-latency and private artificial intelligence (AI) services for mobile devices by leveraging computation and storage resources at the network edge. This paper presents an energy-efficient edge processing…

Information Theory · Computer Science 2020-03-03 Kai Yang, Yuanming Shi, Wei Yu, Zhi Ding

Deploying deep neural networks (DNNs) on resource-constrained mobile devices presents significant challenges, particularly in achieving real-time performance while simultaneously coping with limited computational resources and battery life.…

Networking and Internet Architecture · Computer Science 2025-09-24 Zekai Sun, Xiuxian Guan, Zheng Lin, Zihan Fang, Xiangming Cai, Zhe Chen, Fangming Liu, Heming Cui, Jie Xiong, Wei Ni, Chau Yuen

Ensembles of Deep Neural Networks (DNNs) have achieved qualitative predictions but they are computing and memory intensive. Therefore, the demand is growing to make them answer a heavy workload of requests with available computational…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-31 Pierrick Pochelu, Serge G. Petiton, Bruno Conche

Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems including (W)CSP, DCOP, as well as optimization in stochastic…

Artificial Intelligence · Computer Science 2018-01-12 Ferdinando Fioretto, Enrico Pontelli, William Yeoh, Rina Dechter

The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge. To address these demands, low-power AI accelerators, particularly GPUs, are…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-13 Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle

The development of mobile services has impacted a variety of computation-intensive and time-sensitive applications, such as recommendation systems and daily payment methods. However, computing task competition involving limited resources…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-12-10 Honghao Gao, Xuejie Wang, Xiaojin Ma, Wei Wei, Shahid Mumtaz

The inference of Neural Networks is usually restricted by the resources (e.g., computing power, memory, bandwidth) on edge devices. In addition to improving the hardware design and deploying efficient models, it is possible to aggregate the…

Machine Learning · Computer Science 2021-11-05 Jun-Liang Lin, Sheng-De Wang

We consider a heterogeneous network with mobile edge computing, where a user can offload its computation to one among multiple servers. In particular, we minimize the system-wide computation overhead by jointly optimizing the individual…

Networking and Internet Architecture · Computer Science 2018-03-05 Quoc-Viet Pham, Tuan LeAnh, Nguyen H. Tran, Choong Seon Hong

The success of deep neural networks (DNNs) is heavily dependent on computational resources. While DNNs are often employed on cloud servers, there is a growing need to operate DNNs on edge devices. Edge devices are typically limited in their…

Machine Learning · Computer Science 2022-06-08 May Malka, Erez Farhan, Hai Morgenstern, Nir Shlezinger

Unmanned aerial vehicles (UAVs) often collaborate by collecting and offloading sensing streams to an edge server, where a deep neural network (DNN) model performs cross-stream alignment, fusion, and inference. However, the coupling between…

Signal Processing · Electrical Eng. & Systems 2026-05-06 Yanan Du, Sai Xu, Yinbo Yu

With the rapid upsurge of deep learning tasks at the network edge, effective edge artificial intelligence (AI) inference becomes critical to provide low-latency intelligent services for mobile users by leveraging the edge computing…

Information Theory · Computer Science 2024-10-30 Xiangyu Yang, Sheng Hua, Yuanming Shi, Hao Wang, Jun Zhang, Khaled B. Letaief

Deep Neural Network (DNN) applications with edge computing present a trade-off between responsiveness and computational resources. On one hand, edge computing can provide high responsiveness by deploying computational resources close to end…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-29 Roberto G. Pacheco, Rodrigo S. Couto

Since emerging edge applications such as Internet of Things (IoT) analytics and augmented reality have tight latency constraints, hardware AI accelerators have been recently proposed to speed up deep neural network (DNN) inference run by…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-20 Qianlin Liang, Walid A. Hanafy, Ahmed Ali-Eldin, Prashant Shenoy

In 5G smart cities, edge computing is employed to provide nearby computing services for end devices, and the large-scale models (e.g., GPT and LLaMA) can be deployed at the network edge to boost the service quality. However, due to the…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-12 Zuan Xie, Yang Xu, Hongli Xu, Yunming Liao, Zhiyuan Yao

Motivated by the proliferation of Internet-of-Things (IoT) devices and the rapid advances in the field of deep learning, there is a growing interest in pushing deep learning computations, conventionally handled by the cloud, to the edge of…

Machine Learning · Computer Science 2024-09-25 Marco Palena, Tania Cerquitelli, Carla Fabiana Chiasserini