Related papers: MNN: A Universal and Efficient Inference Engine

200 papers

Deep learning models are being deployed in many mobile intelligent applications. End-side services, such as intelligent personal assistants, autonomous cars, and smart home services often employ either simple local models on the mobile or…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-06 Amir Erfan Eshratifar, Mohammad Saeed Abrishami, Massoud Pedram

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges…

Machine Learning · Computer Science 2020-04-24 Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

Large language models (LLMs) have demonstrated exceptional performance across a variety of tasks. However, their substantial scale leads to significant computational resource consumption during inference, resulting in high costs.…

Machine Learning · Computer Science 2025-06-13 Zhaode Wang, Jingbang Yang, Xinyu Qian, Shiwen Xing, Xiaotang Jiang, Chengfei Lv, Shengyu Zhang

Running deep neural network (DNN) inference on mobile devices, i.e., mobile inference, has become a growing trend, making inference less dependent on network connections and keeping private data locally. The prior studies on optimizing DNNs…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-04 Luting Yang, Bingqian Lu, Shaolei Ren

Deep Neural Networks are allowing mobile devices to incorporate a wide range of features into user applications. However, the computational complexity of these models makes it difficult to run them effectively on resource-constrained mobile…

Performance · Computer Science 2020-04-02 Samuel S. Ogden, Tian Guo

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition. However, their superior performance comes at the…

Machine Learning · Computer Science 2022-04-26 Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han

In this paper, we explore optimizations to run Recurrent Neural Network (RNN) models locally on mobile devices. RNN models are widely used for Natural Language Processing, Machine Translation, and other tasks. However, existing mobile…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-06-06 Qingqing Cao, Niranjan Balasubramanian, Aruna Balasubramanian

Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inferencing solutions in computer vision, natural language processing, and virtual reality, etc. However,…

Machine Learning · Computer Science 2022-08-29 Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen

It is always well believed that Binary Neural Networks (BNNs) could drastically accelerate the inference efficiency by replacing the arithmetic operations in float-valued Deep Neural Networks (DNNs) with bit-wise operations. Nevertheless,…

Computer Vision and Pattern Recognition · Computer Science 2019-08-19 Jianhao Zhang, Yingwei Pan, Ting Yao, He Zhao, Tao Mei

Deep Neural Networks (DNNs) are increasingly deployed across diverse industries, driving demand for mobile device support. However, existing mobile inference frameworks often rely on a single processor per model, limiting hardware…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-28 Yunquan Gao, Zhiguo Zhang, Praveen Kumar Donta, Chinmaya Kumar Dehury, Xiujun Wang, Dusit Niyato, Qiyang Zhang

With the rapid emergence of a spectrum of high-end mobile devices, many applications that required desktop-level computation capability formerly can now run on these devices without any problem. However, without a careful optimization,…

Machine Learning · Computer Science 2019-05-03 Wei Niu, Xiaolong Ma, Yanzhi Wang, Bin Ren

Deploying deep neural networks (DNNs) on resource-constrained mobile devices presents significant challenges, particularly in achieving real-time performance while simultaneously coping with limited computational resources and battery life.…

Networking and Internet Architecture · Computer Science 2025-09-24 Zekai Sun, Xiuxian Guan, Zheng Lin, Zihan Fang, Xiangming Cai, Zhe Chen, Fangming Liu, Heming Cui, Jie Xiong, Wei Ni, Chau Yuen

With smartphones' omnipresence in people's pockets, Machine Learning (ML) on mobile is gaining traction as devices become more powerful. With applications ranging from visual filters to voice assistants, intelligence on mobile comes in many…

Machine Learning · Computer Science 2021-09-30 Mario Almeida, Stefanos Laskaridis, Abhinav Mehrotra, Lukasz Dudziak, Ilias Leontiadis, Nicholas D. Lane

While machine learning is traditionally a resource intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully…

Ensembles of Deep Neural Networks (DNNs) have achieved qualitative predictions but they are computing and memory intensive. Therefore, the demand is growing to make them answer a heavy workload of requests with available computational…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-31 Pierrick Pochelu, Serge G. Petiton, Bruno Conche

Deep neural networks (DNNs) have been increasingly deployed on and integrated with edge devices, such as mobile phones, drones, robots and wearables. To run DNN inference directly on edge devices (a.k.a. edge inference) with a satisfactory…

Machine Learning · Computer Science 2020-09-18 Bingqian Lu, Jianyi Yang, Shaolei Ren

With the rapid development of Deep Learning, more and more applications on the cloud and edge tend to utilize large DNN (Deep Neural Network) models for improved task execution efficiency as well as decision-making quality. Due to memory…

Machine Learning · Computer Science 2024-07-02 Jingran Shen, Nikos Tziritas, Georgios Theodoropoulos

With the emergence of a spectrum of high-end mobile devices, many applications that formerly required desktop-level computation capability are being transferred to these devices. However, executing the inference of Deep Neural Networks…

Machine Learning · Computer Science 2020-01-23 Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren

Modern mobile applications are benefiting significantly from the advancement in deep learning, e.g., implementing real-time image recognition and conversational system. Given a trained deep learning model, applications usually need to…

Performance · Computer Science 2019-03-01 Tian Guo

As the backbone technology of machine learning, deep neural networks (DNNs) have quickly ascended to the spotlight. Running DNNs on resource-constrained mobile devices is, however, by no means trivial, since it incurs high performance…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-12-31 En Li, Zhi Zhou, Xu Chen