English
Related papers

Related papers: StruM: Structured Mixed Precision for Efficient De…

200 papers

Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally…

Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which makes it challenging to implement…

Neural and Evolutionary Computing · Computer Science 2018-04-23 Shihui Yin , Gaurav Srivastava , Shreyas K. Venkataramanaiah , Chaitali Chakrabarti , Visar Berisha , Jae-sun Seo

Mixed-precision quantization is a popular approach for compressing deep neural networks (DNNs). However, it is challenging to scale the performance efficiently with mixed-precision DNNs given the current FPGA architecture and conventional…

Hardware Architecture · Computer Science 2023-11-07 Yuzong Chen , Jordan Dotzel , Mohamed S. Abdelfattah

Spiking Neural Networks (SNNs) have emerged as a promising approach to improve the energy efficiency of machine learning models, as they naturally implement event-driven computations while avoiding expensive multiplication operations. In…

Neural and Evolutionary Computing · Computer Science 2024-10-31 Anagha Nimbekar , Prabodh Katti , Chen Li , Bashir M. Al-Hashimi , Amit Acharyya , Bipin Rajendran

Compressing Deep Neural Network (DNN) models to alleviate the storage and computation requirements is essential for practical applications, especially for resource limited devices. Although capable of reducing a reasonable amount of model…

Machine Learning · Computer Science 2021-06-17 Sheng Lin , Wei Jiang , Wei Wang , Kaidi Xu , Yanzhi Wang , Shan Liu , Songnan Li

Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by reducing the number of model parameters over the course of training. However, most weight pruning techniques generally does not…

Machine Learning · Computer Science 2022-02-03 Bradley McDanel , Helia Dinh , John Magallanes

Weight pruning methods of DNNs have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods…

Neural and Evolutionary Computing · Computer Science 2019-03-28 Tianyun Zhang , Shaokai Ye , Kaiqi Zhang , Xiaolong Ma , Ning Liu , Linfeng Zhang , Jian Tang , Kaisheng Ma , Xue Lin , Makan Fardad , Yanzhi Wang

As deep neural network (DNN) models are growing exponentially in size, their deployment on resource-constrained edge platforms is becoming increasingly challenging. In-memory-computing (IMC) with non-volatile memories (NVMs) has emerged as…

Emerging Technologies · Computer Science 2026-04-07 Imtiaz Ahmed , Sumeet Kumar Gupta

Deploying mixed-precision neural networks on edge devices is friendly to hardware resources and power consumption. To support fully mixed-precision neural network inference, it is necessary to design flexible hardware accelerators for…

Hardware Architecture · Computer Science 2025-02-04 Liang Zhao , Kunming Shao , Fengshi Tian , Tim Kwang-Ting Cheng , Chi-Ying Tsui , Yi Zou

Hardware accelerations of deep learning systems have been extensively investigated in industry and academia. The aim of this paper is to achieve ultra-high energy efficiency and performance for hardware implementations of deep neural…

Machine Learning · Computer Science 2018-02-20 Yanzhi Wang , Caiwen Ding , Zhe Li , Geng Yuan , Siyu Liao , Xiaolong Ma , Bo Yuan , Xuehai Qian , Jian Tang , Qinru Qiu , Xue Lin

Scientific machine learning (SciML) has emerged as a versatile approach to address complex computational science and engineering problems. Within this field, physics-informed neural networks (PINNs) and deep operator networks (DeepONets)…

Machine Learning · Computer Science 2024-01-31 Joel Hayford , Jacob Goldman-Wetzler , Eric Wang , Lu Lu

In trained deep neural networks, unstructured pruning can reduce redundant weights to lower storage cost. However, it requires the customization of hardwares to speed up practical inference. Another trend accelerates sparse model inference…

Computer Vision and Pattern Recognition · Computer Science 2020-10-30 Zhuliang Yao , Shijie Cao , Wencong Xiao , Chen Zhang , Lanshun Nie

Sparse training is one of the promising techniques to reduce the computational cost of DNNs while retaining high accuracy. In particular, N:M fine-grained structured sparsity, where only N out of consecutive M elements can be nonzero, has…

Machine Learning · Computer Science 2023-09-25 Chao Fang , Wei Sun , Aojun Zhou , Zhongfeng Wang

With the widespread use of deep neural networks(DNNs) in intelligent systems, DNN accelerators with high performance and energy efficiency are greatly demanded. As one of the feasible processing-in-memory(PIM) architectures,…

Hardware Architecture · Computer Science 2023-12-22 Junpeng Wang , Mengke Ge , Bo Ding , Qi Xu , Song Chen , Yi Kang

As deep neural networks are growing in size and being increasingly deployed to more resource-limited devices, there has been a recent surge of interest in network pruning methods, which aim to remove less important weights or activations of…

Machine Learning · Computer Science 2020-06-23 Minyoung Song , Jaehong Yoon , Eunho Yang , Sung Ju Hwang

Real time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular…

Neural and Evolutionary Computing · Computer Science 2015-12-31 Sajid Anwar , Kyuyeon Hwang , Wonyong Sung

The substantial memory bandwidth and computational demands of large language models (LLMs) present critical challenges for efficient inference. To tackle this, the literature has explored heterogeneous systems that combine neural processing…

Hardware Architecture · Computer Science 2026-05-05 Yuzong Chen , Chao Fang , Xilai Dai , Yuheng Wu , Thierry Tambe , Marian Verhelst , Mohamed S. Abdelfattah

Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of domains. However, DNNs are becoming computationally intensive and energy hungry at an exponential pace, while at the same time, there is a vast demand for…

Large language models (LLMs) have demonstrated exceptional proficiency in understanding and generating human language, but efficient inference on resource-constrained embedded devices remains challenging due to large model sizes and…

Hardware Architecture · Computer Science 2025-07-15 Weihong Xu , Haein Choi , Po-kai Hsu , Shimeng Yu , Tajana Rosing

We evolve PyDTNN, a framework for distributed parallel training of Deep Neural Networks (DNNs), into an efficient inference tool for convolutional neural networks. Our optimization process on multicore ARM processors involves several…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-20 Adrián Castelló , Sergio Barrachina , Manuel F. Dolz , Enrique S. Quintana-Ortí , Pau San Juan
‹ Prev 1 2 3 10 Next ›