English
Related papers

Related papers: Improving Memory Utilization in Convolutional Neur…

200 papers

"Lightweight convolutional neural networks" is an important research topic in the field of embedded vision. To implement image recognition tasks on a resource-limited hardware platform, it is necessary to reduce the memory size and the…

Computer Vision and Pattern Recognition · Computer Science 2021-04-30 Tse-Wei Chen , Motoki Yoshinaga , Hongxing Gao , Wei Tao , Dongchao Wen , Junjie Liu , Kinya Osa , Masami Kato

Processing in-memory (PIM) is promising to accelerate neural networks (NNs) because it minimizes data movement and provides large computational parallelism. Similar to machine learning accelerators, application mapping, which determines the…

Hardware Architecture · Computer Science 2024-07-02 Xuan Wang , Minxuan Zhou , Tajana Rosing

Machine intelligence, especially using convolutional neural networks (CNNs), has become a large area of research over the past years. Increasingly sophisticated hardware accelerators are proposed that exploit e.g. the sparsity in…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-23 Andreas Bytyn , René Ahlsdorf , Rainer Leupers , Gerd Ascheid

Existing deep convolutional neural networks (CNNs) generate massive interlayer feature data during network inference. To maintain real-time processing in embedded systems, large on-chip memory is required to buffer the interlayer feature…

Hardware Architecture · Computer Science 2021-10-13 Zhuang Shao , Xiaoliang Chen , Li Du , Lei Chen , Yuan Du , Wei Zhuang , Huadong Wei , Chenjia Xie , Zhongfeng Wang

Leveraging large data sets, deep Convolutional Neural Networks (CNNs) achieve state-of-the-art recognition accuracy. Due to the substantial compute and memory operations, however, they require significant execution time. The massive…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-10-13 Chao Li , Yi Yang , Min Feng , Srimat Chakradhar , Huiyang Zhou

Deformable convolutional networks have demonstrated outstanding performance in object recognition tasks with an effective feature extraction. Unlike standard convolution, the deformable convolution decides the receptive field size using…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-06-16 Saehyun Ahn , Jung-Woo Chang , Suk-Ju Kang

The deep learning revolution brought us an extensive array of neural network architectures that achieve state-of-the-art performance in a wide variety of Computer Vision tasks including among others, classification, detection and…

Computer Vision and Pattern Recognition · Computer Science 2019-03-28 Georgios Georgiadis

After the tremendous success of convolutional neural networks in image classification, object detection, speech recognition, etc., there is now rising demand for deployment of these compute-intensive ML models on tightly power constrained…

Computer Vision and Pattern Recognition · Computer Science 2019-03-05 Lukas Cavigelli , Luca Benini

Most of the existing work on FPGA acceleration of Convolutional Neural Network (CNN) focus on employing a single strategy (algorithm, dataflow, etc.) across all the layers. Such an approach does not achieve optimal latency on complex and…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-03-16 Yuan Meng , Sanmukh Kuppannagari , Rajgopal Kannan , Viktor Prasanna

Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e.,…

Hardware Architecture · Computer Science 2025-07-23 Jan Klhufek , Miroslav Safar , Vojtech Mrazek , Zdenek Vasicek , Lukas Sekanina

Deploying deep Convolutional Neural Networks (CNNs) is impacted by their memory footprint and speed requirements, which mainly come from convolution. Widely-used convolution algorithms, im2col and MEC, produce a lowered matrix from an…

Computer Vision and Pattern Recognition · Computer Science 2021-04-20 Hossam Amer , Ahmed H. Salamah , Ahmad Sajedi , En-hui Yang

Photonic technologies have shown a promising way to build high-speed and high-energy-efficiency neural network accelerators. In previously presented photonic neural networks, architectures are mainly designed for fully-connected layers.…

Signal Processing · Electrical Eng. & Systems 2020-03-02 Shaofu Xu , Jing Wang , Weiwen Zou

Recent advances in deep learning have improved 3D point cloud registration but increased graphics processing unit (GPU) memory usage, often requiring preliminary sampling that reduces accuracy. We propose an overlapping region sampling…

Computer Vision and Pattern Recognition · Computer Science 2024-10-30 Tomoyasu Shimada , Kazuhiko Murasaki , Shogo Sato , Toshihiko Nishimura , Taiga Yoshida , Ryuichi Tanida

Convolution is a fundamental operation in many applications, such as computer vision, natural language processing, image processing, etc. Recent successes of convolutional neural networks in various deep learning applications put even…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-05-31 Xiaoming Chen , Jianxu Chen , Danny Z. Chen , Xiaobo Sharon Hu

The scaling of neural networks with increasing data and model sizes necessitates the development of more efficient deep learning algorithms. A significant challenge in neural network training is the memory footprint associated with…

Machine Learning · Computer Science 2024-10-08 Georgii Novikov , Ivan Oseledets

Convolutional neural network (CNN) accelerators are being widely used for their efficiency, but they require a large amount of memory, leading to the use of a slow and power consuming external memory. This paper exploits two schemes to…

Hardware Architecture · Computer Science 2022-12-23 Hyeong-Ju Kang

Training convolutional neural network models is memory intensive since back-propagation requires storing activations of all intermediate layers. This presents a practical concern when seeking to deploy very deep architectures in production,…

Machine Learning · Computer Science 2019-10-30 Ayan Chakrabarti , Benjamin Moseley

To achieve high accuracy, convolutional neural networks (CNNs) are increasingly growing in complexity and diversity in layer types and topologies. This makes it very challenging to efficiently deploy such networks on custom processor…

Systems and Control · Electrical Eng. & Systems 2024-06-21 Steven Colleman , Man Shi , Marian Verhelst

The excellent performance of deep neural networks has enabled us to solve several automatization problems, opening an era of autonomous devices. However, current deep net architectures are heavy with millions of parameters and require…

Computer Vision and Pattern Recognition · Computer Science 2018-07-06 Dat Thanh Tran , Alexandros Iosifidis , Moncef Gabbouj

The ever-growing scale of deep neural networks (DNNs) has lead to an equally rapid growth in computational resource requirements. Many recent architectures, most prominently Large Language Models, have to be trained using supercomputers…

Machine Learning · Computer Science 2024-09-19 Daniel Barley , Holger Fröning
‹ Prev 1 2 3 10 Next ›