Related papers: Accelerated CNN Training Through Gradient Approxim…

200 papers

This paper aims to accelerate the test-time computation of convolutional neural networks (CNNs), especially very deep CNNs that have substantially impacted the computer vision community. Unlike previous methods that are designed for…

Computer Vision and Pattern Recognition · Computer Science 2015-11-19 Xiangyu Zhang, Jianhua Zou, Kaiming He, Jian Sun

Training deep Convolutional Neural Networks (CNNs) is a time-consuming task that may take weeks to complete. In this article we propose a novel, theoretically founded method for reducing CNN training time without incurring any loss in…

Computer Vision and Pattern Recognition · Computer Science 2016-10-13 Pedro Porto Buarque de Gusmão, Gianluca Francini, Skjalg Lepsøy, Enrico Magli

Synchronized stochastic gradient descent (SGD) optimizers with data parallelism are widely used in training large-scale deep neural networks. Although using larger mini-batch sizes can improve the system scalability by reducing the…

This paper aims to accelerate the test-time computation of deep convolutional neural networks (CNNs). Unlike existing methods that are designed for approximating linear filters or linear responses, our method takes the nonlinear units into…

Computer Vision and Pattern Recognition · Computer Science 2014-11-18 Xiangyu Zhang, Jianhua Zou, Xiang Ming, Kaiming He, Jian Sun

Traditional CNN models are trained and tested on relatively low resolution images (<300 px), and cannot be directly operated on large-scale images due to compute and memory constraints. We propose Patch Gradient Descent (PatchGD), an…

Computer Vision and Pattern Recognition · Computer Science 2023-02-01 Deepak K. Gupta, Gowreesh Mago, Arnav Chavan, Dilip K. Prasad

This paper focuses on improving the efficiency of sparse convolutional neural network (CNN) layers on graphics processing units (GPUs). The Nvidia deep neural network (cuDNN) library provides the most effective implementation…

Machine Learning · Computer Science 2022-01-03 Marcin Pietroń, Dominik Żurek

We propose a novel technique for faster deep neural network training which systematically applies sample-based approximation to the constituent tensor operations, i.e., matrix multiplications and convolutions. We introduce new sampling…

Machine Learning · Computer Science 2021-10-27 Menachem Adelman, Kfir Y. Levy, Ido Hakimi, Mark Silberstein

Approximate computing methods have shown great potential for deep learning. Due to the reduced hardware costs, these methods are especially suitable for inference tasks on battery-operated devices that are constrained by their power budget.…

Machine Learning · Computer Science 2023-04-11 Tianmu Li, Shurui Li, Puneet Gupta

Convolutional Neural Networks (CNNs) have revolutionized computer vision, but training very deep networks has been challenging due to the vanishing gradient problem. This paper explores Residual Networks (ResNet), introduced by He et al.…

Computer Vision and Pattern Recognition · Computer Science 2025-10-29 Xingyu Liu, Kun Ming Goh

Improving the performance of deep learning models and reducing their training times are ongoing challenges in deep neural networks. Several approaches have been proposed to address these challenges, one of which is to increase the depth of the…

Machine Learning · Computer Science 2020-06-20 Sunitha Basodi, Chunyan Ji, Haiping Zhang, Yi Pan

Convolutional neural networks (CNNs) have achieved major breakthroughs in recent years. Their performance in computer vision has matched and in some areas even surpassed human capabilities. Deep neural networks can capture complex…

Computer Vision and Pattern Recognition · Computer Science 2016-05-23 Philipp Gysel

Typically, an ultra-deep neural network (UDNN) tends to yield a high-quality model, but its training process is usually resource-intensive and time-consuming. The scarce DRAM capacity of modern GPUs is the primary bottleneck that hinders the…

Machine Learning · Computer Science 2019-06-21 Jinrong Guo, Wantao Liu, Wang Wang, Qu Lu, Songlin Hu, Jizhong Han, Ruixuan Li

Long training times for high-accuracy deep neural networks (DNNs) impede research into new DNN architectures and slow the development of high-accuracy DNNs. In this paper we present FireCaffe, which successfully scales deep neural network…

Computer Vision and Pattern Recognition · Computer Science 2016-01-11 Forrest N. Iandola, Khalid Ashraf, Matthew W. Moskewicz, Kurt Keutzer

Deep learning has led to tremendous advancements in the field of Artificial Intelligence. One caveat, however, is the substantial amount of compute needed to train these deep learning models. Training a benchmark dataset like ImageNet on a…

Machine Learning · Computer Science 2018-10-30 Karanbir Chahal, Manraj Singh Grover, Kuntal Dey

Graph Neural Networks (GNNs) are powerful deep learning models to generate node embeddings on graphs. When applying deep GNNs on large graphs, it is still challenging to perform training in an efficient and scalable way. We propose a novel…

Machine Learning · Computer Science 2020-10-08 Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna

We present the remote stochastic gradient (RSG) method, which computes the gradients at configurable remote observation points, in order to improve the convergence rate and suppress gradient noise at the same time for different curvatures.…

Machine Learning · Computer Science 2020-09-08 Yushu Chen, Hao Jing, Wenlai Zhao, Zhiqiang Liu, Ouyi Li, Liang Qiao, Wei Xue, Guangwen Yang

Machine-learning architectures, such as Convolutional Neural Networks (CNNs), are vulnerable to adversarial attacks: inputs crafted carefully to force the system output to a wrong label. Since machine learning is being deployed in…

Cryptography and Security · Computer Science 2022-11-03 Amira Guesmi, Ihsen Alouani, Khaled N. Khasawneh, Mouna Baklouti, Tarek Frikha, Mohamed Abid, Nael Abu-Ghazaleh

The speed of deep neural network training has become a major bottleneck of deep learning research and development. For example, training GoogleNet on the ImageNet dataset on one Nvidia K20 GPU takes 21 days. To speed up the training process, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-08-11 Yang You, Aydin Buluc, James Demmel

The creation of practical deep learning data-products often requires parallelization across processors and computers to make deep learning feasible on large data sets, but bottlenecks in communication bandwidth make it difficult to attain…

Neural and Evolutionary Computing · Computer Science 2016-02-22 Tim Dettmers

Energy efficiency of hardware accelerators of deep neural networks (DNN) can be improved by introducing approximate arithmetic circuits. In order to quantify the error introduced by using these circuits and avoid the expensive hardware…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-03 Filip Vaverka, Vojtech Mrazek, Zdenek Vasicek, Lukas Sekanina