
Related papers: Thick-Net: Parallel Network Structure for Sequenti…

200 papers

Recurrent neural networks have achieved great success in many NLP tasks. However, they have difficulty in parallelization because of the recurrent structure, so it takes much time to train RNNs. In this paper, we introduce sliced recurrent…

Computation and Language · Computer Science 2018-07-09 Zeping Yu, Gongshen Liu

The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks. Current approaches parallelize training onto multiple devices by applying a single parallelization strategy (e.g.,…

Machine Learning · Computer Science 2018-06-12 Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken

In this article, we take one step toward understanding the learning behavior of deep residual networks, and supporting the observation that deep residual networks behave like ensembles. We propose a new convolutional neural network…

Computer Vision and Pattern Recognition · Computer Science 2024-10-30 Masoud Abdi, Saeid Nahavandi

Deep learning models trained on large data sets have been widely successful in both vision and language domains. As state-of-the-art deep learning architectures have continued to grow in parameter count so have the compute budgets and times…

Deep neural networks achieve high performance on image classification tasks but are more difficult to train. Due to the complexity and vanishing gradient problem, it normally takes a lot of time and more computational…

Computer Vision and Pattern Recognition · Computer Science 2018-05-02 Mohammad Sadegh Ebrahimi, Hossein Karkeh Abadi

Residual neural networks (ResNets) are a promising class of deep neural networks that have shown excellent performance for a number of learning tasks, e.g., image classification and recognition. Mathematically, ResNet architectures can be…

Optimization and Control · Mathematics 2019-07-26 S. Günther, L. Ruthotto, J. B. Schroder, E. C. Cyr, N. R. Gauger

Recurrent neural networks are a powerful tool for modeling sequential data, but the dependence of each timestep's computation on the previous timestep's output limits parallelism and makes RNNs unwieldy for very long sequences. We introduce…

Neural and Evolutionary Computing · Computer Science 2016-11-22 James Bradbury, Stephen Merity, Caiming Xiong, Richard Socher

Recurrent Neural Networks (RNNs) have been widely applied for sequence modeling. In an RNN, the hidden states at the current step are fully connected to those at the previous step, thus the influence from less related features at the previous step may…

Computation and Language · Computer Science 2017-05-04 Danhao Zhu, Si Shen, Xin-Yu Dai, Jiajun Chen

The increasing complexity of modern deep neural network models and the expanding sizes of datasets necessitate the development of optimized and scalable training methods. In this white paper, we addressed the challenge of efficiently…

Machine Learning · Computer Science 2024-04-29 Raphael Ruschel, A. S. M. Iftekhar, B. S. Manjunath, Suya You

Scaling CNN training is necessary to keep up with growing datasets and reduce training time. We also see an emerging need to handle datasets with very large samples, where memory requirements for training are large. Existing training…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-03-18 Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen

It is often the case that the performance of a neural network can be improved by adding layers. In real-world practice, we always train dozens of neural network architectures in parallel, which is a wasteful process. We explored $CompNet$,…

Neural and Evolutionary Computing · Computer Science 2018-04-30 Jun Lu, Wei Ma, Boi Faltings

Deep neural networks have a good success record and are thus viewed as the best architecture choice for complex applications. Their main shortcoming has been, for a long time, the vanishing gradient which prevented the numerical…

Machine Learning · Computer Science 2024-05-02 Bernhard Bermeitinger, Tomas Hrycej, Siegfried Handschuh

A longstanding challenge for the Machine Learning community is that of developing models that are capable of processing and learning from very long sequences of data. The outstanding results of Transformer-based networks (e.g., Large…

Machine Learning · Computer Science 2024-02-15 Matteo Tiezzi, Michele Casoni, Alessandro Betti, Tommaso Guidi, Marco Gori, Stefano Melacci

Recurrent Neural Networks (RNNs) have been proven to be effective in modeling sequential data and they have been applied to boost a variety of tasks such as document classification, speech recognition and machine translation. Most of…

Computation and Language · Computer Science 2018-08-21 Zhiwei Wang, Yao Ma, Dawei Yin, Jiliang Tang

Over the long history of machine learning, which dates back several decades, recurrent neural networks (RNNs) have been used mainly for sequential data and time series and generally with 1D information. Even in some rare studies on 2D…

Computer Vision and Pattern Recognition · Computer Science 2021-03-05 Nguyen Huu Phong, Bernardete Ribeiro

Deep neural networks are widely used prediction algorithms whose performance often improves as the number of weights increases, leading to over-parametrization. We consider a two-layered neural network whose first layer is frozen while the…

Machine Learning · Computer Science 2023-04-10 Roman Worschech, Bernd Rosenow

Neural network models and deep models are among the leading, state-of-the-art models in machine learning. The most successful deep neural models are those with many layers, which greatly increases their number of parameters. Training such…

Machine Learning · Computer Science 2018-07-17 Soufiane Belharbi

In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection…

Computation and Language · Computer Science 2023-08-10 Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei

In comparison to classical shallow representation learning techniques, deep neural networks have achieved superior performance in nearly every application benchmark. But despite their clear empirical advantages, it is still not well…

Machine Learning · Computer Science 2022-01-11 Calvin Murdock, George Cazenavette, Simon Lucey

Neural networks are known to give better performance with increased depth due to their ability to learn more abstract features. Although the deepening of networks has been well established, there is still room for efficient feature…

Computer Vision and Pattern Recognition · Computer Science 2023-03-01 Dumindu Tissera, Rukshan Wijessinghe, Kasun Vithanage, Alex Xavier, Subha Fernando, Ranga Rodrigo