Related papers: Simple2Complex: Global Optimization by Gradient De…

200 papers

Leaping into the rapidly developing world of deep learning is an exciting and sometimes confusing adventure. All of the advice and tutorials available can be hard to organize and work through, especially when training specific models on…

Computer Vision and Pattern Recognition · Computer Science 2018-09-05 Ishtar Nyawira, Kristi Bushman

The optimization problem behind neural networks is highly non-convex. Training with stochastic gradient descent and variants requires careful parameter tuning and provides no guarantee to achieve the global optimum. In contrast we show…

Machine Learning · Computer Science 2016-10-31 Antoine Gautier, Quynh Nguyen, Matthias Hein

Artificial Intelligence algorithms have been steadily increasing in popularity and usage. Deep Learning allows neural networks to be trained using huge datasets and also removes the need for human-extracted features, as it automates the…

Neural and Evolutionary Computing · Computer Science 2020-05-11 Vasco Lopes, Paulo Fazendeiro

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Deep learning has excelled in image recognition tasks through neural networks inspired by the human brain. However, the necessity for large models to improve prediction accuracy introduces significant computational demands and extended…

Computer Vision and Pattern Recognition · Computer Science 2024-08-27 Taigo Sakai, Kazuhiro Hotta

We develop a new method for regularising neural networks. We learn a probability distribution over the activations of all layers of the model and then insert imputed values into the network during training. We obtain a posterior for an…

Machine Learning · Computer Science 2019-10-14 Matthew Willetts, Alexander Camuto, Stephen Roberts, Chris Holmes

Supervised training of neural networks for classification is typically performed with a global loss function. The loss function provides a gradient for the output layer, and this gradient is back-propagated to hidden layers to dictate an…

Machine Learning · Statistics 2019-05-09 Arild Nøkland, Lars Hiller Eidnes

A sequential training method for large-scale feedforward neural networks is presented. Each layer of the neural network is decoupled and trained separately. After the training is completed for each layer, they are combined together. The…

Machine Learning · Computer Science 2019-05-21 Jongrae Kim

Deep network pruning is an effective method to reduce the storage and computation cost of deep neural networks when applying them to resource-limited devices. Among many pruning granularities, neuron level pruning will remove redundant…

Computer Vision and Pattern Recognition · Computer Science 2017-03-30 Zhengtao Wang, Ce Zhu, Zhiqiang Xia, Qi Guo, Yipeng Liu

The inference structures and computational complexity of existing deep neural networks, once trained, are fixed and remain the same for all test images. However, in practice, it is highly desirable to establish a progressive structure for…

Computer Vision and Pattern Recognition · Computer Science 2018-04-27 Zhi Zhang, Guanghan Ning, Yigang Cen, Yang Li, Zhiqun Zhao, Hao Sun, Zhihai He

Deep learning has achieved state-of-the-art accuracies on several computer vision tasks. However, the computational and energy requirements associated with training such deep neural networks can be quite high. In this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2020-03-04 Aosong Feng, Priyadarshini Panda

For most deep learning algorithms training is notoriously time consuming. Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates…

Machine Learning · Computer Science 2016-02-29 Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio

We present a novel training method for deep operator networks (DeepONets), one of the most popular neural network models for operators. DeepONets are constructed by two sub-networks, namely the branch and trunk networks. Typically, the two…

Numerical Analysis · Mathematics 2023-09-06 Sanghyun Lee, Yeonjong Shin

We propose a new technique that boosts the convergence of training generative adversarial networks. Generally, the rate of training deep models reduces severely after multiple iterations. A key reason for this phenomenon is that a deep…

Machine Learning · Statistics 2018-06-15 Atsushi Nitanda, Taiji Suzuki

Scaling model capacity has been vital in the success of deep learning. For a typical network, necessary compute resources and training time grow dramatically with model size. Conditional computation is a promising way to increase the number…

Machine Learning · Computer Science 2018-11-14 Louis Kirsch, Julius Kunze, David Barber

Deep learning models offer superior performance compared to other machine learning techniques for a variety of tasks and domains, but pose their own challenges. In particular, deep learning models require larger training times as the depth…

Machine Learning · Computer Science 2023-05-31 Sunitha Basodi, Krishna Pusuluri, Xueli Xiao, Yi Pan

Recently, neural network architectures have been developed to accommodate when the data has the structure of a graph or, more generally, a hypergraph. While useful, graph structures can be potentially limiting. Hypergraph structures in…

Algebraic Topology · Mathematics 2020-12-14 Eric Bunch, Qian You, Glenn Fung, Vikas Singh

We propose a novel explanation method that explains the decisions of a deep neural network by investigating how the intermediate representations at each layer of the deep network were refined during the training process. This way we can a)…

Machine Learning · Computer Science 2021-09-14 Lukas Pfahler, Katharina Morik

It is often the case that the performance of a neural network can be improved by adding layers. In real-world practices, we always train dozens of neural network architectures in parallel which is a wasteful process. We explored CompNet,…

Neural and Evolutionary Computing · Computer Science 2018-04-30 Jun Lu, Wei Ma, Boi Faltings

Traditional deep network training methods optimize a monolithic objective function jointly for all the components. This can lead to various inefficiencies in terms of potential parallelization. Local learning is an approach to…

Machine Learning · Computer Science 2023-01-19 Adeetya Patel, Michael Eickenberg, Eugene Belilovsky