Related papers: Revisiting "Qualitatively Characterizing Neural Ne…

200 papers

The optimization of deep neural networks can be more challenging than traditional convex optimization problems due to the highly non-convex nature of the loss function, e.g. it can involve pathological landscapes such as saddle-surfaces…

Machine Learning · Computer Science 2016-08-18 Caglar Gulcehre, Marcin Moczulski, Francesco Visin, Yoshua Bengio

Deep learning, in the form of artificial neural networks, has achieved remarkable practical success in recent years, for a variety of difficult machine learning applications. However, a theoretical explanation for this remains a major open…

Machine Learning · Computer Science 2016-06-15 Itay Safran, Ohad Shamir

Training neural networks involves finding minima of a high-dimensional non-convex loss function. Knowledge of the structure of this energy landscape is sparse. Relaxing from linear interpolations, we construct continuous paths between…

Machine Learning · Statistics 2019-02-25 Felix Draxler, Kambis Veschgini, Manfred Salmhofer, Fred A. Hamprecht

Learning to Optimize is a recently proposed framework for learning optimization algorithms using reinforcement learning. In this paper, we explore learning an optimization algorithm for training shallow neural nets. Such high-dimensional…

Machine Learning · Computer Science 2017-12-01 Ke Li, Jitendra Malik

Neural Networks are function approximators that have achieved state-of-the-art accuracy in numerous machine learning tasks. In spite of their great success in terms of accuracy, their large training time makes it difficult to use them for…

Machine Learning · Computer Science 2017-04-18 Abhishek Sinha, Mausoom Sarkar, Aahitagni Mukherjee, Balaji Krishnamurthy

Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the…

Machine Learning · Computer Science 2024-09-13 Eduardo Cueto-Mendoza, John D. Kelleher

In this paper we develop a new perspective on generalization of neural networks by proposing and investigating the concept of a neural network stiffness. We measure how stiff a network is by looking at how a small gradient step in the…

Machine Learning · Computer Science 2020-03-17 Stanislav Fort, Paweł Krzysztof Nowak, Stanislaw Jastrzebski, Srini Narayanan

Neural networks trained via gradient descent with random initialization and without any regularization enjoy good generalization performance in practice despite being highly overparametrized. A promising direction to explain this phenomenon…

Machine Learning · Computer Science 2022-05-17 Hancheng Min, Salma Tarmoun, Rene Vidal, Enrique Mallada

Training neural networks involves solving large-scale non-convex optimization problems. This task has long been believed to be extremely difficult, with fear of local minima and other obstacles motivating a variety of schemes to improve…

Neural and Evolutionary Computing · Computer Science 2015-05-25 Ian J. Goodfellow, Oriol Vinyals, Andrew M. Saxe

Convolutional neural network training can suffer from diverse issues like exploding or vanishing gradients, scaling-based weight space symmetry and covariate shift. In order to address these issues, researchers develop weight regularization…

Computer Vision and Pattern Recognition · Computer Science 2021-03-12 Theodoros Georgiou, Sebastian Schmitt, Thomas Bäck, Wei Chen, Michael Lew

Many meta-learning approaches for few-shot learning rely on simple base learners such as nearest-neighbor classifiers. However, even in the few-shot regime, discriminatively trained linear predictors can offer better generalization. We…

Computer Vision and Pattern Recognition · Computer Science 2019-04-24 Kwonjoon Lee, Subhransu Maji, Avinash Ravichandran, Stefano Soatto

One of the prevailing trends in the machine- and deep-learning community is to gravitate towards the use of increasingly larger models in order to keep pushing the state-of-the-art performance envelope. This tendency makes access to the…

Machine Learning · Computer Science 2023-05-29 Shadi Sartipi, Edgar A. Bernal

We study the effect of width on the dynamics of feature-learning neural networks across a variety of architectures and datasets. Early in training, wide neural networks trained on online data have not only identical loss curves but also…

Machine Learning · Computer Science 2023-12-07 Nikhil Vyas, Alexander Atanasov, Blake Bordelon, Depen Morwani, Sabarish Sainathan, Cengiz Pehlevan

Recent results show that deep neural networks achieve excellent performance even when, during training, weights are quantized and projected to a binary representation. Here, we show that this is just the tip of the iceberg: these same…

Neural and Evolutionary Computing · Computer Science 2016-06-08 Paul Merolla, Rathinakumar Appuswamy, John Arthur, Steve K. Esser, Dharmendra Modha

Recent numerical experiments have demonstrated that the choice of optimization geometry used during training can impact generalization performance when learning expressive nonlinear model classes such as deep neural networks. These…

Machine Learning · Computer Science 2022-04-25 Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

We present multi-point optimization: an optimization technique that allows several models to be trained simultaneously without the need to keep the parameters of each one individually. The proposed method is used for a thorough empirical…

Machine Learning · Computer Science 2025-11-18 Ivan Skorokhodov, Mikhail Burtsev

Convolutional neural networks (CNNs) have had great success in many real-world applications and have also been used to model visual processing in the brain. However, these networks are quite brittle - small changes in the input image can…

Neurons and Cognition · Quantitative Biology 2018-10-30 Brian Hu, Stefan Mihalas

State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions. For neural networks, even when centralized SGD easily finds a solution that is…

Machine Learning · Computer Science 2022-10-06 Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan

Two major uncertainties, dataset bias and adversarial examples, prevail in state-of-the-art AI algorithms with deep neural networks. In this paper, we present an intuitive explanation for these issues as well as an interpretation of the…

Computer Vision and Pattern Recognition · Computer Science 2019-02-12 Yifei Fan, Anthony Yezzi

Understanding the optimization dynamics of neural networks is necessary for closing the gap between theory and practice. Stochastic first-order optimization algorithms are known to efficiently locate favorable minima in deep neural…

Machine Learning · Computer Science 2023-06-22 Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas