Related papers: Learning Neural Network Subspaces

Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space

Hyperparameter optimization is both a practical issue and an interesting theoretical problem in training of deep architectures. Despite many recent advances, the most commonly used methods almost universally involve training multiple and…

Machine Learning · Computer Science 2019-09-10 Vlad Pushkarov, Jonathan Efroni, Mykola Maksymenko, Maciej Koch-Janusz

Convergent Learning: Do different neural networks learn the same representations?

Recent successes in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by…

Machine Learning · Computer Science 2016-03-01 Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, John Hopcroft

Loss Patterns of Neural Networks

We present multi-point optimization: an optimization technique that allows training several models simultaneously without the need to keep the parameters of each one individually. The proposed method is used for a thorough empirical…

Machine Learning · Computer Science 2025-11-18 Ivan Skorokhodov, Mikhail Burtsev

The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks

Neural networks have seen an explosion of usage and research in the past decade, particularly within the domains of computer vision and natural language processing. However, only recently have advancements in neural networks yielded…

Machine Learning · Computer Science 2022-07-20 Jacob Renn, Ian Sotnek, Benjamin Harvey, Brian Caffo

Precision Machine Learning

We explore unique considerations involved in fitting ML models to data with very high precision, as is often required for science applications. We empirically compare various function approximation methods and study how they scale with…

Machine Learning · Computer Science 2023-02-01 Eric J. Michaud, Ziming Liu, Max Tegmark

Qualitatively characterizing neural network optimization problems

Training neural networks involves solving large-scale non-convex optimization problems. This task has long been believed to be extremely difficult, with fear of local minima and other obstacles motivating a variety of schemes to improve…

Neural and Evolutionary Computing · Computer Science 2015-05-25 Ian J. Goodfellow, Oriol Vinyals, Andrew M. Saxe

Classifying the classifier: dissecting the weight space of neural networks

This paper presents an empirical study on the weights of neural networks, where we interpret each model as a point in a high-dimensional space -- the neural weight space. To explore the complex structure of this space, we sample from a…

Computer Vision and Pattern Recognition · Computer Science 2020-02-14 Gabriel Eilertsen, Daniel Jönsson, Timo Ropinski, Jonas Unger, Anders Ynnerman

Neural Networks Designing Neural Networks: Multi-Objective Hyper-Parameter Optimization

Artificial neural networks have gone through a recent rise in popularity, achieving state-of-the-art results in various fields, including image classification, speech recognition, and automated control. Both the performance and…

Neural and Evolutionary Computing · Computer Science 2016-11-08 Sean C. Smithson, Guang Yang, Warren J. Gross, Brett H. Meyer

Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps

We propose a new algorithm to learn a one-hidden-layer convolutional neural network where both the convolutional weights and the output weights are parameters to be learned. Our algorithm works for a general class of (potentially…

Machine Learning · Computer Science 2018-06-05 Simon S. Du, Surbhi Goel

Less is More: Selective Layer Finetuning with SubTuning

Finetuning a pretrained model has become a standard approach for training neural networks on novel tasks, resulting in fast convergence and improved performance. In this work, we study an alternative finetuning method, where instead of…

Machine Learning · Computer Science 2023-07-04 Gal Kaplun, Andrey Gurevich, Tal Swisa, Mazor David, Shai Shalev-Shwartz, Eran Malach

Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer

We introduce the class of SO-friendly neural networks, which include several models used in practice including networks with 2 layers of hidden weights where the number of inputs is larger than the number of outputs. SO-friendly networks…

Machine Learning · Computer Science 2024-06-27 Betty Shea, Mark Schmidt

Efficient Neural Network Training via Subset Pretraining

In training neural networks, it is common practice to use partial gradients computed over batches, mostly very small subsets of the training set. This approach is motivated by the argument that such a partial gradient is close to the true…

Machine Learning · Computer Science 2024-11-25 Jan Spörer, Bernhard Bermeitinger, Tomas Hrycej, Niklas Limacher, Siegfried Handschuh

Large Scale Structure of Neural Network Loss Landscapes

There are many surprising and perhaps counter-intuitive properties of optimization of deep neural networks. We propose and experimentally verify a unified phenomenological model of the loss landscape that incorporates many of them. High…

Machine Learning · Computer Science 2019-06-12 Stanislav Fort, Stanislaw Jastrzebski

Traversing Between Modes in Function Space for Fast Ensembling

Deep ensemble is a simple yet powerful way to improve the performance of deep neural networks. Under this motivation, recent works on mode connectivity have shown that parameters of ensembles are connected by low-loss subspaces, and one can…

Machine Learning · Computer Science 2023-06-21 EungGu Yun, Hyungi Lee, Giung Nam, Juho Lee

SWAT-NN: Simultaneous Weights and Architecture Training for Neural Networks in a Latent Space

Designing neural networks typically relies on manual trial and error or a neural architecture search (NAS) followed by weight training. The former is time-consuming and labor-intensive, while the latter often discretizes architecture search…

Machine Learning · Computer Science 2025-11-19 Zitong Huang, Mansooreh Montazerin, Ajitesh Srivastava

Neural Networks with Few Multiplications

For most deep learning algorithms training is notoriously time consuming. Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates…

Machine Learning · Computer Science 2016-02-29 Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio

Learning Invariant Weights in Neural Networks

Assumptions about invariances or symmetries in data can significantly increase the predictive power of statistical models. Many commonly used models in machine learning are constrained to respect certain symmetries in the data, such as…

Machine Learning · Statistics 2022-08-03 Tycho F. A. van der Ouderaa, Mark van der Wilk

The Difficulty of Training Sparse Neural Networks

We investigate the difficulties of training sparse neural networks and make new observations about optimization dynamics and the energy landscape within the sparse regime. Recent work of Gale et al. (2019) and Liu et al. (2018) has shown that sparse…

Machine Learning · Computer Science 2020-10-08 Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen

Efficient and Sparse Neural Networks by Pruning Weights in a Multiobjective Learning Approach

Overparameterization and overfitting are common concerns when designing and training deep neural networks and are often counteracted by pruning and regularization strategies. However, these strategies remain secondary to most learning…

Machine Learning · Computer Science 2020-09-01 Malena Reiners, Kathrin Klamroth, Michael Stiglmayr

No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths

Understanding the optimization dynamics of neural networks is necessary for closing the gap between theory and practice. Stochastic first-order optimization algorithms are known to efficiently locate favorable minima in deep neural…

Machine Learning · Computer Science 2023-06-22 Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas