Related papers: Multi-layer Perceptron Trainability Explained via …

Neural Tangent Kernel Analysis of Deep Narrow Neural Networks

The tremendous recent progress in analyzing the training dynamics of overparameterized neural networks has primarily focused on wide networks and therefore does not sufficiently address the role of depth in deep learning. In this work, we…

Machine Learning · Computer Science 2022-06-29 Jongmin Lee, Joo Young Choi, Ernest K. Ryu, Albert No

Exploring layerwise decision making in DNNs

While deep neural networks (DNNs) have become a standard architecture for many machine learning tasks, their internal decision-making process and general interpretability are still poorly understood. Conversely, common decision trees are…

Machine Learning · Computer Science 2022-02-02 Coenraad Mouton, Marelie H. Davel

Variational Tensor Neural Networks for Deep Learning

Deep neural networks (NNs) encounter scalability limitations when confronted with a vast array of neurons, thereby constraining their achievable network depth. To address this challenge, we propose an integration of tensor networks (TN)…

Disordered Systems and Neural Networks · Physics 2024-08-20 Saeed S. Jahromi, Roman Orus

Analysis of Invariance and Robustness via Invertibility of ReLU-Networks

Studying the invertibility of deep neural networks (DNNs) provides a principled approach to better understand the behavior of these powerful models. Despite being a promising diagnostic tool, a consistent theory on their invertibility is…

Machine Learning · Computer Science 2018-06-28 Jens Behrmann, Sören Dittmer, Pascal Fernsel, Peter Maaß

Pruned Neural Networks are Surprisingly Modular

The learned weights of a neural network are often considered devoid of scrutable internal structure. To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate…

Neural and Evolutionary Computing · Computer Science 2022-02-09 Daniel Filan, Shlomi Hod, Cody Wild, Andrew Critch, Stuart Russell

Emergent Low-Rank Training Dynamics in MLPs with Smooth Activations

Recent empirical evidence has demonstrated that the training dynamics of large-scale deep neural networks occur within low-dimensional subspaces. While this has inspired new research into low-rank training, compression, and adaptation,…

Machine Learning · Computer Science 2026-02-09 Alec S. Xu, Can Yaras, Matthew Asato, Qing Qu, Laura Balzano

On Learnable Parameters of Optimal and Suboptimal Deep Learning Models

We scrutinize the structural and operational aspects of deep learning models, particularly focusing on the nuances of learnable parameters (weight) statistics, distribution, node interaction, and visualization. By establishing correlations…

Machine Learning · Computer Science 2024-08-22 Ziwei Zheng, Huizhi Liang, Vaclav Snasel, Vito Latora, Panos Pardalos, Giuseppe Nicosia, Varun Ojha

Plastic Learning with Deep Fourier Features

Deep neural networks can struggle to learn continually in the face of non-stationarity. This phenomenon is known as loss of plasticity. In this paper, we identify underlying principles that lead to plastic algorithms. In particular, we…

Machine Learning · Computer Science 2024-10-29 Alex Lewandowski, Dale Schuurmans, Marlos C. Machado

The Mechanical Neural Network(MNN) -- A physical implementation of a multilayer perceptron for education and hands-on experimentation

In this paper the Mechanical Neural Network (MNN) is introduced, a physical implementation of a multilayer perceptron (MLP) with ReLU activation functions, two input neurons, four hidden neurons and two output neurons. This physical model of…

Machine Learning · Computer Science 2023-11-09 Axel Schaffland

Rethinking the Relationship between Recurrent and Non-Recurrent Neural Networks: A Study in Sparsity

Neural networks (NN) can be divided into two broad categories, recurrent and non-recurrent. Both types of neural networks are popular and extensively studied, but they are often treated as distinct families of machine learning algorithms.…

Machine Learning · Computer Science 2024-04-02 Quincy Hershey, Randy Paffenroth, Harsh Pathak, Simon Tavener

Rethinking Two Consensuses of the Transferability in Deep Learning

Deep transfer learning (DTL) has formed a long-term quest toward enabling deep neural networks (DNNs) to reuse historical experiences as efficiently as humans. This ability is named knowledge transferability. A commonly used paradigm for…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Yixiong Chen, Jingxian Li, Chris Ding, Li Liu

Understanding plasticity in neural networks

Plasticity, the ability of a neural network to quickly change its predictions in response to new information, is essential for the adaptability and robustness of deep reinforcement learning systems. Deep neural networks are known to lose…

Machine Learning · Computer Science 2023-11-28 Clare Lyle, Zeyu Zheng, Evgenii Nikishin, Bernardo Avila Pires, Razvan Pascanu, Will Dabney

Robust Nonparametric Hypothesis Testing to Understand Variability in Training Neural Networks

Training a deep neural network (DNN) often involves stochastic optimization, which means each run will produce a different model. Several works suggest this variability is negligible when models have the same performance, which in the case…

Machine Learning · Statistics 2023-10-03 Sinjini Banerjee, Reilly Cannon, Tim Marrinan, Tony Chiang, Anand D. Sarwate

A simple theory for training response of deep neural networks

Deep neural networks give us a powerful method to model the training dataset's relationship between input and output. We can regard that as a complex adaptive system consisting of many artificial neurons that work as an adaptive memory as a…

Disordered Systems and Neural Networks · Physics 2024-05-08 Kenichi Nakazato

Predicting Plasticity in Deep Continual Learning: A Theoretical Perspective

Deep continual learning requires models to adapt to new tasks without retraining from scratch. However, neural networks can lose their ability to adapt to new tasks after training on previous ones, a phenomenon known as loss of plasticity.…

Machine Learning · Computer Science 2026-05-12 Jiuqi Wang, Jayanth Srinivasa, Claire Chen, Shuze Daniel Liu, Ali Payani, Shangtong Zhang

Scaling MLPs: A Tale of Inductive Bias

In this work we revisit the most fundamental building block in deep learning, the multi-layer perceptron (MLP), and study the limits of its performance on vision tasks. Empirical insights into MLPs are important for multiple reasons. (1)…

Machine Learning · Computer Science 2023-10-04 Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann

Doing the impossible: Why neural networks can be trained at all

As deep neural networks grow in size, from thousands to millions to billions of weights, the performance of those networks becomes limited by our ability to accurately train them. A common naive question arises: if we have a system with…

Machine Learning · Computer Science 2018-05-29 Nathan O. Hodas, Panos Stinis

Neuronal Fluctuations: Learning Rates vs Participating Neurons

Deep Neural Networks (DNNs) rely on inherent fluctuations in their internal parameters (weights and biases) to effectively navigate the complex optimization landscape and achieve robust performance. While these fluctuations are recognized…

Machine Learning · Computer Science 2025-11-14 Darsh Pareek, Umesh Kumar, Ruthu Rao, Ravi Janjam

Learning Neural Network Classifiers with Low Model Complexity

Modern neural network architectures for large-scale learning tasks have substantially higher model complexities, which makes understanding, visualizing and training these architectures difficult. Recent contributions to deep learning…

Machine Learning · Computer Science 2024-10-30 Jayadeva, Himanshu Pant, Mayank Sharma, Abhimanyu Dubey, Sumit Soman, Suraj Tripathi, Sai Guruju, Nihal Goalla

Gradient-Free Neural Network Training via Synaptic-Level Reinforcement Learning

An ongoing challenge in neural information processing is: how do neurons adjust their connectivity to improve task performance over time (i.e., actualize learning)? It is widely believed that there is a consistent, synaptic-level learning…

Neural and Evolutionary Computing · Computer Science 2021-06-01 Aman Bhargava, Mohammad R. Rezaei, Milad Lankarany