Related papers: Multi-layer Perceptron Trainability Explained via …
The tremendous recent progress in analyzing the training dynamics of overparameterized neural networks has primarily focused on wide networks and therefore does not sufficiently address the role of depth in deep learning. In this work, we…
While deep neural networks (DNNs) have become a standard architecture for many machine learning tasks, their internal decision-making process and general interpretability is still poorly understood. Conversely, common decision trees are…
Deep neural networks (NNs) encounter scalability limitations when confronted with a vast array of neurons, thereby constraining their achievable network depth. To address this challenge, we propose an integration of tensor networks (TN)…
Studying the invertibility of deep neural networks (DNNs) provides a principled approach to better understand the behavior of these powerful models. Despite being a promising diagnostic tool, a consistent theory on their invertibility is…
The learned weights of a neural network are often considered devoid of scrutable internal structure. To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate…
Recent empirical evidence has demonstrated that the training dynamics of large-scale deep neural networks occur within low-dimensional subspaces. While this has inspired new research into low-rank training, compression, and adaptation,…
We scrutinize the structural and operational aspects of deep learning models, particularly focusing on the nuances of learnable parameters (weight) statistics, distribution, node interaction, and visualization. By establishing correlations…
Deep neural networks can struggle to learn continually in the face of non-stationarity. This phenomenon is known as loss of plasticity. In this paper, we identify underlying principles that lead to plastic algorithms. In particular, we…
In this paper the Mechanical Neural Network(MNN) is introduced, a physical implementation of a multilayer perceptron(MLP) with ReLU activation functions, two input neurons, four hidden neurons and two output neurons. This physical model of…
Neural networks (NN) can be divided into two broad categories, recurrent and non-recurrent. Both types of neural networks are popular and extensively studied, but they are often treated as distinct families of machine learning algorithms.…
Deep transfer learning (DTL) has formed a long-term quest toward enabling deep neural networks (DNNs) to reuse historical experiences as efficiently as humans. This ability is named knowledge transferability. A commonly used paradigm for…
Plasticity, the ability of a neural network to quickly change its predictions in response to new information, is essential for the adaptability and robustness of deep reinforcement learning systems. Deep neural networks are known to lose…
Training a deep neural network (DNN) often involves stochastic optimization, which means each run will produce a different model. Several works suggest this variability is negligible when models have the same performance, which in the case…
Deep neural networks give us a powerful method to model the training dataset's relationship between input and output. We can regard that as a complex adaptive system consisting of many artificial neurons that work as an adaptive memory as a…
Deep continual learning requires models to adapt to new tasks without retraining from scratch. However, neural networks can lose their ability to adapt to new tasks after training on previous ones, a phenomenon known as loss of plasticity.…
In this work we revisit the most fundamental building block in deep learning, the multi-layer perceptron (MLP), and study the limits of its performance on vision tasks. Empirical insights into MLPs are important for multiple reasons. (1)…
As deep neural networks grow in size, from thousands to millions to billions of weights, the performance of those networks becomes limited by our ability to accurately train them. A common naive question arises: if we have a system with…
Deep Neural Networks (DNNs) rely on inherent fluctuations in their internal parameters (weights and biases) to effectively navigate the complex optimization landscape and achieve robust performance. While these fluctuations are recognized…
Modern neural network architectures for large-scale learning tasks have substantially higher model complexities, which makes understanding, visualizing and training these architectures difficult. Recent contributions to deep learning…
An ongoing challenge in neural information processing is: how do neurons adjust their connectivity to improve task performance over time (i.e., actualize learning)? It is widely believed that there is a consistent, synaptic-level learning…