Related papers: Backward Feature Correction: How Deep Learning Per…

Deep supervised learning using local errors

Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from…

Neural and Evolutionary Computing · Computer Science 2017-11-21 Hesham Mostafa, Vishwajith Ramesh, Gert Cauwenberghs

The Computational Advantage of Depth: Learning High-Dimensional Hierarchical Functions with Gradient Descent

Understanding the advantages of deep neural networks trained by gradient descent (GD) compared to shallow models remains an open theoretical challenge. In this paper, we introduce a class of target functions (single and multi-index Gaussian…

Machine Learning · Statistics 2025-11-17 Yatin Dandi, Luca Pesce, Lenka Zdeborová, Florent Krzakala

How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model

Deep learning algorithms demonstrate a surprising ability to learn high-dimensional tasks from limited examples. This is commonly attributed to the depth of neural networks, enabling them to build a hierarchy of abstract, low-dimensional…

Machine Learning · Computer Science 2024-07-04 Francesco Cagnetta, Leonardo Petrini, Umberto M. Tomasini, Alessandro Favero, Matthieu Wyart

Layer-wise training of deep networks using kernel similarity

Deep learning has shown promising results in many machine learning applications. The hierarchical feature representation built by deep networks enables compact and precise encoding of the data. A kernel analysis of the trained deep networks…

Machine Learning · Computer Science 2017-03-22 Mandar Kulkarni, Shirish Karande

Training Neural Networks Using Features Replay

Training a neural network using the backpropagation algorithm requires passing error gradients sequentially through the network. The backward locking prevents us from updating network layers in parallel and fully leveraging the computing…

Machine Learning · Computer Science 2019-05-30 Zhouyuan Huo, Bin Gu, Heng Huang

Convergent Learning: Do different neural networks learn the same representations?

Recent success in training deep neural networks has prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by…

Machine Learning · Computer Science 2016-03-01 Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, John Hopcroft

Deep Networks Learn Deep Hierarchical Models

We consider supervised learning with $n$ labels and show that layerwise SGD on residual networks can efficiently learn a class of hierarchical models. This model class assumes the existence of an (unknown) label hierarchy $L_1 \subseteq L_2…

Machine Learning · Computer Science 2026-01-05 Amit Daniely

Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning

Understanding how deep neural networks learn useful internal representations from data remains a central open problem in the theory of deep learning. We introduce Neural Low-Degree Filtering (Neural LoFi), a stylized limit of gradient-based…

Machine Learning · Computer Science 2026-05-14 Yatin Dandi, Matteo Vilucchio, Luca Arnaboldi, Hugo Tabanelli, Florent Krzakala

Provable Learning of Random Hierarchy Models and Hierarchical Shallow-to-Deep Chaining

The empirical success of deep learning is often attributed to deep networks' ability to exploit hierarchical structure in data, constructing increasingly complex features across layers. Yet despite substantial progress in deep learning…

Machine Learning · Computer Science 2026-01-28 Yunwei Ren, Yatin Dandi, Florent Krzakala, Jason D. Lee

Why does Deep Learning work? - A perspective from Group Theory

Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning.…

Machine Learning · Computer Science 2015-03-03 Arnab Paul, Suresh Venkatasubramanian

Training Large Neural Networks With Low-Dimensional Error Feedback

Training deep neural networks typically relies on backpropagating high-dimensional error signals, a computationally intensive process with little evidence supporting its implementation in the brain. However, since most tasks involve…

Machine Learning · Computer Science 2026-01-15 Maher Hanut, Jonathan Kadmon

Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an open question how deep networks perform hierarchical feature learning across layers. In this…

Machine Learning · Computer Science 2025-11-17 Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

Towards Understanding Hierarchical Learning: Benefits of Neural Representations

Deep neural networks can empirically perform efficient hierarchical learning, in which the layers learn useful representations of the data. However, how they make use of the intermediate representations is not explained by recent theories…

Machine Learning · Computer Science 2021-03-08 Minshuo Chen, Yu Bai, Jason D. Lee, Tuo Zhao, Huan Wang, Caiming Xiong, Richard Socher

Deep Decomposition Learning for Inverse Imaging Problems

Deep learning is emerging as a new paradigm for solving inverse imaging problems. However, the deep learning methods often lack the assurance of traditional physics-based methods due to the lack of physical information considerations in…

Image and Video Processing · Electrical Eng. & Systems 2020-07-20 Dongdong Chen, Mike E. Davies

Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers

In the context of classification problems, Deep Learning (DL) approaches represent the state of the art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks.…

Machine Learning · Computer Science 2023-11-21 Andrea Apicella, Francesco Isgrò, Roberto Prevete

Learning Hierarchical Polynomials of Multiple Nonlinear Features with Three-Layer Networks

In deep learning theory, a critical question is to understand how neural networks learn hierarchical features. In this work, we study the learning of hierarchical polynomials of \textit{multiple nonlinear features} using three-layer neural…

Machine Learning · Computer Science 2024-11-27 Hengyu Fu, Zihao Wang, Eshaan Nichani, Jason D. Lee

Face Recognition System

Deep learning is one of the new and important branches of machine learning. Deep learning refers to a set of algorithms that solve various problems involving data such as images and text by using various machine learning algorithms in multi-layer neural…

Computer Vision and Pattern Recognition · Computer Science 2019-01-10 Yang Li, Sangwhan Cha

Training Neural Networks with Local Error Signals

Supervised training of neural networks for classification is typically performed with a global loss function. The loss function provides a gradient for the output layer, and this gradient is back-propagated to hidden layers to dictate an…

Machine Learning · Statistics 2019-05-09 Arild Nøkland, Lars Hiller Eidnes

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks

One of the central questions in the theory of deep learning is to understand how neural networks learn hierarchical features. The ability of deep networks to extract salient features is crucial to both their outstanding generalization…

Machine Learning · Computer Science 2025-04-03 Eshaan Nichani, Alex Damian, Jason D. Lee

Deep Dictionary Learning: A PARametric NETwork Approach

Deep dictionary learning seeks multiple dictionaries at different image scales to capture complementary coherent characteristics. We propose a method for learning a hierarchy of synthesis dictionaries with an image classification goal. The…

Computer Vision and Pattern Recognition · Computer Science 2019-09-04 Shahin Mahdizadehaghdam, Ashkan Panahi, Hamid Krim, Liyi Dai